More intelligent std::async decision to make new thread

Tue Mar 14 13:36:00 GMT 2017

On 14/03/17 10:47 +0100, Ko Stoffelen wrote:
>Hi,
>
>I have a bunch of tasks that I'd like to execute in parallel as fast as
>possible. I'm using std::async for this purpose, without a launch
>policy, which is then std::launch::async | std::launch::deferred by
>default. Now if I understand correctly, the C++ standard leaves it to
>the implementation how it should handle this launch policy. libstdc++
>always first tries to create a new thread and only if that raises an
>exception, the task is deferred.
>
>https://gcc.gnu.org/onlinedocs/gcc-6.1.0/libstdc++/api/a01298_source.html#l01716

That's been the case since GCC 6; for earlier releases the default was
to defer. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51617

But people who should have known better kept complaining to me that
it was non-conforming.
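
Code that cares which way the decision went can pass the policy
explicitly, or ask the future afterwards. A minimal sketch using only
standard calls (was_deferred and answer are names made up for this
example, not part of libstdc++):

```cpp
#include <chrono>
#include <future>

// Returns true if the task behind this future was deferred, i.e. it
// will only run when someone calls get() or wait() on the future.
template <typename T>
bool was_deferred(const std::future<T>& f) {
    // A zero-length wait never blocks; for a deferred task it reports
    // future_status::deferred without running the task.
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
}

// Placeholder task for illustration.
int answer() { return 42; }
```

With std::async(std::launch::deferred, answer) the check returns true,
and with std::launch::async it returns false. With no policy argument
the result depends on the implementation and release, as described
above.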

>In practice, this could imply that I'm running >100 threads on 4 cores
>and the OS is only switching tasks all the time, actually delaying the
>computation. Is it possible to make a more intelligent decision at
>runtime, based on, e.g., the number of cores or the current CPU load?

I tried something smarter, using the getloadavg(3) function and
comparing the load average to the std::thread::hardware_concurrency()
value, borrowing the code from GNU Make that decides whether to spawn
a new job or not. But I think I hit some bug in glibc or the kernel
when calling hardware_concurrency() very rapidly and didn't have time
to analyse that or work around it.  I got sick of being told our
implementation was non-conforming and just flipped the default to
launch::async, and people stopped complaining.
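
The load-average idea can be approximated in user code. The following
is only a rough sketch under stated assumptions, not the code I tried
in libstdc++ and much simpler than what GNU Make does: policy_for and
choose_policy are hypothetical names, the "load below core count"
threshold is an arbitrary choice, and getloadavg is a glibc/BSD
extension rather than standard C++.

```cpp
#include <cstdlib>   // getloadavg (glibc/BSD extension, not ISO C++)
#include <future>
#include <thread>

// Hypothetical helper: the pure decision, kept separate so it can be
// tested without depending on the machine's actual load.
inline std::launch policy_for(double load, unsigned cores) {
    if (cores == 0)                     // core count unknown: be conservative
        return std::launch::deferred;
    return load < cores ? std::launch::async : std::launch::deferred;
}

// Hypothetical helper: pick a launch policy from the 1-minute load average.
inline std::launch choose_policy() {
    // Query the core count once and cache it, rather than calling
    // hardware_concurrency() for every task.
    static const unsigned cores = std::thread::hardware_concurrency();
    double load[1];
    if (getloadavg(load, 1) != 1)       // load average unavailable
        return std::launch::async | std::launch::deferred;  // library decides
    return policy_for(load[0], cores);
}
```

A caller would then write std::async(choose_policy(), task). Note that
the 1-minute load average reacts slowly, so a burst of std::async calls
could still start many threads before the figure catches up.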

Ideally we'd do something smarter, but we have limited resources and
it's not been a high priority issue to work on.

I'd like the implementation to actually use three states: launched,
deferred, and undecided. Calling std::async would hand the task to an
executor, which would set the state to undecided and add it to a queue.
The executor would have a background thread (or equivalent) that
periodically checks if the system load permits launching a new thread,
and if so would set the task's state to launched and remove it from the
queue. If the user calls a waiting function on a queued task, that call
would set its state to deferred and remove it from the queue.
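
The core of the three-state idea is that exactly one party gets to
move a task out of the undecided state. A minimal sketch of just that
part, assuming a hypothetical QueuedTask type fixed to int results; the
queue, the load check and the creation of the new thread are left out,
and none of this is libstdc++ code:

```cpp
#include <atomic>
#include <functional>
#include <future>
#include <utility>

enum class State { undecided, launched, deferred };

// Hypothetical task wrapper for illustration only.
class QueuedTask {
public:
    explicit QueuedTask(std::function<int()> fn)
        : work_(std::move(fn)), result_(work_.get_future()) {}

    // Called by the executor when the load permits. Returns false if a
    // waiter already claimed the task. For brevity the task runs in the
    // calling (executor) thread; a real executor would start a thread.
    bool try_launch() {
        State expected = State::undecided;
        if (!state_.compare_exchange_strong(expected, State::launched))
            return false;
        work_();
        return true;
    }

    // Called by the user, at most once. If the task is still undecided,
    // it becomes deferred and runs in the caller's thread; otherwise
    // this blocks until the launched task has produced its result.
    int get() {
        State expected = State::undecided;
        if (state_.compare_exchange_strong(expected, State::deferred))
            work_();
        return result_.get();
    }

    State state() const { return state_.load(); }

private:
    std::packaged_task<int()> work_;   // declared before result_ on purpose
    std::future<int> result_;
    std::atomic<State> state_{State::undecided};
};
```

The compare-and-exchange ensures the task runs exactly once, whichever
of the executor and the waiter gets there first.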

(This approach is inspired by discussions with Anthony Williams, who
does something like this for his http://www.stdthread.co.uk/ library.)

There are currently proposals to standardize executors for C++, but
they're still in flux, so again, designing our own just for std::async
hasn't been a priority.