long I/O delays when strace is running

Mark Geisert mark@maxrnd.com
Fri Apr 21 12:08:00 GMT 2017

Daniel Santos wrote:
> I've tracked it down to this little Sleep() loop in pinfo::init.
>   bool created = shloc != SH_JUSTOPEN;
>   /* Detect situation where a transitional memory block is being retrieved.
>      If the block has been allocated with PINFO_REDIR_SIZE but not yet
>      updated with a PID_EXECED state then we'll retry. */
>   if (!created && !(flag & PID_NEW))
>     /* If not populated, wait 2 seconds for procinfo to become populated.
>        Would like to wait with finer granularity but that is not easily
>        doable.  */
>     for (int i = 0; i < 200 && !procinfo->ppid; i++)
>       Sleep (10);
> I tried putting a stupid memory barrier in the loop and a volatile read just for
> kicks, but that doesn't seem to be the problem.  I'm headed off to bed.  This
> only happens when using strace, so if anybody has ideas please post.
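
For readers without the Cygwin source at hand, the quoted loop is a bounded
poll: up to 200 checks of procinfo->ppid, with a nominal 10 ms sleep between
checks.  Below is a minimal portable sketch of that pattern, not the Cygwin
code itself.  The function name wait_until_populated and the std::atomic<int>
stand-in for procinfo->ppid are hypothetical, and std::this_thread::sleep_for
takes the place of the Win32 Sleep() call.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the quoted loop: `ppid` plays the role of
// procinfo->ppid, which some other party is expected to fill in.
// Polls until `ppid` becomes nonzero or `max_tries` sleeps have been done.
// Returns the number of sleeps actually performed.
int wait_until_populated(const std::atomic<int>& ppid,
                         int max_tries,
                         std::chrono::milliseconds interval)
{
    assert(max_tries >= 0);
    int sleeps = 0;
    for (int i = 0; i < max_tries && ppid.load() == 0; i++) {
        std::this_thread::sleep_for(interval);  // stands in for Sleep (10)
        sleeps++;
    }
    return sleeps;
}
```

With the quoted parameters (200 tries, 10 ms) the call returns 0 when the field
is already populated and 200 when it never becomes populated, i.e. only after
the full timeout has been spent.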

I can reproduce your issue on a real 64-bit Windows 7 machine, which rules out
a virtual machine as the root cause.  I was running 'top -s1' in one window
while running your testcase in another.  Yes, top froze for many seconds at a
time, then caught its display up, only to freeze again, repeatedly.  It stayed
frozen for a while even after your testcase had ended (!), then caught up.
Your mention of pinfo::init and 'ps', along with my usage of 'top', leads me to
think this may be somehow related to the /proc filesystem.

Here's my humble contribution to the discussion:

~ time w
  02:15:52 up 3 days, 20:34,  0 users,  load average: 0.99, 0.62, 0.31

real    0m0.203s    <-- OK, nice and fast
user    0m0.077s
sys     0m0.139s

~ time strace -o w.out w
  02:16:23 up 3 days, 20:34,  0 users,  load average: 0.54, 0.55, 0.29

real    0m28.487s   <-- but stracing it is much, much slower
user    0m0.015s
sys     0m0.000s

The 'w' command is normally pretty fast: about 0.2 seconds above.  Running it
under strace took over 28 seconds, roughly 140 times longer.  Something seems
busted somewhere.  The strace output for this example has many occurrences of
~3.1-second delays that seem to occur as w is accumulating process time
information for all processes.
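
One possible reading of the ~3.1-second figure, offered as an assumption and
not as something established in this thread: the loop in pinfo::init asks for
200 sleeps of 10 ms, which is nominally 2 seconds, but if each Sleep (10) ends
up lasting one full tick of the commonly cited default Windows timer period of
15.625 ms (64 Hz), the same 200 iterations take about 3.1 seconds.  The sketch
below only does that arithmetic.  The rounding model and the function names
effective_sleep_ms and worst_case_wait_ms are hypothetical, and the real tick
length on any given machine may differ.

```cpp
#include <cassert>
#include <cmath>

// Assumed model: a sleep request is rounded up to a whole number of
// timer ticks.  This is a simplification, not documented Sleep() behaviour.
double effective_sleep_ms(double requested_ms, double tick_ms)
{
    assert(requested_ms > 0 && tick_ms > 0);
    return std::ceil(requested_ms / tick_ms) * tick_ms;
}

// Total time spent if the polling loop runs all of its iterations,
// i.e. the field being polled never becomes populated.
double worst_case_wait_ms(int iterations, double requested_ms, double tick_ms)
{
    return iterations * effective_sleep_ms(requested_ms, tick_ms);
}
```

Under this model worst_case_wait_ms (200, 10.0, 15.625) is 3125 ms, while a
1 ms tick would give the nominal 2000 ms.  About nine such full timeouts would
add up to the 28 seconds measured above, but that is a consistency check, not
proof that this loop is the cause.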

