This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Cygserver 100% CPU (was: References to both cygwin1.dll and msvcrt.dl


--- Corinna Vinschen wrote:
> On Oct  1 07:24, Patrick Samson wrote:
> > 
> > --- Corinna Vinschen wrote:
> > 
> > > On Sep 30 23:41, Patrick Samson wrote:
> > > > Now, when it's wrong, I can see:
> > > >   good morning (error=4)!
> > > > Error 4 is EINTR on the return of msleep().
> > > > Subsequently semop() returns with this EINTR.
> > > 
> > > Are you set up to build cygwin?  If so, could you
> > > please test the following patch to cygserver and
> > > if it changes anything for you?
> > 
> > Same behaviour.
> > As soon as there are some error=4, it will hang.
> > On service stop, postgres may stop some of its
> > backends, but not all of them, and stay in
> > 'Stopping' state.
> 
> I'm still hoping for a simple testcase...
> 

I'm still working on it (the problem, not the
testcase, as it is probably a race condition).

I'm looking at traces from cygserver.log.
I found something strange.
We should probably not focus too much on error=4.
For each EINTR, semop() is called again and
everything goes back to sleep.
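
To make the retry behaviour concrete, here is a minimal
sketch of the usual "call semop() again on EINTR" loop,
assuming a plain System V semaphore API. The helper name
semop_retry() is hypothetical; it is not taken from
postgres or from cygserver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: repeat semop() for as long as it is
 * interrupted by a signal (errno == EINTR).  Any other failure
 * is returned to the caller unchanged. */
int semop_retry(int semid, struct sembuf *ops, size_t nops)
{
    int rc;

    assert(ops != NULL && nops > 0);
    do {
        rc = semop(semid, ops, nops);
    } while (rc == -1 && errno == EINTR);
    return rc;
}
```

With a loop like this, an EINTR by itself only sends the
caller back to sleep on the same semaphore, which matches
what the trace shows.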
Even before the first error=4, the last op=1 is
supposed to wake up all sem[] of the semid. But
here is what I could see:
sem 1 is never mentioned anywhere (odd, but why not)
sem 0, 10 are not in sleeping state, OK
sem 2,3,4,5,6,7,8,9,10,12,14,15 are woken up, and
    set back to sleep, OK
sem 11 and 13 ARE MISSING!

Why aren't 11 and 13 woken up? Because they were
not properly put to sleep? Hmm...

Look at the attached extract.
You will see that inside the semop() for sem[11],
the trace output is suspended, and during that time
3 other semop() calls arrive, and even 1 more just
after. So 4 calls are waiting for the semid mutex at
the same time.
Operations for sem 9, 14, 2, 15 seem to be OK, as
they are properly bracketed by Locked/Unlocked.
But NOT for 13, for which the Unlocked seems to
appear after the end of sem 15.
The first Unlocked should be the one for putting
sem 11 to sleep, but many things may have happened
before it reaches WaitForMultipleObjects().
So I suspect that something is corrupted with regard
to the Event to wait for.
Note: when everything was stopped, sem 11 was able
to return from sleeping with EINTR. There is no trace
for sem 13. But remember that some processes, but not
all, were able to stop correctly.
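
To illustrate the kind of race I have in mind: between
releasing the mutex and actually starting to wait there is
a window in which a wakeup can be missed if it is not
recorded anywhere. The sketch below is only an analogy using
pthreads, not cygserver's code (which waits on Windows
events through WaitForMultipleObjects()). The names
struct waiter, waiter_init(), waiter_wake() and
waiter_sleep() are all hypothetical. It shows the usual
remedy: record the wakeup in a flag under the same lock, so
that a wakeup arriving before the sleeper waits is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-sleeper state: the wakeup is recorded in
 * 'pending' under 'lock', so it survives even if it is posted
 * before the sleeper reaches its wait call. */
struct waiter {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;   /* 1 = a wakeup has been posted */
};

void waiter_init(struct waiter *w)
{
    assert(w != NULL);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->pending = 0;
}

/* Post a wakeup: set the flag first, then signal. */
void waiter_wake(struct waiter *w)
{
    pthread_mutex_lock(&w->lock);
    w->pending = 1;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Sleep until a wakeup has been posted.  The flag is checked
 * before waiting, so an early wakeup returns immediately, and
 * the loop also absorbs spurious returns from the wait. */
void waiter_sleep(struct waiter *w)
{
    pthread_mutex_lock(&w->lock);
    while (!w->pending)
        pthread_cond_wait(&w->cond, &w->lock);
    w->pending = 0;            /* consume the wakeup */
    pthread_mutex_unlock(&w->lock);
}
```

If the state that says "this sleeper must be woken" were
not tied to the sleeper in this way, a wake-all arriving in
that window could skip it, which would look like what
happens to sem 11 and 13. This is only a hypothesis.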




Attachment: cygserver9.log
Description: cygserver9.log
