fork errors - search snips from cygwin mailing list archive

Tom Rodman cygwin@trodman.com
Tue Sep 27 12:45:00 GMT 2005


I egrepped through (my local copy of) the cygwin archives back to ~3/2005 for:

  fork: No such file or directory
  died waiting for longjmp
  fork: Bad file
  fork: Resource temporarily unavai
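
For anyone wanting to repeat a search like this, here is a minimal sketch using grep -E. The function name and the archive directory are assumptions for illustration; only the four search strings come from the list above.

```shell
# Sketch: list archive files that mention any of the fork error strings.
# The directory passed in is a hypothetical local copy of the list archives.
fork_error_pattern='fork: No such file or directory|died waiting for longjmp|fork: Bad file|fork: Resource temporarily unavai'

find_fork_errors() {
  # $1: directory to search recursively; prints matching file names, sorted.
  grep -E -r -l -- "$fork_error_pattern" "$1" | sort
}

# Usage (the path is an assumption):
#   find_fork_errors "$HOME/cygwin-archive"
```

Dropping -l prints the matching lines themselves rather than just the file names.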

Please see the end of this post for snips from this grep. The
results suggest a fair number of users had the problem. I'd be
curious how many have convinced themselves their problems are gone;
can we identify which of the cases are solved?

We're still seeing these non-repeatable errors (now using the 9/22
snapshot on a development box). No test case is enclosed.

A fairly recent cygcheck.out for the computer having the problems is an
attachment in:

  http://sourceware.org/ml/cygwin/2005-09/msg00796.html

If you can show me our box is out of resources, that would be great.
See the attachment for RAM, a list of processes, and more. The box is a
Compaq ProLiant DL380 G3. It is behind a firewall in a datacenter; anti-virus
scanning is *not* real time, but scheduled weekly. The attachment is output
from:

  msinfo32 /report FILENAMEHERE /catagory systemsummary

-------------- next part --------------
A non-text attachment was scrubbed...
Name: msinfo32.out.gz
Type: application/octet-stream
Size: 28521 bytes
Desc: msinfo32 /report FILENAMEHERE /catagory systemsummary
URL: <http://cygwin.com/pipermail/cygwin/attachments/20050927/a11887ac/attachment.obj>
-------------- next part --------------
For us, in general the fork errors do not show up until quite a few cygwin
bash and perl scripts have already run OK. Once it happens, it may be
fairly repeatable until I stop all cygwin services and kill all cygwin
processes; then the fork errors are gone for a while again.
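
The service-stopping half of that workaround could be sketched as below. The function name is hypothetical, and feeding it from "cygrunsrv -L" assumes the services were installed with cygrunsrv. It only prints the stop commands so they can be reviewed first; killing the remaining cygwin processes is left out.

```shell
# Sketch: turn a list of Cygwin service names into stop commands (dry run).
# Feeding it from `cygrunsrv -L` is an assumption about the local setup.
plan_service_stops() {
  # Reads one service name per line on stdin; skips blank lines.
  while IFS= read -r svc; do
    [ -n "$svc" ] && printf 'cygrunsrv -E %s\n' "$svc"
  done
  return 0
}

# Usage: review the plan first, then pipe it to sh to run it.
#   cygrunsrv -L | plan_service_stops
#   cygrunsrv -L | plan_service_stops | sh
```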

I do most interactive CLI work in ssh bash login sessions. I see fewer
of these fork errors in 1.5.18, and in the Sept snapshots. 

Friday 9/23, I was debugging a fairly simple bash shell script, with
a few traps and functions. I had several vim sessions suspended,
and 2 ssh sessions. Here are some fork errors from Friday (using snapshot:
1.5.19s(0.138/4/2) 20050922):

  $ ci -l a_small_text_file
  -bash: fork: Resource temporarily unavailable
  $

  <snip>
  $ lt
  -bash: fork: Resource temporarily unavailable
  <snip>
  $ which lt
  lt is aliased to `cmd /c dir /od|d2u|egrep -v "<DIR> +RCS$|\.\.?$| [0-9]+ (Dir|File)"|tail -8'
  <snip>

  $ ./a_mediumsized_bashscript
  /adm/bin/win/service_restart02.shinc: fork: Resource temporarily unavailable
       10 [main] ? 0 fork_copy: cygheap for exec pass 0 failed, 0x6115A900..0x61160834, done 0, windows pid 2816, Win32 error 5

  # Another problem on Friday: an interactive ssh session for no apparent reason
  # just died.  This has happened in the past, see:
  #   http://sourceware.org/ml/cygwin/2005-07/msg01006.html
  #
  # The good news is this happens less than in 1.5.17.
  # For more detail on this see last ~30 lines of this post.
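
The "Win32 error N" suffix on these diagnostics is a standard Windows error code. Here is a small sketch that pulls the number out of a diagnostic line and maps it to its winerror.h name. The function names are hypothetical, and the table covers only the codes that appear in this post (5, 231, 487, 1450).

```shell
# Sketch: map the Win32 error numbers quoted in this post to their winerror.h names.
win32_error_name() {
  case "$1" in
    5)    echo 'ERROR_ACCESS_DENIED' ;;
    231)  echo 'ERROR_PIPE_BUSY' ;;
    487)  echo 'ERROR_INVALID_ADDRESS' ;;
    1450) echo 'ERROR_NO_SYSTEM_RESOURCES' ;;
    *)    echo 'unknown (not in this table)' ;;
  esac
}

# Extract the code from a Cygwin diagnostic line and look it up.
explain_win32_error() {
  code=$(printf '%s\n' "$1" | sed -n 's/.*Win32 error \([0-9][0-9]*\).*/\1/p')
  [ -n "$code" ] || { echo 'no Win32 error found'; return 1; }
  printf 'Win32 error %s: %s\n' "$code" "$(win32_error_name "$code")"
}

# Usage:
#   explain_win32_error 'fork_copy: ... windows pid 2816, Win32 error 5'
```

The name says what Windows reported, not why; it is a starting point for narrowing down the cause, nothing more.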

We continue to use 1.3.20 and it is just about problem free; it "just
works". I look forward to that level of stability.
Oh well, that's our end-user perspective. New features and speed are great,
but stability and reliability get the job done over and over. I've
been looking for a stable cygwin release since 1.5.10, and have not been
convinced. If that sounds arrogant, sorry, it's not meant to be; it
is honest. We use cygwin to control overnight windows software builds
with bash scripts, so when the build breaks we don't want cygwin blamed.

I appreciate all the cygwin developers' hard work, I love the tools!

We're in a position to upgrade cygwin on 9 servers, 3 of which
will be purchased in the next couple of months. :->

--
thanks again,
Tom Rodman

--
See below for the snippets { enclosed in curly
braces }.  This is an egrep of the last ~200 days of cygwin archives for:

  fork: No such file or directory
  died waiting for longjmp
  fork: Bad file
  fork: Resource temporarily unavai

--v-v------------------C-U-T---H-E-R-E-------------------------v-v-- 
{
  {
  According to David Arnstein on 8/18/2005 1:01 AM:
  > Frequently (but not always) I will launch a Cygwin command window
  > running bash; the new command window prints a message from bash:
  >
  > --------------------------------------------------------------------------
  >      10 [main] bash 1880 pinfo::wait: Couldn't create pipe tracker for
  > pid 3768,
  >
  >  Win32 error 231
  > bash: fork: Resource temporarily unavailable
  > --------------------------------------------------------------------------

  This is not a bug in bash, but a limitation in the number of running
  processes under Windows.  When cygwin cannot create a new process during a
  fork(), you will get this behavior.  Some people have reported success
  turning off (or swapping) antivirus programs, or other tricks to reduce
  the number of running processes on their system.  Other than that, I don't
  have any further ideas that might help you.

  - --
  Life is short - so eat dessert first!

  Eric Blake 
  }
  {
  > This is not a bug in bash, but a limitation in the number of running
  > processes under Windows.  When cygwin cannot create a new process during a
  > fork(), you will get this behavior.  Some people have reported success
  > turning off (or swapping) antivirus programs, or other tricks to reduce
  > the number of running processes on their system.  Other than that, I don't
  > have any further ideas that might help you.
  >

  I recently upgraded to XP on my laptop from 2000. It seems I can run only
  about a hundred processes instead of 200-300. Why is the limit so low? Is
  it adjustable? support and msdn don't seem to say.
  }
}

{
On Mon, 15 Aug 2005, Reinhard Nissl wrote:

> Michael Schaap wrote:
>
> > note that you *can* rename a running executable.
>
> But isn't renaming a running executable only a feature of recent Windows
> OSs?

FYI, even on WinXP, which does allow renaming a currently running program,
Cygwin will break, since it apparently relies on the application image
being present on-disk when forking.  So, while "mv /bin/bash.exe
/bin/bash1.exe" works, subsequent invocations of any command result in
"bash: fork: No such file or directory".
  Igor
}

{

> At 03:35 PM 4/12/2005, you wrote:
>
>> No joy. Tried replacing cygwin1.dll with the one from 1.5.15
>> (snapshot build); still get the same error.
>
> So are there any more error message lines other than just:
>
> /bin/bash: fork: Resource temporarily unavailable
> /bin/bash: line 1: /usr/bin/find: Resource temporarily unavailable
>
> ?

I've often found such error message to be happening when the machine in
question was actually running out of resources (i.e. some memory leak or
hog running under Windows for example).

}

{
On Thu, 18 Aug 2005, David Arnstein wrote:

> Frequently (but not always) I will launch a Cygwin command window running
> bash; the new command window prints a message from bash:
>
> --------------------------------------------------------------------------
>     10 [main] bash 1880 pinfo::wait: Couldn't create pipe tracker for pid
> 3768,
>
> Win32 error 231
> bash: fork: Resource temporarily unavailable
> --------------------------------------------------------------------------
>
> The bash shell in the command window is unusable. Commands typed into the
> shell return immediately, but have no effect.
>
> Sometimes, a Cygwin/bash command window will launch properly, but then the
> above error message appears in the middle of a lengthy command, such as
> "catman -w".
>
> I found a mailing list article that claimed that this problem could be solved
> by running "rebaseall." I did kill all Cygwin processes and services, and ran
> "rebaseall" from an ash shell. This did not solve my problem.
}

{
Date: Thu, 22 Sep 2005 11:19:06 +0200
From: PSP Blizz <pspblizz@gmail.com>
Reply-To: PSP Blizz <pspblizz@gmail.com>
To: Cygwin Mailinglist <cygwin@cygwin.com>
Subject: Autotools\gcc: "Resource temporarily unavailable" problem
<snip>
I've been trying for a month now to get the PSP toolchain to work
under my windows installation. It always crashes out in a message
"Resource temporarily unavailable".

After this it's not possible to do much in either windows or in
cygwin. Also when trying to restart my computer after such a crash,
windows has to kill "XPCOM Event viewer" to be able to shut down.

I've included two cygwin logs, one before the "crash" and one after.

The latter log generated the following error message:
$ cygcheck -s -v -r > cygcheck2.out
cygcheck: dump_sysinfo: GetVolumeInformation() failed: 1450
cygcheck: dump_sysinfo: GetVolumeInformation() failed: 1450
cygcheck: dump_sysinfo: GetVolumeInformation() failed: 1450

I've also included the last lines from the script that fails, please
remember that using autotools also crashes on any other code building,
not only the psptoolchain:

/tmp/pspdev/gcc-4.0.1/intl/configure: fork: Resource temporarily unavailable
/tmp/pspdev/gcc-4.0.1/intl/configure: fork: Resource temporarily unavailable
checking for setenv... /tmp/pspdev/gcc-4.0.1/intl/configure: line 5758: ${+=
set}:
 bad substitution
/tmp/pspdev/gcc-4.0.1/intl/configure: line 5761: /usr/bin/cat: Resource temporarily unavailable
/tmp/pspdev/gcc-4.0.1/intl/configure: line 5764: /usr/bin/cat: Resource temporarily unavailable
<snip>
My memory is tested both with DELLs tools and memtest86 without
finding anything wrong. And I've tried to kill all security tools
running as suggested by some on this mailinglist earlier (Virus,
Antispyware and firewall).

I've tried this both under the latest release and snapshot, and I've
seen in the mailing list that others have had similar problems. Any
help in where I should look next for a solution is greatly
appreciated!
}

{
Date: Fri, 19 Aug 2005 12:13:18 -0400
<snip>
Subject: Re: fork: Resource temporarily unavailable
<snip>

On Fri, Aug 19, 2005 at 11:13:51AM -0400, Christopher Faylor wrote:
> I've created a new snapshot which may work around this problem by trying
> again when this error is presented.  Could you give it a try?

Thank you Mr. Faylor. I have installed the cygwin1.dll dated August
19. If I see any relevant messages from Cygwin software, I will post
to this mail list. The problem I have is intermittent, so all I can do
is (stress) test the DLL from now to eternity.
}
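
The "try again" idea can also be approximated at the script level. The sketch below is only an illustration of retrying a failed command from a shell script; the function name and its arguments are hypothetical, and it says nothing about how the snapshot DLL implements its own retry.

```shell
# Sketch: retry a command a few times, pausing between attempts.
# This is a script-level stand-in, not what the snapshot DLL does internally.
retry() {
  # $1: max attempts, $2: seconds to sleep between attempts, rest: command.
  attempts=$1; delay=$2; shift 2
  n=1
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (the command shown is only an example):
#   retry 3 5 ci -l a_small_text_file
```

It retries on any non-zero exit status, so it cannot tell a fork failure from an ordinary command failure.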

{

>From: Oliver Vecernik
>Sent: 14 April 2005 10:41

> Hi,
>
> I'm not sure what happened, but if I start Cygwin for a certain
> restricted user I receive the following message:
>
>       9 [main] bash 3624 fork_parent: child 2756 died waiting for
> longjmp before initialization
> bash: fork: Bad file descriptor
> bash-2.05b$
>
> Everything worked fine, nothing was installed or changed in my
> configuration. Interestingly it works for other restricted users and for
> my admin account. Can anybody give me a clue what's going on?

  Well, what that message means is that for some reason, the child process
failed to notify the parent of its existence within a reasonable amount of
time after it was spawned.  That could mean that it died for some reason, or
it could mean that it was alive, but failed to notify the parent process.

  Hmm, would any of the lead developers care to comment on the accuracy of
this comment from the front of fork.cc/sync_with_parent ?

}

{

On Apr 14 11:40, Oliver Vecernik wrote:
> Hi,
>
> I'm not sure what happened, but if I start Cygwin for a certain
> restricted user I receive the following message:
>
>       9 [main] bash 3624 fork_parent: child 2756 died waiting for
> longjmp before initialization
> bash: fork: Bad file descriptor
> bash-2.05b$

Please test the latest snapshot from http://cygwin.com/snapshots/

I'm rather confident that it solves your problem.


Corinna
}

{

At 04:46 PM 7/1/2005, you wrote:
>Hi, all,
>
>While I restart cygwin after I failed to install
>NS2.27 or NS2.28, it shows,
>bash: /usr/bin/id: Resource temporarily unavailable
>      6 [main] bash 62528 fork_parent: child 62624
>died waiting for longjmp befo
>re initialization
>bash: fork: Bad file descriptor
>
>Then I restarted it, it shows
>bash: fork: Resource temporarily unavailable
>
>Your help is appreciated!


Well, I've reviewed your cygcheck output and didn't find anything there
which was obviously wrong.  I'm assuming that the version of 'bash' you
have installed now is the same version as the one you see this problem
with (i.e. bash 2.05b-17 and not the test version 3.0-2).  If that is
the case, then you've answered Brian's (very good) question.  You might
want to take a look at Eric Blake's message on a somewhat related thread:

<http://cygwin.com/ml/cygwin/2005-07/msg00054.html>

The last paragraph may be applicable to your situation.

}

Wed, 11 May 2005 {
I had that problem in the past on our old server and it happened after
certain updates and now it happened again on our new server after some
recent cygwin updates.
I have the sshd server installed as service on the server (Win 2000
server sp4, all latests patches installed). The server is running ISA
2000 and is used as proxy for internet.

After a reboot I am having now this strange sshd error message:
The description for Event ID ( 0 ) in Source ( sshd ) cannot be found.
The local computer may not have the necessary registry information or
message DLL files to display messages from a remote computer. sshd : PID
1828 : error: fork: Resource temporarily unavailable.

The strangest thing is that the sshd server works after a service
restart. And it used to work flawlessly.

I read in the mailing list about the possible lack of memory but the
server has 2go ram and 4 go virtual memory so I doubt that it is the
issue.
What other issues can be involved?

Thanks,

Henri.
}

{
Date: Thu, 21 Jul 2005 00:29:19 -0400
<snip>

After a new install of cygwin, the following errors are reported the
first time a bash shell console or xterm is opened. The same error
message is displayed after trying to run any application.
After this the console window or xterm can be closed and then reopened,
and the error is gone. After logging out or rebooting, the error appears
again.

    143 [main] bash 1704 fork_copy: linked dll data/bss pass 0 failed, 0x58D000..0x58D020, done 0, windows pid 1724, Win32 error 487
bash: fork: Resource temporarily unavailable
bash-3.00$ ftp
5555933 [main] bash 1704 fork_copy: linked dll data/bss pass 0 failed, 0x58D000..0x58D020, done 0, windows pid 1720, Win32 error 487
bash: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
bash-3.00$
}


more detail on dying ssh session
--v-v------------------C-U-T---H-E-R-E-------------------------v-v-- 
healthy example:

  ~ $ procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
    PID  PPID %CPU USER      START COMMAND
   1660     1  0.0 SYSTEM    08:04 /usr/bin/cygrunsrv
   1716  1660  0.0 SYSTEM    08:04   /usr/sbin/sshd -D
   6032  1716  0.0 SYSTEM    20:30     /usr/sbin/sshd -D -R
   6104  6032  0.0 scmcron   20:30       -bash
   6124  6104  0.3 scmcron   21:43         procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
   1064     1  0.0 SYSTEM    08:04 /usr/bin/cygrunsrv
   1112  1064  0.0 SYSTEM    08:04   /usr/sbin/cron -D
  ~ $ echo $$
  6104

PID 1716 has another sshd (6032) as its child. The
bash session (PID 6104) that ran procps was in
the ssh session fathered by 6032.
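
The parent chain described above can be traced mechanically from the PID and PPID columns. A minimal sketch follows; the function name is hypothetical, and it assumes the first two columns of the input are PID and PPID, as in the procps listing.

```shell
# Sketch: print the chain of parents for a PID, given "PID PPID ..." lines on stdin.
ancestors() {
  # $1: starting PID. Non-numeric lines (such as the header) are ignored.
  awk -v pid="$1" '
    $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { parent[$1] = $2 }
    END {
      chain = pid
      # Stop at a PID with no recorded parent; the hop limit guards against loops.
      for (hops = 0; (pid in parent) && hops < 64; hops++) {
        pid = parent[pid]
        chain = chain " <- " pid
      }
      print chain
    }'
}

# Usage:
#   procps -H -Ao pid,ppid,%cpu,user,bsdstart,args | ancestors "$$"
```

Each "<-" reads as "is a child of", so for the healthy listing the chain for 6104 runs back through 6032, 1716 and 1660 to PID 1.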

--
Broken example: sshd spontaneously "went belly up" [no error message],
the bash session was gone, and I had to kill the ssh client that
originated from the remote box.  Here's the procps snippet:

  ~ $ procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
  <snip>
  5796   732  0.0 SYSTEM    06:59     /usr/sbin/sshd -D -R
  7168  5796  0.0 bcm_root  06:59       <defunct>

