Cygwin Filesystem Performance degradation 1.7.5 vs 1.7.7, and methods for improving performance

Christopher Faylor
Wed Sep 29 19:03:00 GMT 2010

On Wed, Sep 29, 2010 at 07:48:32PM +0200, Derry Shribman wrote:
> > Right.  Another way of looking at this is that the mount options offer
> > consistency.  The notion of setting an environment variable in Window A
> > to get one behavior and not settting it in Window B is, IMO, a support
> > nightmare and a recipe for end-user confusion.
>1) for applications that internally will set this setting there is nothing 
>confusing on non-consistant.

To be clear: we are definitely not going to create an interface (and I
use the term loosely) to Cygwin which requires a program to internally
use putenv() or setenv() to change behavior.  That is just a really bad
way to implement this type of thing since it is not how environment
variables are meant to be used.  Environment variables are meant to be
used to externally control the behavior of a program.

Consider what an application writer would have to do to get a fast stat.
They'd have to save the current version of the Cygwin environment variable
in a buffer and add something like " faststat" to it.  This presupposes
that the Cygwin environment variable contains nothing but space separated
tokens so it means that we will not be extending its format any time soon.

We could introduce a way to manipulate the Cygwin environment variable
but if we did that, we might as well just implement something like

>2) for users that want to set this: setting PATH can cause the same 
>application/bash script to behave completely differently. same goes for 
>LD_LIBRARY_PATH, SHELL, COMSPEC, TMPDIR etc. They can cause the same application 
>in different shells to behave differently. This causes confusion only for 
>end-users who touch things they dont understand what they do. If you dont know 
>what LD_LIBRARY_PATH is: dont touch it! Unix system does not try to protect 
>itself from ignorent end-users who touch things that they are not supposed to 
>touch (unlike GUI applications which try to). Nothing will protect against an 
>end user setting an incorrect PATH. If an end user does not know what PATH is: 
>he should not touch it!

You probably should let me worry about support issues.  I'm not worried
about esoteric things like what Unix systems try to do.  I'm concerned
about mailing list traffic.  If we introduce something that is confusing
to end users they will send email and it will increase the support
burden for everyone who volunteers to work on Cygwin.  I'm not against
using a philsophic reason as a justification for a change but I'm damn
well not going to introduce a change which causes confusion if there are
other, better ways to get the job done.

>>Or, another way of looking at this is, instead of implementing their
>>own potentially buggy, imprecise stat() they could have not thought of
>>Cygwin as a black box and either 1) offered improvements for the DLL or
>>2) engaged the Cygwin community with requirements.  If there is
>>ifdef'ed __CYGWIN__ code in git now that means that any performance
>>improvements that we (i.e., Corinna) has made will never be noticed and
>>that code will be maintained forever.
>And this is exactly what Yoni Londner is trying to do: He not only
>complained about performance: but gave a practical patch to for using
>setenv("CYGWIN") to solve the performance problems.

I didn't realize that Yoni Londner was a git developer.  When I do a
google search on "londner git" I don't see many hits so I'll have to
take your word for it.

>I am sure git developers were not happy to have to write their own
>version of stat() specially for __CYGWIN__.  But it seems here that the
>simple to implement setenv("CYGWIN", "no_ino no_nlink") is being
>rejected without any good reason.

You're apparently ignoring the reasons which have been put forth
already.  If we think we can get performance improvements without
requiring an environment variable then we will pursue those.

There were also several problems with the patch, the top two being: 1)
it was large enough to require a license assignment and 2) it was too
large to be accepted as is and needs to be broken up into smaller
pieces.  You should not be characterizing this as a stubborn refusal to
incorporate the patch since those two general points have already been
made.  And, Corinna has had more specific issues as well.

Also, if the project leads and other important Cygwin contributors think
that an environment variable is a bad idea then continuing to insist
that it isn't does not really advance the discussion.

>>So, you're trading ifdef __CYGWIN__ in git with lots of if's in the
>>very parts of Cygwin code path where people complain about slowness.
>The slowness of the cygwin filesystem calls do not come from if()'s in
>Cygwin's code.
>A typical CPU today can perform around 1,000,000,000 if()'s per second
>(around 1 nano second per if()).  While the 'cost' of WinNT system call
>is a minimum of 20,000ns, while many filesystem calls are much much

Yes, point taken.  I'm familiar with this argument.  People make it to
me all of the time.  "What's one more function call/if/for-loop?  CPUs
are FAST!" I think I've heard that argument hundreds, if not thousands
of times.

>So adding an if() to save a system call (or even 10 if()s...) - is
>always worth it.

I'm not talking about adding one if.  I'm talking about increasing the
code complexity and slowing down, however marginally, the code path for
the common use case.  If/when we add xstat() we'll have to be very
careful about how we do it.  Hopefully, we can implement it in such a
way as to avoid a tangled tree of ifs.


More information about the Cygwin-developers mailing list