This is the mail archive of the cygwin@sources.redhat.com mailing list for the Cygwin project.



RE: Optimizing away "ReadFile" calls when Make calls stat()


> -----Original Message-----
> From: DJ Delorie [mailto:dj@delorie.com]
> Sent: Tuesday, February 13, 2001 8:54 PM
> To: jik-cygwin@curl.com
> Cc: cygwin@cygwin.com
> Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()
> 
> 
> 
> > As I've noted separately, reading tens of thousands of files even once
> > incurs a significant performance penalty.
> 
> True, but reading them all once is better than reading them all twice.
> I'm trying to break the problem down into small enough changes that we
> actually have a chance of implementing them.
> 
> > The change I've proposed can eliminate reading them at all.
> 
> But not in a way that we can make it the default.  Perhaps you could
> propose a set of mount flags to optimize common situations?  We
> already have one to avoid the read-for-execute test, perhaps you could
> work on an assume-no-symlinks flag?  Then we wouldn't need a custom
> make.exe (or any other program).
> 
> > But it does nothing at all for the "usual case" I'm trying to
> > optimize, which is Make stat()ing a file but never reading it.
> 
> It does, because stat() reads the file twice, once to see if it's a
> symlink, and once to see if the executable bit needs to be set.
> 
> > >  These should be easier wins (thus, more doable) than a global cache,
> > >  which NT should be providing itself as part of the disk cache
> > >  subsystem (for local drives, at least).  I don't think it's
> > >  appropriate for cygwin to go beyond this anyway - too many race
> > >  conditions arise.
> > 
> > As far as I know, there are no race conditions in the change I
> > suggested.  In fact, it *removes* race conditions, since it reduces
> > the number of distinct OS operations that must be performed on a file
> > during stat().
> 
> Right, but others were suggesting a global cache of file bytes.
> *That* would introduce race conditions.
> 

Perhaps a solution would be to maintain what could be called a "partial"
stat() cache: a global cache of the results of ALL the ReadFile() calls
(which I think can easily be reduced to one per file), together with each
file's last-modified time.

stat() would then ALWAYS check the last-modified time (LMT) of the ACTUAL
file, then check the cache. If the cached entry is up to date, it returns
the execute/symlink flags found in the cache. If the entry is obsolete or
absent, it just re-reads the file's contents and saves the
LMT/exec/symlink values in the cache.
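
To make the idea concrete, here is a rough sketch of the lookup logic in
Python rather than in Cygwin's own C++, purely for brevity. It is not how
Cygwin implements stat(). The names (cached_flags, CacheEntry, stats), the
16-byte header read, and the header checks are all assumptions made up for
illustration; the open()/read() call merely stands in for ReadFile().

```python
import os
from dataclasses import dataclass

# Illustrative header checks only (assumptions, not Cygwin's real tests).
SYMLINK_MAGIC = b"!<symlink>"
EXEC_MAGICS = (b"#!", b"MZ")


@dataclass
class CacheEntry:
    mtime_ns: int        # LMT seen when the contents were last read
    is_symlink: bool
    is_executable: bool


_cache = {}              # path -> CacheEntry
stats = {"reads": 0}     # counts how often file contents were really read


def _inspect_contents(path):
    """Read the file header once and derive both flags from that one read."""
    stats["reads"] += 1
    with open(path, "rb") as f:
        head = f.read(16)
    return head.startswith(SYMLINK_MAGIC), head.startswith(EXEC_MAGICS)


def cached_flags(path):
    """Return (is_symlink, is_executable), reading the file only if needed."""
    # Always ask the real file for its LMT; this needs no content read.
    mtime_ns = os.stat(path).st_mtime_ns
    entry = _cache.get(path)
    if entry is not None and entry.mtime_ns == mtime_ns:
        return entry.is_symlink, entry.is_executable   # cache is up to date
    # Cache is obsolete or absent: re-read and remember the result.
    is_symlink, is_executable = _inspect_contents(path)
    _cache[path] = CacheEntry(mtime_ns, is_symlink, is_executable)
    return is_symlink, is_executable
```

With something like this, a second cached_flags() call on an unmodified
file costs one metadata lookup and no content read. Note that the sketch
trusts the LMT completely: a file rewritten within the timestamp
granularity of the filesystem would not be noticed.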

The only race condition would be when UPDATING the cache (reading is no
problem if we first change the exec/symlink flags and then update the LMT);
this should be simple to handle.

Regrettably, I don't have time to look at this (and don't know how it is
actually implemented now), but this should provide quite a big win for
Cygwin.

Regards,

	Bernard

--------------------------------------------
Bernard Dautrevaux
Microprocess Ingenierie
97 bis, rue de Colombes
92400 COURBEVOIE
FRANCE
Tel:	+33 (0) 1 47 68 80 80
Fax:	+33 (0) 1 47 88 97 85
e-mail:	dautrevaux@microprocess.com
		b.dautrevaux@usa.net
-------------------------------------------- 
