Cygwin build system SOOOO SLOOOWWWW ???
Igor Pechtchanski
pechtcha@cs.nyu.edu
Thu Sep 15 14:55:00 GMT 2005
On Thu, 15 Sep 2005, Jan Schormann wrote:
> Let's see ...
>
> > 1) How can I tell what Cygwin is doing? Is there a tool that will
> > tell me what tool is actually running at any given time? Is there
> > any way to tell what Cygwin is doing down in its guts? Does anyone
> > have any other suggestions as to how I might get to the
> > bottom of this?
>
> Below, I'll tell about some suspicions I have about what cygwin might
> actually be doing. To your question, I can offer two ideas:
>
> - "top" or any Windoze Process Explorer more sophisticated than
> the task manager
> - "strace" - I haven't ever used it, but from what I know
> this will definitely give you an answer - maybe too much of it ;-)
You can give strace command-line options to show only the kinds of events
you want... See the strace help (or
<http://cygwin.com/cygwin-ug-net/using-utils.html#strace>).
> > 2) Has anyone else experienced speed problems with Cygwin? Has
> > anybody else felt that Cygwin has gotten slower over the last
> > year or so? Are there any guidelines or "tricks" for getting
> > Cygwin to run faster?
>
> a) Forking is more expensive in Windoze.
> On Unix, especially in make environments, you'll often start new
> processes as you're going - and often you'll not even notice. Google
> for "bash tricks" on how to fork less often.
Forking is expensive in Cygwin rather than in Windows as such: Windows has
no native fork(), so Cygwin has to emulate it (this is especially costly
if you fork off a Windows process, since Cygwin creates a stub for that).
<http://cygwin.com/acronyms/#PTC>.
> Hint: Don't use "sed" in `backticks` just for simple string
> replacements.
> Much of this can be done in make or bash directly.
> Look at the changes you made - maybe you thought it's more elegant?
FWIW, much of this can be done directly in "make". :-)
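For instance, a simple string replacement that might otherwise go through
"sed" in backticks can be left to make's built-in text functions. A minimal
sketch, assuming GNU make is on the PATH; the variable names and the
throwaway Makefile are made up for illustration:

```shell
#!/bin/bash
# Hypothetical demo: string edits done by GNU make itself, no sed involved.
demo_dir=$(mktemp -d)

# The recipe sits on the rule line after ';' so the Makefile needs no tab.
cat > "$demo_dir/Makefile" <<'EOF'
SRCS := foo.c bar.c
# Substitution reference: swap the .c suffix for .o on every word.
OBJS := $(SRCS:.c=.o)
# $(subst from,to,text): plain substring replacement.
NAME := $(subst b,x,blub)
all: ; @echo $(OBJS) $(NAME)
EOF

make_output=$(cd "$demo_dir" && make -s)
echo "$make_output"
rm -rf "$demo_dir"
```

Both edits happen while make reads the Makefile, so no extra process is
started for them.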
> b) This is especially true for shells.
> I'm not really sure on when and where this hits, but under certain
> circumstances, bash needs to parse /etc/passwd when it starts. Do
> you create /etc/passwd from an LDAP directory using mkpasswd?
*bash* itself never parses /etc/passwd. Cygwin does -- every Cygwin
process looks at /etc/passwd on startup. The first Cygwin process
actually reads it, and the rest simply check whether it changed.
However, that's just a file stat -- it doesn't actually query the domain
or LDAP directory (at least after the first invocation -- it does query
the current user then, but I don't think it does that for all users).
> Maybe you have hired some more people last year and it got longer?
> Hint: Try whether it makes a difference if you replace /etc/passwd
> with one that contains only the local users (look at the options for
> mkpasswd).
This shouldn't make a difference for multiple forks.
> c) /bin/sh is now bash, which is now dynamically linked.
> Up until a few months ago, /bin/sh has been "ash", a smaller, but
> less powerful shell. This has been replaced by bash, to reduce the
> traffic of repeated questions along the lines of "why does my shell
> act different than on linux" (where /bin/sh is bash on most
> distributions).
> If I understood the traffic on this list correctly, bash is now
> dynamically linked, which might have an impact on starting it - I can't
> tell.
It shouldn't. The DLLs are in memory, so any subsequent invocation of
bash will load the cached versions (Windows does that automatically).
> Hint: Don't start bash so often. Create fewer processes, but if you
> must, see if you gain by using ash explicitly instead of bash.
>
> To the gurus - is the following correct?
> `echo blub` starts one process, `echo blub | sed -e 's/b/x/g'`
> starts three: "echo", "sed", and "bash" to implement the pipe.
I'm far from a guru, but let me take a shot at answering this:
If you're talking about running those from bash, then "echo blub" doesn't
start *any* processes -- "echo" is a bash builtin. If you're asking about
make, make will start /bin/sh to execute the "echo" command.
"echo blub | sed -e ..." will start 1 process from bash, and 2 from make
(the 1 extra process is "sed" -- no process is created for the pipe).
FYI, "BLAH=blub; echo ${BLAH//b/x}" will not spawn *any* processes when
run directly from bash.
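To check this on your own machine, here is a small sketch (the variable
names are made up). "type -t" reports whether bash resolves a command name
to a builtin or to a file on disk, and the ${var//pattern/replacement}
expansion does the same job as the sed pipeline inside the shell:

```shell
#!/bin/bash
# Sketch: replace characters in a string without spawning sed.
word=blub

# Forking version: command substitution plus an external sed process.
via_sed=$(echo "$word" | sed -e 's/b/x/g')

# In-shell version: bash pattern substitution, every "b" becomes "x".
via_bash=${word//b/x}

# "type -t" prints "builtin" for commands that bash runs itself.
echo_kind=$(type -t echo)

echo "$via_sed $via_bash $echo_kind"
```

One caveat: the bash manual describes each command in a pipeline as running
in a subshell, so even a builtin placed before a "|" may still cost a fork.
The parameter expansion avoids the pipeline altogether.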
> d) Beware of lazy evaluation.
> Look at this construct:
> CFLAGS=$(shell find . -type d -name include)
> Read "info make" on setting variables and find out about the
> difference between "=" and ":=". The above will run the find
> again for every single call to the compiler. Along with the
> issues about forking and reading directories and small files,
> this can make a difference of *ages*.
> Hint: See whether you can use fewer variables, use ":=" more often,
> etc. - and don't use "$(shell ...)" anyway, as stated in a).
> Rather, pre-compute makefiles with all the data hardcoded, using
> ":=".
That's sound advice.
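The effect is easy to measure. A minimal sketch, assuming GNU make is
installed; the log files, variable names, and the fake "cc" recipe are
invented for the demo. Each $(shell ...) appends a line to a log when it
runs, so the line counts show how often make evaluated it:

```shell
#!/bin/bash
# Hypothetical demo: how often does $(shell ...) run under "=" versus ":="?
demo_dir=$(mktemp -d)

cat > "$demo_dir/Makefile" <<'EOF'
# Recursively expanded: the right-hand side is re-run on every reference.
LAZY = $(shell echo run >> lazy.log; echo -Iinclude)
# Simply expanded: the right-hand side is run once, when this line is read.
EAGER := $(shell echo run >> eager.log; echo -Iinclude)

all: a.o b.o c.o
# Stand-in for three compiler calls; recipe follows ';' so no tab is needed.
a.o b.o c.o: ; @echo cc $(LAZY) $(EAGER) -c $@ > /dev/null
EOF

(cd "$demo_dir" && make -s)

# Count the log lines: one line per evaluation of the $(shell ...) call.
lazy_runs=$(( $(wc -l < "$demo_dir/lazy.log") ))
eager_runs=$(( $(wc -l < "$demo_dir/eager.log") ))
echo "lazy: $lazy_runs  eager: $eager_runs"
rm -rf "$demo_dir"
```

With three "compiler" calls, the "=" variable should trigger three shell
runs and the ":=" variable only one. Swap the echo for a real "find" over a
large tree and the gap grows with every object file.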
> e) Reading lots of small files seems more expensive on Windoze.
> I don't know about your Makefiles, but traditionally, makefiles are
> spread across project directories (for build hierarchies), and
> makedepend creates even more of that. For one of our applications, I
> roughly calculated that make needs to open, read, and parse well over a
> thousand files (not counting the source or objects or any such thing,
> just the makefiles), just for telling you that all the targets are up to
> date.
> Hint: Phew ...
>
> You see, for our configurations, running make to tell me that *nothing*
> has changed could take up to half an hour. Therefore we introduced some
> magic using Python to generate and split up makefiles two years ago, and
> were down below five minutes again.
If you're using make recursively, google for the evils of recursive make
(Peter Miller's paper "Recursive Make Considered Harmful" is the usual
starting point). If not, please disregard this.
> This is nothing compared to the link time of well over 15 minutes, so
> we started to convert to DLLs for development (released applications
> are still supposed to be linked statically, as they only run on
> dedicated machines). We're currently trying to replace the whole build
> chain by a single daemon written in a decent language - hoping (i) that
> we need only one process for the actual rule system etc., and will only
> start additional processes for the compiler and linker; and hoping (ii)
> that the actual rule set will be much easier to debug. (You know,
> developers come to me and say "but I've only touched this little cpp and
> now everything's getting compiled again and ..." - how do I know what
> really happened?)
>
> > Thanks in advance for any feedback that might help me speed up my
> > builds.
>
> Let's see whether my hints are any good, but you're welcome anyway :-)
HTH,
Igor
--
http://cs.nyu.edu/~pechtcha/
Igor Pechtchanski, Ph.D.
pechtcha@cs.nyu.edu | igor@watson.ibm.com