Setup 2.774 texlive postinstall takes 10+ hours)

Ken Brown kbrown@cornell.edu
Thu Nov 13 15:45:00 GMT 2014


On 11/13/2014 9:18 AM, Corinna Vinschen wrote:
> On Nov 11 12:53, Corinna Vinschen wrote:
>> On Nov 10 22:33, Yaakov Selkowitz wrote:
>>> On 2014-11-10 22:23, Yaakov Selkowitz wrote:
>>>> Dependency order of packages: libgcc1 base-cygwin cygwin dash tzcode
>>>> libstdc++6 terminfo sed gzip libpcre1 grep libreadline7 bash
>>>> libncursesw10
>>> [snip]
>>>
>>> Now that I think about it, regardless of libgcc1, that still doesn't make
>>> much sense.  sed, grep, and bash depend on libintl8, which depends on
>>> libiconv2, and libreadline7 (which is required by bash) itself depends on
>>> libncursesw10, so that should be at least two places earlier.  All of those
>>> dependencies are listed in setup.hint (and hence setup.ini), so is there
>>> something wrong with setup itself?
>>
>> What about dependency loops?
>>
>> AFAICS, coreutils depends on tzcode, tzcode depends on coreutils.  Both
>> depend on libgcc1.  This introduces a big problem in dependency
>> resolution because there's no unambiguous starting point.
>>
>> What if we remove the coreutils dep from tzcode?
>>
>> Or, actually, what if we make sure that Base packages only depend
>> on libs, but never on any other Base package?
>
> Ok, now after a colleague of mine informed me about the existence of
> tsort (*blush*), I could finally produce some more info about loops
> in our dependencies.  I wrote a simple script:
>
> awk '/^@ /{ left=$2; }
>       /^requires: /{ for (i=2; i<=NF; ++i) print left " " $i; }
>      ' < setup.ini | tsort
>
> It found the following dependency loops which have to be fixed.
> Most notable are the dep loops of _autorebase and _update-info-dir,
> which we'll fix ASAP.
>
>    GConf2 -> libgconf2_4 -> gconf-desktop-schemas -> GConf2
>
>    xf86-video-dummy -> xorg-server -> xf86-video-dummy
>
>    xf86-video-nested -> xorg-server -> xf86-video-nested
>
>    texlive -> texlive-collection-basic -> texlive
>
>    libautotrace3 -> libMagickCore5 -> libautotrace3
>
>    libgeoclue0 -> geoclue -> libgeoclue0
>
>    shared-mime-info -> libglib2.0_0 -> shared-mime-info
>
>    libatspi0 -> at-spi2-core -> libatspi0
>
>    libfam0 -> gamin -> libglib2.0_0 -> libfam0
>
>    gsettings-desktop-schemas -> libglib2.0_0 -> gsettings-desktop-schemas
>
>    libdbus1_3 -> dbus -> libdbus1_3
>
>    php-Archive_Tar -> php-PEAR -> php-Archive_Tar
>
>    autogen -> libopts-devel -> autogen
>
>    python-libxslt -> python-libxml2 -> python-libxslt
>
>    libopenldap2_4_2 -> libsasl2_3 -> libopenldap2_4_2
>
>    perl-Mozilla-CA -> perl-IO-Socket-SSL -> perl-Mozilla-CA
>
>    rubygems -> ruby-io-console -> ruby -> rubygems
>
>    ruby-rake -> rubygems -> ruby-rake
>
>    ruby-rdoc -> rubygems -> ruby-rdoc
>
>    ruby-rdoc -> rubygems -> ruby-io-console -> ruby -> ruby-rdoc
>
>    mingw64-i686-runtime -> mingw64-i686-gcc-core -> mingw64-i686-runtime
>
>    mingw64-x86_64-runtime -> mingw64-x86_64-gcc-core -> mingw64-x86_64-runtime
>
>    _autorebase -> rebase -> sed -> libintl8 -> libiconv2 -> libgcc1 -> _autorebase
>
>    _autorebase -> rebase -> coreutils -> libgmp10 -> _autorebase
>
>    tesseract-ocr-eng -> tesseract-ocr -> tesseract-ocr-eng
>
>    openmpi -> libopenmpi -> openmpi
>
>    _autorebase -> rebase -> grep -> libpcre1 -> _autorebase
>
>    _autorebase -> rebase -> grep -> bash -> libreadline7 -> libncursesw10 -> libstdc++6 -> _autorebase
>
>    _update-info-dir -> bash -> _update-info-dir
>
>    _update-info-dir -> bash -> coreutils -> _update-info-dir
>
>    _update-info-dir -> info -> _update-info-dir

Many of the dependency loops are harmless.  If two packages A and B are 
involved in a loop, and if they both provide postinstall scripts, then 
you can't be sure which script will run first.  So we only have to worry 
about those loops in which the order is important.

The real problem here is that the "requires" line in setup.ini is being 
used for two unrelated purposes.  The first one is to make sure that if 
package A requires package B in order to run properly, and if A is 
chosen for install, then so is B.  For this purpose, loops are not only 
harmless, they're sometimes necessary.  For example, the dependency loop 
between texlive and texlive-collection-basic is completely appropriate.
How else can we make sure that if one is chosen, then so is the other?
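
To illustrate why loops do no harm for this first purpose, here is a
rough Python sketch of install selection as a simple reachability walk
over "requires".  The function name select_for_install and the tiny
REQUIRES table are made up for illustration; I'm not claiming this is
how setup actually does it.

```python
def select_for_install(chosen, requires):
    """Return every package reachable from `chosen` via "requires" edges."""
    selected = set()
    pending = list(chosen)
    while pending:
        pkg = pending.pop()
        if pkg in selected:
            continue  # already visited, so a loop cannot trap the walk
        selected.add(pkg)
        pending.extend(requires.get(pkg, ()))
    return selected


# Hypothetical excerpt of the "requires" data, including the texlive loop.
REQUIRES = {
    "texlive": ["texlive-collection-basic"],
    "texlive-collection-basic": ["texlive"],
}
```

Choosing either package pulls in the other, and the walk terminates
because each package is visited only once.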

The second purpose is to determine the order of running postinstall 
scripts, and this is where loops are bad.  We need to rethink how 
postinstall order is determined.  What about just adding a provision for 
specifying postinstall dependencies, independent of the current 
"requires" line?  We've already discussed a couple of situations where 
this would be useful:

* base-cygwin needs to run first;
* autorebase should be run as early as possible.

A third one concerns texlive.  I could greatly speed up the texlive 
postinstall scripts if I had a package (maybe called "_texlive_post") 
that provided a script to be run after all other texlive scripts.
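
As a sketch of what I have in mind, suppose the postinstall
dependencies lived in their own table, separate from "requires".  The
ordering then becomes a plain topological sort.  The table POST_AFTER,
its contents, and the function name below are all hypothetical, and the
sketch assumes Python 3.9 or later for the standard graphlib module.

```python
from graphlib import TopologicalSorter


def postinstall_order(post_after):
    """Order scripts so each package runs after everything it lists.

    `post_after` maps a package to the packages whose postinstall
    scripts must run before its own.  A loop in this table raises
    graphlib.CycleError instead of producing an arbitrary order.
    """
    return list(TopologicalSorter(post_after).static_order())


# Hypothetical postinstall-only dependencies, independent of "requires".
POST_AFTER = {
    "_autorebase": ["base-cygwin"],
    "texlive": ["_autorebase"],
    "texlive-collection-basic": ["_autorebase"],
    "_texlive_post": ["texlive", "texlive-collection-basic"],
}
```

With this table, base-cygwin comes first, _autorebase second, and
_texlive_post last.  texlive and texlive-collection-basic still require
each other for installation, but since neither constrains the other's
postinstall order, there is no loop to resolve here.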

There's one final idea I'd like to throw out, possibly as an alternative 
to Achim's perpetual postinstall scripts: It would be useful to be able 
to specify that a certain package (such as _autorebase, or my proposed 
_texlive_post) should always be selected for *reinstall* whenever a 
package that depends on it is installed.
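
Roughly, and again only as a Python sketch with made-up names
(packages_to_reinstall, ALWAYS_RERUN, and the texlive -> _texlive_post
dependency are all assumptions on my part), the selection rule could
look like this:

```python
def packages_to_reinstall(being_installed, requires, always_rerun):
    """Return the trigger packages that should be marked for reinstall.

    A trigger is marked whenever a package being installed lists it
    directly in its "requires" line.
    """
    return {
        dep
        for pkg in being_installed
        for dep in requires.get(pkg, ())
        if dep in always_rerun
    }


# Hypothetical "requires" excerpt; _texlive_post does not exist yet.
REQUIRES = {
    "libpcre1": ["_autorebase"],
    "libgmp10": ["_autorebase"],
    "texlive": ["texlive-collection-basic", "_texlive_post"],
    "sed": ["libintl8"],
}

# Hypothetical set of packages flagged as "always reinstall".
ALWAYS_RERUN = {"_autorebase", "_texlive_post"}
```

This only looks at direct dependencies; whether indirect ones should
count as well is an open question.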

Ken

P.S. If there is support for any of my suggestions, I'll do all I can to 
help with the implementation.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


