emacs and large-address awareness under recent snapshots

Mon Aug 8 21:17:00 GMT 2011

On 8/8/2011 4:16 PM, Ken Brown wrote:
> On 8/8/2011 2:20 PM, Achim Gratz wrote:
>> Corinna Vinschen<...>   writes:
>>> still tries to work around some old problem in the Cygwin sbrk
>>> implementation in Cygwin 1.5.  Unfortunately the comment doesn't contain
>>> any hint as to what exact problem this code is trying to work around.
>>
>> Apologies if that's obvious and you've already checked that: emacs gets
>> created as a dumpfile of temacs during build, so if peflags moves the
>> heap retroactively thereafter I can't see how it's going to work since
>> part of the heap is where it was during dumping and the rest is, well,
>> somewhere else.  I'd look at the build process first before suspecting
>> the sources -- I would assume that temacs must also be made large address
>> aware and that it right now just isn't.  There may still be workarounds
>> that aren't needed anymore and bad assumptions about what the memory map
>> looks like in Cygwin.
>
> Thanks for the suggestion, but that doesn't seem to be the issue.  I
> just tried building emacs with LDFLAGS=-Wl,--large-address-aware.  That
> should have made temacs and the dumpfile large address aware.  The
> result was that the build didn't finish.  bootstrap-emacs.exe compiled a
> bunch of .el files and then started spinning its wheels, just as in my
> report earlier in this thread.  Attaching gdb and getting a backtrace, I
> again found that emacs was stuck in morecore_nolock, called from
> _malloc_internal_nolock.
>
> Corinna, here's some explanation of the above (and of unexec, which you
> were wondering about.)  The build process for emacs first compiles the C
> source files into an executable temacs.exe, which has no editing
> commands.  It then runs temacs.exe, which loads some lisp files to set
> up the editing environment and then dumps itself as emacs.exe.  The
> dumping is done by unexec, which is defined in unexcw.c.  I think that
> the data in the static heap (from sheap.c) is part of what gets dumped,
> so emacs defines a special version of sbrk (called bss_sbrk) that
> simulates sbrk but uses the static heap instead of the ordinary
> application heap.
>
> I don't think emacs is trying to work around problems in Cygwin's sbrk.
> In fact, emacs.exe, as opposed to temacs.exe, does use Cygwin's sbrk.
> You can see this in the function __default_morecore in gmalloc.c,
> which calls bss_sbrk if emacs.exe hasn't yet been dumped (i.e., if
> temacs.exe is running) and Cygwin's sbrk otherwise.
>
> I hope this all makes sense and is correct.  It may or may not be
> relevant to figuring out what goes wrong when the ordinary heap starts
> at 0x80000000.
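
To make the dispatch described in the quoted message concrete, here is a
rough sketch of the idea only.  It is not the code from gmalloc.c or
sheap.c: the names sketch_morecore, sketch_static_sbrk,
sketch_runtime_sbrk and the "dumped" flag are made up for illustration,
and a second fixed array stands in for the heap that Cygwin's sbrk would
provide.

```c
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 4096

/* Stand-in for the static heap that gets dumped into emacs.exe. */
static char static_arena[ARENA_SIZE];
/* Stand-in for the ordinary application heap (a real build would
   call the system sbrk here instead). */
static char runtime_arena[ARENA_SIZE];
static size_t static_used, runtime_used;

/* Hypothetical flag: 0 while "temacs" runs, 1 after dumping. */
int dumped = 0;

/* Bump allocator shared by both stand-in sbrk functions.
   Returns NULL if the request is negative or does not fit. */
static void *
bump (char *arena, size_t *used, ptrdiff_t incr)
{
  void *p;
  if (incr < 0 || (size_t) incr > ARENA_SIZE - *used)
    return NULL;
  p = arena + *used;
  *used += (size_t) incr;
  return p;
}

void *
sketch_static_sbrk (ptrdiff_t incr)
{
  return bump (static_arena, &static_used, incr);
}

void *
sketch_runtime_sbrk (ptrdiff_t incr)
{
  return bump (runtime_arena, &runtime_used, incr);
}

/* Before dumping, memory comes from the static arena; afterwards,
   from the runtime heap. */
void *
sketch_morecore (ptrdiff_t incr)
{
  return dumped ? sketch_runtime_sbrk (incr) : sketch_static_sbrk (incr);
}

/* Helper for checking which arena a pointer came from. */
int
from_static_arena (const void *p)
{
  uintptr_t a = (uintptr_t) p;
  uintptr_t lo = (uintptr_t) static_arena;
  return a >= lo && a < lo + ARENA_SIZE;
}
```

The point of the sketch is just that the same allocator entry point hands
out addresses from two unrelated regions depending on whether the dump
has happened, so any arithmetic that assumes one contiguous heap has to
cope with both.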

I built a debug version of emacs, set it for large address awareness, 
let it run for a while, and then attached gdb to it.  It turns out that 
it was stuck in an infinite loop at lines 701-703 of gmalloc.c, with 
newsize = 0:

do
   newsize *= 2;
while ((__malloc_size_t) BLOCK ((char *) result + size) > newsize);
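
The reason this cannot terminate is simply that doubling zero gives zero,
so the loop condition never changes.  Here is a small standalone sketch
of that behaviour; grow_until and its iteration cap are hypothetical and
not part of gmalloc.c, and "needed" stands in for the BLOCK (...)
expression above.

```c
#include <assert.h>
#include <stddef.h>

/* Double newsize until it is at least "needed", as in the loop
   above, but give up after max_iter passes.  Returns the number of
   doublings performed, or -1 if the cap was reached (the unguarded
   loop would spin forever in that case). */
int
grow_until (size_t newsize, size_t needed, int max_iter)
{
  int n = 0;
  assert (max_iter > 0);
  do
    {
      if (n == max_iter)
        return -1;
      newsize *= 2;	/* 0 * 2 is still 0 */
      n++;
    }
  while (needed > newsize);
  return n;
}
```

With these definitions, grow_until (8, 100, 64) returns 4 (8 -> 16 -> 32
-> 64 -> 128), while grow_until (0, 100, 64) hits the cap and returns -1.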

My guess now is that there was some invalid pointer arithmetic somewhere 
that led to this, but I don't have time at the moment to look for it. 
I'll do it later (or tomorrow) if no one beats me to it.

Ken
