This is the mail archive of the crossgcc@sourceware.org mailing list for the crossgcc project.




Re: Poor performance on software cross-compiled for MinGW


Bill Gatliff wrote:
Toralf Lund wrote:
I just discovered that the output from my Linux-hosted MinGW gcc cross compiler has performance problems. I have some code that runs 4-5 times faster on Linux (Red Hat Enterprise 4), built with the standard native compiler and the same options as the MinGW build, than the cross-compiled code does on Windows. The hardware is identical and the job consists mainly of raw processing, so I'm inclined to blame the compiler rather than OS differences or similar. I'm not using any -O... flags at this time.

Cross compiler version is 3.4.2, with http://surfnet.dl.sourceforge.net/sourceforge/mingw/gcc-3.4.2-20040916-1-src.diff.gz applied and otherwise built using the standard procedure, if there is such a thing. I'll probably write up all the gory details later, but thought I might send a quick post first just to ask for ideas about where to start looking for the cause of the performance gap.

If the cross and native compilers are the same versions, then their assembly language output should be virtually identical. Diff the asm for one of your hotspot functions, and see if there are major differences.
Good idea. I should have thought of that.

Now, doing this, I have established that the actual computations are identical, in the sense that both versions use the same mul/div/sub/add instructions. So a difference in CPU or FPU type setup does not seem to be the problem.

On the other hand, there are some differences in the way the stack and registers are used on function calls etc., and I think maybe the MinGW variant addresses memory in ways that will make it somewhat slower.

For instance, these assembly lines from the MinGW C++ compiler:

__ZNK6IMBblk11getPixelRowERSt6vectorIiSaIiEEi:
   pushl    %ebp
   movl    %esp, %ebp
   pushl    %ebx
   subl    $116, %esp
   movl    8(%ebp), %eax
   movl    (%eax), %eax
   addl    $8, %eax
   movl    %eax, (%esp)
   call    __ZNK12IMBreferenceI10IMBblkInfoEptEv
   movl    4(%eax), %eax
   movl    %eax, 4(%esp)
   movl    12(%ebp), %eax
   movl    %eax, (%esp)
   call    __ZNSt6vectorIiSaIiEE6resizeEj
   movl    12(%ebp), %eax
   movl    %eax, (%esp)
   call    __ZNSt6vectorIiSaIiEE5beginEv
   movl    %eax, -12(%ebp)
   movl    8(%ebp), %eax
   movl    %eax, (%esp)
   call    __ZNK6IMBblk11getDataCharEv

have the following Linux equivalent:


_ZNK6IMBblk11getPixelRowERSt6vectorIiSaIiEEi:
.LFB3488:
   pushl    %ebp
.LCFI830:
   movl    %esp, %ebp
.LCFI831:
   pushl    %ebx
.LCFI832:
   subl    $100, %esp
.LCFI833:
   subl    $8, %esp
   subl    $4, %esp
   movl    8(%ebp), %eax
   movl    (%eax), %eax
   addl    $8, %eax
   pushl    %eax
.LCFI834:
   call    _ZNK12IMBreferenceI10IMBblkInfoEptEv
   addl    $8, %esp
   pushl    4(%eax)
   pushl    12(%ebp)
.LCFI835:
   call    _ZNSt6vectorIiSaIiEE6resizeEj
   addl    $16, %esp
   leal    -12(%ebp), %eax
   subl    $8, %esp
   pushl    12(%ebp)
   pushl    %eax
   call    _ZNSt6vectorIiSaIiEE5beginEv
   addl    $12, %esp
   subl    $12, %esp
   pushl    8(%ebp)
   call    _ZNK6IMBblk11getDataCharEv


I think this corresponds to the following lines of code:


void IMBblk::getPixelRow(std::vector<int> &pixels, int row) const
{
 pixels.resize(store->blkInfo->width);

 std::vector<int>::iterator p=pixels.begin();
 char *data=getDataChar();
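
A reduced test case can make this comparison easier to repeat. The sketch below is not the real IMBblk: MiniBlk, BlkInfo and the copy loop are hypothetical stand-ins that only mirror the shape of the fragment above (resize, begin, fetch the data pointer). Compiling it with -S, once with the native g++ and once with the cross compiler's g++ driver, should give two small assembly listings that are easy to diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real block-info structure; only the
// field used by the quoted fragment is modelled.
struct BlkInfo {
    unsigned width;
};

// Hypothetical stand-in for IMBblk, reduced to the calls seen in the
// assembly listings: resize(), begin() and getDataChar().
class MiniBlk {
public:
    MiniBlk(unsigned width, const char *data) : data_(data) { info_.width = width; }

    // Same call pattern as the quoted fragment. The copy loop at the end
    // is an assumption about what the rest of the function might do.
    void getPixelRow(std::vector<int> &pixels, int row) const
    {
        assert(row >= 0);
        pixels.resize(info_.width);

        std::vector<int>::iterator p = pixels.begin();
        const char *data = getDataChar() + static_cast<std::size_t>(row) * info_.width;

        for (unsigned i = 0; i < info_.width; ++i)
            *p++ = data[i];
    }

private:
    const char *getDataChar() const { return data_; }

    BlkInfo info_;
    const char *data_;
};
```

Because no -O flag is in use, none of the small std::vector member functions are inlined, so every one of them shows up as a separate call in both listings.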


Maybe you expect such differences because of platform-specific calling conventions etc., though, and I somehow doubt that they explain the performance gap. So I guess I have to keep looking...
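
One way to keep looking is to time the call-heavy pattern in isolation on both platforms. The following is a minimal sketch under stated assumptions: timeVectorCalls and TimedRun are hypothetical names, the workload only imitates the resize/begin/fill pattern from the fragment above, and it uses <chrono>, which needs a C++11 compiler (newer than the GCC 3.4.2 discussed here, where std::clock from <ctime> would be the substitute).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Result of one timed run: elapsed wall-clock seconds plus a checksum
// that keeps the compiler from discarding the work.
struct TimedRun {
    double seconds;
    long checksum;
};

// Hypothetical workload: repeats the resize/begin/fill pattern so that
// std::vector call overhead dominates the measurement.
TimedRun timeVectorCalls(std::size_t iterations, std::size_t width)
{
    assert(width > 0);

    std::vector<int> pixels;
    long checksum = 0;

    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();

    for (std::size_t i = 0; i < iterations; ++i) {
        pixels.resize(width);
        std::vector<int>::iterator p = pixels.begin();
        for (std::size_t j = 0; j < width; ++j)
            *p++ = static_cast<int>(j);
        checksum += pixels[width - 1];  // last element is width - 1
        pixels.clear();                 // force resize() to do work again
    }

    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    TimedRun result = { elapsed.count(), checksum };
    return result;
}
```

Calling, say, timeVectorCalls(1000000, 512) from a small main() built with each compiler and printing the seconds field would show whether this pattern alone reproduces a gap of the size reported. If it does not, the cause probably lies elsewhere in the program.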


- T

