This is the mail archive of the crossgcc@sources.redhat.com mailing list for the crossgcc project.
See the CrossGCC FAQ for lots more information.
Christopher Bahns wrote:
>
> Hello Chris (and Scott),
>
> Regarding a CrossGCC post you made back in January, I too have to deal
> with size increases going from MRI to GNU. In my case the MRI build is
> about 94k (of 96k available) and the GNU build is about 107k, which is
> over the limit on my flash. If I can just get it close then I can look
> into other methods to reduce it further, but at the moment that would be
> hopeless.

One place to start reducing the code size is the volatile handling.
Currently GCC generates RISC-like load-change-store sequences (with a
possible reload) for everything declared as volatile. For example the
following code (probably not quite sane for m68k) :

----------------------------- clip ------------------------------------
#include <stdio.h>

#define ISR_FUNC __attribute__((interrupt))

#define PBDR     (*(volatile char *) (0xffffd6))
#define ITU_TSR0 (*(volatile char *) (0xffff67))
#define UART     (*(volatile int *)  (0xffff80))
volatile int count;
extern volatile int flag;

/*
#define PBDR     (*(char *) (0xffffd6))
#define ITU_TSR0 (*(char *) (0xffff67))
#define UART     (*(int *)  (0xffff80))
int count;
extern int flag;
*/

void ISR_FUNC
handle_intr(void)
{
    count++;
    if (count == 400) {
        PBDR ^= 0x01;
        count = 0;
        flag = 0;
    }
    UART = 0;
    UART = 1;
    ITU_TSR0 &= 0xFE;
}
----------------------------- clip ------------------------------------

generates the following assembly output with optimization '-O' :

----------------------------- clip ------------------------------------
        .file   "isr_demo1.c"
gcc2_compiled.:
.text
.globl handle_intr
        .type    handle_intr,@function
handle_intr:
        move.l %a0,-(%sp)       <---- a0 and d0 will be used
        move.l %d0,-(%sp)       <---- so push them onto the stack
        move.l count,%d0        <---- load count
        addq.l #1,%d0           <---- increment
        move.l %d0,count        <---- store it back
        move.l count,%d0        <---- reload count
        cmp.l #400,%d0
        jbne .L3
        move.l #16777174,%a0    <---- set PBDR address
        move.b (%a0),%d0        <---- load from it
        eor.b #1,%d0            <---- toggle the bit
        move.b %d0,(%a0)        <---- store it back
        clr.l count             <---- why no "load-store"
        clr.l flag              <---- for these then ?
.L3:
        move.l #16777088,%a0
        clr.l (%a0)             <---- "UART = 0"
        moveq.l #1,%d0
        move.l %d0,(%a0)        <---- "UART = 1"
        lea (-25,%a0),%a0       <---- set ITU_TSR0 address
        move.b (%a0),%d0        <---- load
        and.b #-2,%d0           <---- change
        move.b %d0,(%a0)        <---- store it back
        move.l (%sp)+,%d0
        move.l (%sp)+,%a0
        rte
.Lfe1:
        .size    handle_intr,.Lfe1-handle_intr
        .comm   count,4,2
        .ident  "GCC: (GNU) 2.95.2 19991024 (release)"
----------------------------- clip ------------------------------------

while not using the volatile (using the commented part instead) will
give with optimization :

----------------------------- clip ------------------------------------
        .file   "isr_demo2.c"
gcc2_compiled.:
.text
.globl handle_intr
        .type    handle_intr,@function
handle_intr:
        move.l %d0,-(%sp)       <--- only d0 will be used
        addq.l #1,count         <--- increment count
        cmp.l #400,count
        jbne .L3
        eor.b #1,16777174       <--- toggle PBDR bit
        clr.l count
        clr.l flag
.L3:
        moveq.l #1,%d0          <--- "UART=1" !!!!!
        move.l %d0,16777088
        and.b #254,16777063     <--- clear the ITU_TSR0 bit
        move.l (%sp)+,%d0       <--- restore d0
        rte
.Lfe1:
        .size    handle_intr,.Lfe1-handle_intr
        .comm   count,4,2
        .ident  "GCC: (GNU) 2.95.2 19991024 (release)"
----------------------------- clip ------------------------------------

As we can see, now the code looks more sane, BUT the "UART=0" was
'optimized' away!!! This is just what the volatile is meant to prevent,
and leaving the volatile out is the reason for this 'lost code'...

In an embedded app that does quite a lot of I/O operations, declaring
the ports as volatile is quite common, and it ensures that no accesses
get lost... But as far as I understand, all those direct operations on
memory locations could still be used even though the locations were
declared as volatile.
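Until the compiler does this by itself, the extra reload of 'count' can at least be avoided in the source. The following is only a sketch, not a fix for GCC: it reads the volatile counter once into a local, works on the local and writes it back once. The function name handle_tick and the pointer parameters are assumptions made for this example so that it can be compiled and run on a host; on the target the fixed-address macros from the listing above would be used instead.

```c
/* Host-runnable sketch: the register addresses are passed in as pointers
   (an assumption of this example) instead of the fixed-address macros. */
volatile int count;          /* shared with the main program */
volatile int flag = 1;       /* defined here only so the sketch links */

/* handle_tick is a hypothetical name; on the target this body would sit
   inside the interrupt handler. */
void handle_tick(volatile char *pbdr, volatile int *uart)
{
    int c = count + 1;       /* single volatile read of count */

    if (c == 400) {
        *pbdr ^= 0x01;       /* toggle the port bit */
        c = 0;
        flag = 0;
    }
    count = c;               /* single volatile write of count */

    *uart = 0;               /* both stores stay in the code, in order, */
    *uart = 1;               /* because *uart is volatile-qualified     */
}
```

Note that this is not exactly the same as the original: when the counter wraps, the original stores 400 and then 0 into 'count', while the sketch stores only the 0. If something else really needs to see the intermediate value, this rewrite is not suitable.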
So, why is it necessary to use the :

        move.l count,%d0        <---- load count
        addq.l #1,%d0           <---- increment
        move.l %d0,count        <---- store it back
        move.l count,%d0        <---- reload count

for a volatile instead of the :

        addq.l #1,count         <--- increment count

???? I haven't seen any sane explanation for this 'feature'...

This 'feature' has been discussed now and then since the last
millennium, but there seem to be 'more important' things to fix in GCC.
The embedded users are a minority and mostly only they need to use the
volatile. Unfortunately this bug seems to be in the GCC core and fixing
it will not be easy for a novice, not even for a more experienced person
(I know some quite qualified people who have tried to look at this).

However, I haven't checked whether this bug is still there in the
current GCC/egcs snapshots... Has anyone done this ? So we can only try
to remind the GCC experts about this 'bug' now and then...

> Scott,
> I also noticed that the run time library was much bigger.

The newlib function sizes have also been a problem. Targets like
H8/300, AVR, M68HC11, MN10200 etc. could benefit from a simpler
C library, especially from a 'minimal' printf(). Some possibilities
exist in newlib, like using the integer-only 'iprintf()' everywhere :

----------------------------- clip ------------------------------------
iprintf--write formatted output (integer only)

Synopsis

     #include <stdio.h>

     int iprintf(const char *format, ...);

Description

iprintf is a restricted version of printf: it has the same arguments
and behavior, save that it cannot perform any floating-point
formatting: the f, g, G, e, and F type specifiers are not recognized.
----------------------------- clip ------------------------------------

where possible, instead of the bigger 'printf()' (with vfprintf() etc.).
Replacing printf() with iprintf() can be done simply by putting a :

     #define printf iprintf

in some suitable place...
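As a sketch of what such a 'suitable place' could look like: the define goes after the #include of <stdio.h>, so that the header's own declarations are not renamed. USE_IPRINTF and report_value are hypothetical names invented for this example; the switch is there only so that the same source also builds against a C library that has no iprintf().

```c
#include <stdio.h>

/* USE_IPRINTF is a hypothetical build flag for this sketch, e.g. passed
   as -DUSE_IPRINTF when linking against newlib. Without it the code
   falls back to the ordinary printf, so it also builds on a host libc. */
#ifdef USE_IPRINTF
#define printf iprintf
#endif

/* report_value is a hypothetical helper. It uses integer formatting
   only, so it behaves the same with either function. It returns the
   number of characters written, as printf does. */
int report_value(int value)
{
    return printf("value=%d\n", value);
}
```

This only helps if no floating-point conversions are used anywhere in the program, because a single remaining call that needs them pulls the full formatting code back in.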
For example the famous "Hello world" shrinks quite a lot when
'iprintf()' is used instead of 'printf()' :

----------------------------- clip ------------------------------------
E:\usr\local\samples>\usr\local\m68k-coff\bin\size *hello.x
   text    data     bss     dec     hex filename
  23320    1948      12   25280    62c0 hello.x
   8632    1920      12   10564    2944 ihello.x
----------------------------- clip ------------------------------------

So the text section is 14688 bytes smaller, provided that no 'printf()'
derivatives (like sprintf()) are needed for floats anywhere...

Probably there are quite a lot more possibilities in newlib to reduce
the code size, but the printf case is the best known.

Cheers, Kai

------
Want more information? See the CrossGCC FAQ, http://www.objsw.com/CrossGCC/
Want to unsubscribe? Send a note to crossgcc-unsubscribe@sourceware.cygnus.com