This is the mail archive of the crossgcc@sources.redhat.com mailing list for the crossgcc project.

See the CrossGCC FAQ for lots more information.



Re: optimizing gcc output for ARM


Sorry, there was a mistake in the source code I posted (the
strange behavior remains the same).
The C file should look like this:

#define MAX 10  /* assumed value; the listing below compares i against 9 */

int main(void) {
  int sum=10;
  int  coeff=20;
  int  sample=30;
  int  i=0;

  for (i=0;i<MAX;i++) {
    asm volatile ("mla %0,%2,%3,%1":"=r" (sum):"r" (sum), "r" (coeff), "r" (sample), "r" (sum));
  }
  return 0;
}

And the corresponding assembler output:

@ Generated by gcc 2.95.2 19991024 (release) for ARM/elf
	.file	"multiplikation.c"
.gcc2_compiled.:
.text
	.align	2
	.global	main
	.type	 main,function
main:
	@ args = 0, pretend = 0, frame = 16
	@ frame_needed = 1, current_function_anonymous_args = 0
	mov	ip, sp
	stmfd	sp!, {fp, ip, lr, pc}
	sub	fp, ip, #4
	sub	sp, sp, #16
	bl	__gccmain
	mov	r3, #10
	str	r3, [fp, #-16]
	mov	r3, #20
	str	r3, [fp, #-20]
	mov	r3, #30
	str	r3, [fp, #-24]
	mov	r3, #0
	str	r3, [fp, #-28]
	mov	r3, #0
	str	r3, [fp, #-28]
.L3:
	ldr	r3, [fp, #-28]
	cmp	r3, #9
	ble	.L6
	b	.L4
.L6:
	ldr	r3, [fp, #-16]	@ r3 = sum
	ldr	r2, [fp, #-20]	@ r2 = coeff
	ldr	r1, [fp, #-24]	@ r1 = sample
	ldr	ip, [fp, #-16]	@ <- WHAT FOR???
	mla r3,r2,r1,r3
	mov	r2, r3		@ <- "str r3, [fp, #-16]" would be faster
	str	r2, [fp, #-16]
.L5:
	ldr	r3, [fp, #-28]	@ r3 = i
	add	r2, r3, #1
	str	r2, [fp, #-28]
	b	.L3
.L4:
	mov	r0, #0
	b	.L2
.L2:
	ldmea	fp, {fp, sp, pc}
.Lfe1:
	.size	 main,.Lfe1-main
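
A note on the listing above: the extra "ldr ip, [fp, #-16]" is consistent with the asm statement declaring four input operands while its template only references %0 to %3. The compiler loads every "r" input into a register whether or not the template uses it. The following is a minimal sketch, not a tested fix for gcc 2.95.2, that ties the accumulator input to the output with a matching constraint so that only three inputs are declared. The helper names mac() and accumulate() are hypothetical and not part of the original file, and the plain C fallback exists only so the sketch builds on non-ARM hosts.

```c
#define MAX 10  /* assumed loop count, matching "cmp r3, #9" in the listing */

/* Hypothetical helper: returns coeff * sample + sum. */
static int mac(int sum, int coeff, int sample)
{
#if defined(__arm__) && !defined(__thumb__)
    /* %0 = sum (output), %1 = coeff, %2 = sample,
       %3 = sum (input, placed in the same register as %0 by "0"). */
    __asm__ volatile ("mla %0, %1, %2, %3"
                      : "=r" (sum)
                      : "r" (coeff), "r" (sample), "0" (sum));
#else
    /* Portable fallback; the compiler is free to pick its own instructions. */
    sum = coeff * sample + sum;
#endif
    return sum;
}

/* Hypothetical helper: repeats the multiply-accumulate n times. */
static int accumulate(int sum, int coeff, int sample, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        sum = mac(sum, coeff, sample);
    }
    return sum;
}
```

Calling accumulate(10, 20, 30, MAX) mirrors the loop in the posted file. The remaining "mov r2, r3" / "str r2, [fp, #-16]" pair looks like typical unoptimized output (every variable is reloaded from the stack frame), so an optimization level such as -O2 may remove it; that is an assumption and has not been checked against gcc 2.95.2.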

On Wed, 22 Nov 2000, you wrote:
> If I compile the following lines
> 
>   for (i=0;i<10;i++) {
>     asm volatile ("mla %0,%3,%2,%1":"=r" (sum):"r" (coeff), "r" (sample), "r" (sum));
>   }
> I get the assembler output
> .L3:
> 	ldr	r3, [fp, #-28]
> 	cmp	r3, #9
> 	ble	.L6
> 	b	.L4
> .L6:
> 	ldr	r3, [fp, #-16]
> 	ldr	r2, [fp, #-20]
> 	ldr	r1, [fp, #-24]
> 	ldr	ip, [fp, #-16]
> 	mla r3,r1,r2,r3
> 	mov	r2, r3
> 	str	r2, [fp, #-16]
> .L5:
> 	ldr	r3, [fp, #-28]
> 	add	r2, r3, #1
> 	str	r2, [fp, #-28]
> 	b	.L3
> 
> (explanation: [fp, #-16]: sum; [fp, #-20]:coeff;  [fp, #-24]:sample)
> 
> I can live with the fact that gcc doesn't recognize the
> special MAC instruction of the ARM, but the line
> after it is superfluous:
> 	str	r3, [fp, #-16]
> would work just as well and is one instruction shorter.
> The line above it is not necessary either.
> 1.) Why does gcc produce such output?
> 2.) How can I avoid this?
> 
> Thanks for your hints!
> Jens-Christian
-- 


Jens-Christian Lache
Technische Universitaet Hamburg-Harburg
www.tu-harburg.de/~sejl1601
Mail:
lache@tu-harburg.de
lache@ngi.de
Tel.:
+0491759610756


