This is the mail archive of the crossgcc@sources.redhat.com mailing list for the crossgcc project.

See the CrossGCC FAQ for lots more information.



optimizing gcc output for ARM


If I compile the following lines

  for (i = 0; i < 10; i++) {
    /* mla Rd, Rm, Rs, Rn computes Rd = Rm * Rs + Rn, i.e. sum = sample * coeff + sum */
    asm volatile ("mla %0,%2,%1,%3" : "=r" (sum) : "r" (coeff), "r" (sample), "r" (sum));
  }
I get the following assembler output:
.L3:
	ldr	r3, [fp, #-28]
	cmp	r3, #9
	ble	.L6
	b	.L4
.L6:
	ldr	r3, [fp, #-16]
	ldr	r2, [fp, #-20]
	ldr	r1, [fp, #-24]
	ldr	ip, [fp, #-16]
	mla r3,r1,r2,r3
	mov	r2, r3
	str	r2, [fp, #-16]
.L5:
	ldr	r3, [fp, #-28]
	add	r2, r3, #1
	str	r2, [fp, #-28]
	b	.L3

(Explanation: [fp, #-16] holds sum, [fp, #-20] holds coeff, [fp, #-24] holds sample, and [fp, #-28] is the loop counter i.)

I can live with the fact that gcc doesn't recognize the
special MAC instruction of the ARM, but the line
after the mla (mov r2, r3) is pointless:
	str	r3, [fp, #-16]
would work just as well and is one instruction shorter.
The line above the mla (ldr ip, [fp, #-16]) is not necessary either,
since ip is never used.
1.) Why does gcc produce such output?
2.) How can I avoid this?
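
A possible direction for both questions, offered as a sketch and not as a confirmed diagnosis: every variable in the listing is reloaded from and stored back to the frame, which is typical of output generated without optimization (an assumption, since the post does not give the compiler flags). Compiling with -O2 usually keeps such values in registers. The sketch below also marks sum as a single read-write operand with the "+r" constraint instead of listing it once as output and once as input. The function name mac_repeat and its parameters are hypothetical; the plain C branch is a fallback for non-ARM targets.

```c
/* Hypothetical helper, not from the original post:
   adds coeff * sample to sum, n times, and returns the result. */
static int mac_repeat(int sum, int coeff, int sample, int n)
{
    int i;

    for (i = 0; i < n; i++) {
#if defined(__arm__)
        /* "+r" tells gcc that sum is both read and written, so one
           register serves as accumulator input and as result.
           mla Rd, Rm, Rs, Rn computes Rd = Rm * Rs + Rn. */
        __asm__ __volatile__ ("mla %0, %1, %2, %0"
                              : "+r" (sum)
                              : "r" (coeff), "r" (sample));
#else
        /* Portable fallback with the same result. */
        sum += coeff * sample;
#endif
    }
    return sum;
}
```

For example, mac_repeat(0, 3, 4, 10) returns 120. Writing the loop body as plain sum += coeff * sample and building with -O2 may also be enough for gcc to pick mla on its own, although that depends on the compiler version and target options.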

Thanks for your hints!
Jens-Christian

-- 


Jens-Christian Lache
Technische Universitaet Hamburg-Harburg
www.tu-harburg.de/~sejl1601
Mail:
lache@tu-harburg.de
lache@ngi.de
Tel.:
+0491759610756

------
Want more information?  See the CrossGCC FAQ, http://www.objsw.com/CrossGCC/
Want to unsubscribe? Send a note to crossgcc-unsubscribe@sourceware.cygnus.com

