This is the mail archive of the crossgcc@sourceware.org mailing list for the crossgcc project.




Re: How to build ARM hardfp/NEON toolchain with crosstool-ng


On 1 August 2012 02:20, Peter Barada <peter.barada@logicpd.com> wrote:
> On 07/30/2012 05:30 PM, Michael Hope wrote:
>> On 31 July 2012 02:40, Peter Barada <peter.barada@logicpd.com> wrote:
>>> I'm trying to target a Cortex-a8, and want to build a single toolchain
>>> that can be used to build both hardfp as well as NEON executables (I
>>> don't believe I'll ever want to mix those two variants within the same
>>> program).
>>>
>>> If I understand correctly, hard float for VFP3 requires
>>> "-mfloat-abi=hard -mfpu=vfpv3" and NEON requires "-mfloat-abi=softfp
>>> -mfpu=neon".
>> Hi Peter.  The ABI sets how floating point values are passed between
>> functions, while the FPU tells GCC the capabilities of your hardware
>> and what instructions it can use.  NEON is a superset of VFPv3 so you
>> can use the hard float ABI always, default to VFPv3, and then enable
>> NEON for any programs that benefit from it.
>>
>> Our pre-built toolchain is pretty close to what you describe:
>>  https://launchpad.net/linaro-toolchain-binaries
>>
>> It's crosstool-NG based, tuned for the Cortex-A9, uses the latest
>> Linaro releases, and the source is there if you want to tweak it.
>> See:
>>  https://launchpad.net/linaro-toolchain-binaries/trunk/2012.07/+download/README.txt
>>
>> for more.
> Thanks for the pointer, I'll have to give it a whirl.
>
> What flags would you pass to get full NEON, "-O3 -ftree-vectorize
> -mfpu=neon -ffast-math -funsafe-math-optimizations
> -fsingle-precision-constant"?

Heh, all out eh?  Bake the architecture and ABI defaults into GCC
using crosstool-NG's equivalent of --with-arch=armv7-a
--with-tune=cortex-a8 --with-float=hard --with-fpu=neon.  Use -Ofast
for most of the rest: -Ofast turns on -O3 and -ffast-math, -O3
turns on -ftree-vectorize, and -ffast-math turns on
-funsafe-math-optimizations.  None of these imply
-fsingle-precision-constant, so add that yourself if you want it.
Stay in ARM mode (-marm) rather than Thumb-2, as it's ~5% faster for
~25% bigger code size.

If you're running benchmarks like SPEC then it's normal practice to
add '-fno-common'.  If you're making a product then profile-guided
optimisation (PGO) is worthwhile for speed and link-time optimisation
(LTO) for size.

Note that you will see regressions with the vectoriser.  Try your
benchmarks with and without it (-fno-tree-vectorize turns it off
under -O3/-Ofast) and feel free to report any problems.

-- Michael


