This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PowerPC: memset optimization for POWER8/PPC64
- From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org, Allan McRae <allan at archlinux dot org>
- Date: Tue, 22 Jul 2014 09:59:20 -0300
- Subject: Re: PowerPC: memset optimization for POWER8/PPC64
- Authentication-results: sourceware.org; auth=none
- References: <53C920CD dot 8030506 at linux dot vnet dot ibm dot com>
Hi Allan,
What are the plans for the code freeze? Do we still have time to push this patch and the
bzero cleanup [1]?
[1] https://sourceware.org/ml/libc-alpha/2014-07/msg00447.html
On 18-07-2014 10:27, Adhemerval Zanella wrote:
> This patch adds an optimized memset implementation for POWER8. For
> sizes from 0 to 255 bytes, a word/doubleword algorithm similar to the
> POWER7-optimized one is used.
>
> For sizes larger than 255 bytes, two strategies are used:
>
> 1. If the constant is nonzero, the memory is written with
> Altivec vector instructions;
>
> 2. If the constant is 0, dcbz instructions are used. The loop is unrolled
> to clear 512 bytes at a time.
>
> Using vector instructions increases throughput considerably, doubling
> performance for sizes larger than 1024 bytes. Unrolling the dcbz loop also
> improves performance, doubling throughput for sizes larger than
> 8192 bytes.
>
> Tested on powerpc64 and powerpc64le (POWER8), GLIBC benchmark output attached.
>
>
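The size-based dispatch described in the quoted patch summary can be sketched in portable C. This is only an illustration of the three paths (small sizes, large nonzero fill, large zero fill), not the patch itself. The function name, the helper, and the constants are hypothetical. The 16-byte vector width and 128-byte dcbz block size are assumptions, and plain memcpy calls stand in for the vector stores and dcbz instructions. Alignment handling, which a real dcbz loop would need, is omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names and constants below are illustrative assumptions. */
enum {
    SMALL_LIMIT  = 255,  /* sizes up to this take the word/doubleword path */
    VECTOR_BYTES = 16,   /* width of one vector store (assumed) */
    BLOCK_BYTES  = 128,  /* bytes cleared by one dcbz (assumed block size) */
    UNROLL_BYTES = 512   /* bytes cleared per unrolled loop iteration */
};

/* Store the fill byte one byte at a time; used for tails. */
static void fill_bytes(unsigned char *p, unsigned char byte, size_t n)
{
    while (n > 0) {
        *p++ = byte;
        n--;
    }
}

void *sketch_memset(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    unsigned char byte = (unsigned char)c;

    if (n <= SMALL_LIMIT) {
        /* Small sizes: replicate the byte into a doubleword and store it. */
        uint64_t pattern = (uint64_t)byte * 0x0101010101010101ULL;
        while (n >= sizeof pattern) {
            memcpy(p, &pattern, sizeof pattern);
            p += sizeof pattern;
            n -= sizeof pattern;
        }
    } else if (byte != 0) {
        /* Large nonzero fill: 16-byte copies stand in for vector stores. */
        unsigned char vec[VECTOR_BYTES];
        fill_bytes(vec, byte, sizeof vec);
        while (n >= VECTOR_BYTES) {
            memcpy(p, vec, VECTOR_BYTES);
            p += VECTOR_BYTES;
            n -= VECTOR_BYTES;
        }
    } else {
        /* Large zero fill: each block copy stands in for one dcbz;
           four of them per iteration clear 512 bytes. */
        static const unsigned char zero_block[BLOCK_BYTES]; /* all zeros */
        while (n >= UNROLL_BYTES) {
            memcpy(p,                   zero_block, BLOCK_BYTES);
            memcpy(p + BLOCK_BYTES,     zero_block, BLOCK_BYTES);
            memcpy(p + 2 * BLOCK_BYTES, zero_block, BLOCK_BYTES);
            memcpy(p + 3 * BLOCK_BYTES, zero_block, BLOCK_BYTES);
            p += UNROLL_BYTES;
            n -= UNROLL_BYTES;
        }
    }

    /* Whatever is left after the main loop is written byte by byte. */
    fill_bytes(p, byte, n);
    return dst;
}
```

A call such as sketch_memset(buf, 0, 8192) takes the zero-fill path, while sketch_memset(buf, 0x5C, 1000) takes the vector-style path and sketch_memset(buf, 7, 10) takes the small-size path.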