This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v1.2] Improve unaligned memcpy and memmove.


Can we clean up the "***back" versions in this patch?
Do any processors still use them?
Atom and Core 2 use the "***ssse3" versions, not the "***back" ones.
Do we need to handle the "***back" versions now?

--
Liubov

On Fri, Oct 4, 2013 at 4:52 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Fri, Oct 04, 2013 at 03:14:04PM +0400, Liubov Dmitrieva wrote:
>>    I don't understand why you use the HAS_SLOW_SSE4_2 flag for the Silvermont
>>    version. It should be named "Fast_Rep" or something like that, so that
>>    the core feature of the version is clear.
>>    There is already HAS_FAST_REP_STRING; maybe it can be reused.
>>    --
>>    Liubov
>>
> It was the simplest way to identify Silvermont. Silvermont is exceptional in
> that rep movsq is faster in L1 cache for sizes above 4096 bytes. For Core 2
> the situation is the opposite: rep movsq looks fastest for small sizes (up to
> 256 bytes), until the ssse3 loop pays for itself.
>
> It might make sense to add a Silvermont-specific case, as below.
>
> A second possibility is that the switch to rep would be driven by
> a processor-specific table. For Silvermont the threshold would be 4096
> bytes.
> On Nehalem and Ivy Bridge a loop is faster when the data are in L1 cache,
> the two are nearly identical in L2 cache, and rep is by far the best for
> L3 cache and beyond, so we could use a threshold of 65636. On fx10 a rep
> implementation is always slower, so we would need to disable it.
>
>
>
> ---
>  sysdeps/x86_64/multiarch/init-arch.c | 3 ++-
>  sysdeps/x86_64/multiarch/init-arch.h | 6 ++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/sysdeps/x86_64/multiarch/init-arch.c b/sysdeps/x86_64/multiarch/init-arch.c
> index 5583961..b80d9f2 100644
> --- a/sysdeps/x86_64/multiarch/init-arch.c
> +++ b/sysdeps/x86_64/multiarch/init-arch.c
> @@ -90,7 +90,8 @@ __init_cpu_features (void)
>               __cpu_features.feature[index_Fast_Unaligned_Load]
>                 |= (bit_Fast_Unaligned_Load
>                     | bit_Prefer_PMINUB_for_stringop
> -                   | bit_Slow_SSE4_2);
> +                   | bit_Slow_SSE4_2
> +                   | bit_Is_Silvermont);
>               break;
>
>             default:
> diff --git a/sysdeps/x86_64/multiarch/init-arch.h b/sysdeps/x86_64/multiarch/init-arch.h
> index 0cb5f5b..36ec445 100644
> --- a/sysdeps/x86_64/multiarch/init-arch.h
> +++ b/sysdeps/x86_64/multiarch/init-arch.h
> @@ -24,6 +24,8 @@
>  #define bit_FMA_Usable                 (1 << 7)
>  #define bit_FMA4_Usable                        (1 << 8)
>  #define bit_Slow_SSE4_2                        (1 << 9)
> +#define bit_Is_Silvermont              (1 << 10)
> +
>
>  /* CPUID Feature flags.  */
>
> @@ -64,6 +66,7 @@
>  # define index_FMA_Usable              FEATURE_INDEX_1*FEATURE_SIZE
>  # define index_FMA4_Usable             FEATURE_INDEX_1*FEATURE_SIZE
>  # define index_Slow_SSE4_2             FEATURE_INDEX_1*FEATURE_SIZE
> +# define index_Is_Silvermont           FEATURE_INDEX_1*FEATURE_SIZE
>
>  #else  /* __ASSEMBLER__ */
>
> @@ -163,6 +166,8 @@ extern const struct cpu_features *__get_cpu_features (void)
>  # define index_FMA_Usable              FEATURE_INDEX_1
>  # define index_FMA4_Usable             FEATURE_INDEX_1
>  # define index_Slow_SSE4_2             FEATURE_INDEX_1
> +# define index_Is_Silvermont           FEATURE_INDEX_1
> +
>
>  # define HAS_ARCH_FEATURE(name) \
>    ((__get_cpu_features ()->feature[index_##name] & (bit_##name)) != 0)
> @@ -174,5 +179,6 @@ extern const struct cpu_features *__get_cpu_features (void)
>  # define HAS_AVX                       HAS_ARCH_FEATURE (AVX_Usable)
>  # define HAS_FMA                       HAS_ARCH_FEATURE (FMA_Usable)
>  # define HAS_FMA4                      HAS_ARCH_FEATURE (FMA4_Usable)
> +# define IS_SILVERMONT                 HAS_ARCH_FEATURE (Is_Silvermont)
>
>  #endif /* __ASSEMBLER__ */
> --
> 1.8.4.rc3
>
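
The processor-specific table mentioned in the quoted message could look roughly like the sketch below. It is only an illustration, not code from the patch or from glibc: the enum cpu_kind, the REP_NEVER marker, the rep_threshold table and the use_rep_movsq helper are all hypothetical names, and the threshold values are simply the numbers quoted in the message, not new measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical processor classes; glibc does not define this enum.  */
enum cpu_kind
{
  CPU_SILVERMONT,
  CPU_NEHALEM,
  CPU_IVY_BRIDGE,
  CPU_FX10,
  CPU_KIND_COUNT
};

/* Marker for processors where the rep movsq path is never selected.  */
#define REP_NEVER SIZE_MAX

/* Copy sizes above which rep movsq is assumed to win.  The numbers are
   the ones quoted in the message, taken as-is.  */
static const size_t rep_threshold[CPU_KIND_COUNT] =
{
  [CPU_SILVERMONT] = 4096,
  [CPU_NEHALEM]    = 65636,
  [CPU_IVY_BRIDGE] = 65636,
  [CPU_FX10]       = REP_NEVER,
};

/* Return nonzero when a copy of N bytes should take the rep movsq path.
   With REP_NEVER the comparison can never be true.  */
static int
use_rep_movsq (enum cpu_kind cpu, size_t n)
{
  assert ((unsigned int) cpu < CPU_KIND_COUNT);
  return n > rep_threshold[cpu];
}
```

With such a table, use_rep_movsq (CPU_SILVERMONT, 8192) would select the rep path, while the same size on CPU_NEHALEM would stay on the loop. A real implementation would more likely store the threshold in the cpu_features structure at init time and compare against it in assembly.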

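For readers unfamiliar with the feature-bit macros touched by the patch, the following standalone sketch shows how the bit_/index_ pairs and the token-pasting HAS_ARCH_FEATURE macro fit together. It is a simplified model under stated assumptions: struct cpu_features_sketch, the sketch_features variable and mark_silvermont are hypothetical stand-ins for glibc's cpu_features structure, __get_cpu_features () and the Silvermont case in __init_cpu_features; only the macro shapes follow the patch.

```c
#include <assert.h>

/* Stand-in for glibc's feature word indices; only one word is modelled.  */
enum { FEATURE_INDEX_1 = 0, FEATURE_INDEX_MAX };

/* Hypothetical reduced version of struct cpu_features.  */
struct cpu_features_sketch
{
  unsigned int feature[FEATURE_INDEX_MAX];
};

static struct cpu_features_sketch sketch_features;

/* Bit and index definitions in the same shape as the patch.  */
#define bit_Slow_SSE4_2     (1 << 9)
#define bit_Is_Silvermont   (1 << 10)
#define index_Slow_SSE4_2   FEATURE_INDEX_1
#define index_Is_Silvermont FEATURE_INDEX_1

/* Both flags live in the same word, so their bits must differ.  */
static_assert (bit_Slow_SSE4_2 != bit_Is_Silvermont,
               "feature bits must be distinct");

/* NAME is pasted onto index_ and bit_ to pick the word and the mask.  */
#define HAS_ARCH_FEATURE(name) \
  ((sketch_features.feature[index_##name] & (bit_##name)) != 0)
#define IS_SILVERMONT HAS_ARCH_FEATURE (Is_Silvermont)

/* Models the Silvermont case of the init code: set both flags at once.  */
static void
mark_silvermont (void)
{
  sketch_features.feature[index_Slow_SSE4_2]
    |= (bit_Slow_SSE4_2 | bit_Is_Silvermont);
}
```

Before mark_silvermont () runs, IS_SILVERMONT evaluates to 0; afterwards both IS_SILVERMONT and HAS_ARCH_FEATURE (Slow_SSE4_2) are nonzero, because the two flags share FEATURE_INDEX_1.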
