This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Optimized with SSE2 sinf and cosf for x86_32


Repost of the patch with optimized sinf and cosf for x86_32.
We are looking forward to it being accepted and released.

http://sourceware.org/ml/libc-alpha/2012-06/msg00624.html

--
Liubov Dmitrieva
Intel Corporation

2012/6/22 Dmitrieva Liubov <liubov.dmitrieva@gmail.com>:
> This is a patch proposing manually optimized and high-performance sinf
> and cosf versions with excellent precision.
>
> On the main path [-10000; 10000], performance is roughly 23X to 30X
> better, depending on the CPU.
>
> Speedups for other important intervals follow (ratio of cycle counts,
> random inputs).
>
>    (random)              Ist.  Bulld.    Atom    Neh.     AVX
>    cosf    |x|<0.78       1.9    2.72    1.65    1.89    1.79  times
>    cosf    |x|<1.57      1.55    1.84    1.75    1.70    1.55  times
>    cosf    |x|<2.35      1.64    2.08    1.78    1.75    1.66  times
>    cosf    |x|<3.14      1.97    2.86    1.97    1.87    2.12  times
>    cosf    |x|<3.92      2.15    3.50    2.08    2.01    2.33  times
>    cosf    |x|<4.71      2.29    3.89    2.15    2.07    2.43  times
>    cosf    |x|<5.49      2.39    4.70    2.21    2.06    2.52  times
>    cosf    |x|<6.28      2.47    4.62    2.25    2.14    2.58  times
>    cosf    |x|<7.06      2.54    4.63    2.28    2.16    2.64  times
>    cosf    |x|<7.85      2.43    4.48    2.27    2.10    2.63  times
>    cosf    |x|<8.63      2.30    4.47    2.23    2.04    2.56  times
>    cosf    |x|<9.42      2.21    4.18    2.20    1.99    2.51  times
>    cosf    |x|<100       2.53    5.43    2.28    2.34    2.01  times
>    cosf    |x|<1000     19.82   20.50   19.88   17.96   18.37  times
>    cosf    |x|<10000    25.98   29.78   24.95   23.63   23.52  times
>    cosf    |x|<1e10     18.92   28.74   20.97   16.16   18.78  times
>
>    sinf    |x|<0.78      1.39    1.75    1.31    1.30    1.28  times
>    sinf    |x|<1.57      1.47    1.78    1.65    1.62    1.67  times
>    sinf    |x|<2.35      1.64    2.10    1.77    1.79    1.77  times
>    sinf    |x|<3.14      1.94    2.85    1.95    1.88    2.09  times
>    sinf    |x|<3.92      2.12    3.38    2.04    1.91    2.30  times
>    sinf    |x|<4.71      2.31    3.95    2.14    1.96    2.42  times
>    sinf    |x|<5.49      2.66    4.57    2.21    2.15    2.51  times
>    sinf    |x|<6.28      2.53    4.67    2.24    2.17    2.56  times
>    sinf    |x|<7.06      2.52    4.54    2.23    2.11    2.62  times
>    sinf    |x|<7.85      2.43    4.54    2.24    2.08    2.59  times
>    sinf    |x|<8.63      2.33    4.62    2.21    2.09    2.53  times
>    sinf    |x|<9.42      2.27    4.28    2.17    1.96    2.51  times
>    sinf    |x|<100       2.52    5.32    2.26    2.34    2.01  times
>    sinf    |x|<1000     20.12   20.42   19.89   18.24   18.48  times
>    sinf    |x|<10000    26.26   26.73   25.00   23.11   23.79  times
>    sinf    |x|<1e10     18.76   28.73   20.90   16.09   18.49  times
>
>
>
> The new sinf/cosf pass our proprietary test system, which tests many
> intervals with different step sizes and checks special values (from
> ISO C) and corner cases. GLIBC's "make check" also passes.
>
> Our test system observed errors of more than 1e4 ulp for |x|>1e4 in
> the current GLIBC. The new asm versions provided here have a maximum
> error of 0.500121 ulp for sinf and 0.500573 ulp for cosf.
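
For readers unfamiliar with the ulp figures quoted above, here is a minimal
sketch of how such an error could be measured. It is not the proprietary
test system mentioned in the message; the helper names float_ulp and
ulp_error are hypothetical, and the reference value is assumed to come from
a higher-precision computation supplied by the caller.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: spacing between adjacent floats in the binade that
   contains |want|.  Assumes want is finite; values that round up into the
   next binade when converted to float are treated as belonging to it. */
static double float_ulp(double want)
{
    float f = (float)(want < 0 ? -want : want);
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits &= 0x7f800000u;            /* keep only the exponent field: 2^e */
    if (bits == 0)
        return 0x1p-149;            /* zero or subnormal: fixed spacing */
    float pow2;
    memcpy(&pow2, &bits, sizeof pow2);
    return (double)pow2 * 0x1p-23;  /* 2^e * 2^-23 */
}

/* Hypothetical helper: error of a float result, in float ulps, measured
   against a higher-precision reference value. */
double ulp_error(float got, double want)
{
    double diff = (double)got - want;
    if (diff < 0)
        diff = -diff;
    return diff / float_ulp(want);
}
```

A call such as ulp_error(sinf(x), sin((double)x)) would use the
double-precision result as an approximate reference (link with -lm). A
correctly rounded result has an error of at most 0.5 ulp, which is why the
maxima reported above sit just over 0.5.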
>
>
> ChangeLog:
>
> 2012-06-22  Liubov Dmitrieva  <liubov.dmitrieva@gmail.com>
>
>         * sysdeps/i386/i686/fpu/multiarch/Makefile: Update.
>         (sysdep_routines): Add s_sinf-sse2, s_cosf-sse2.
>
>         * sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: New file.
>         * sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: New file.
>         * sysdeps/i386/i686/fpu/multiarch/s_sinf.c: New file.
>         * sysdeps/i386/i686/fpu/multiarch/s_cosf.c: New file.
>         * sysdeps/ieee754/flt-32/s_sinf.c: Update.
>         (SINF): Add macro for using routine as __sinf_ia32.
>         * sysdeps/ieee754/flt-32/s_cosf.c: Update.
>         (COSF): Add macro for using routine as __cosf_ia32.
>
>         * sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Fix Copyright.
>         * sysdeps/i386/i686/fpu/multiarch/e_expf.c: Fix Copyright.
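
The SINF/COSF entries above describe a renaming macro in the generic C
source. The following is only a sketch of that general idea, assuming the
macro lets the generic routine be built under an alternate name so a
selector can choose between it and the SSE2 version. All demo_* names, the
has_sse2 parameter and the placeholder function bodies are hypothetical and
are not taken from the patch.

```c
/* Generic source pattern: the function name comes from a macro, so a
   wrapper file could define SINF before including this code.  The default
   name here is a hypothetical stand-in. */
#ifndef SINF
# define SINF demo_sinf_ia32
#endif

float SINF(float x)
{
    /* Placeholder body (cubic Taylor polynomial), not a real sinf. */
    return x - x * x * x / 6.0f;
}

/* Hypothetical stand-in for the hand-written SSE2 routine. */
float demo_sinf_sse2(float x)
{
    return x - x * x * x / 6.0f;
}

typedef float (*sinf_fn)(float);

/* Pick one implementation.  has_sse2 stands in for a real CPU feature
   check, which this sketch does not perform. */
sinf_fn demo_select_sinf(int has_sse2)
{
    return has_sse2 ? demo_sinf_sse2 : demo_sinf_ia32;
}
```

A caller would resolve the pointer once, for example
sinf_fn f = demo_select_sinf(1);, and then call f(x). Making the choice a
single time keeps the per-call cost to one indirect call.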
>
>
> --
> Liubov Dmitrieva
>
> Software Engineer
> Intel Corporation

