This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: [PATCH/RFA]: Optimize wctomb/mbtowc functions for speed
- From: Corinna Vinschen <vinschen at redhat dot com>
- To: newlib at sourceware dot org
- Date: Fri, 13 Nov 2009 11:30:13 +0100
- Subject: Re: [PATCH/RFA]: Optimize wctomb/mbtowc functions for speed
- References: <20091106195553.GA1488@calimero.vinschen.de>
- Reply-to: newlib at sourceware dot org
Ping?
Corinna
On Nov 6 20:55, Corinna Vinschen wrote:
> Hi,
>
> as a result of a discussion on the Cygwin list it came up that calling
> mbrtowc on Cygwin is rather slow. So I started to experiment with it
> and it turned out that the major reason for the slowness is apparently
> the fact that mbrtowc results in calling four functions:
>
> mbrtowc -> _mbrtowc_r -> _mbtowc_r -> __mbtowc
>
> Since the actual function behind the __mbtowc pointer is functionally
> equivalent to what the application expects when calling mbrtowc, I
> started to optimize away the intermediate functions. The surprising
> result (to me, at least) is that a loop like this
>
> const char in[] = "The quick brown fox jumps over the lazy dog";
> for (count = 0; count < 1000000; ++count)
> for (i = 0; i < size; i += mbclen)
> if ((int)(mbclen = mbrtowc(&wc, in + i, size - i, &mbs)) <= 0)
> break;
>
> is about 45% faster if the function called by the application call
> __mbtowc directly rather than via the usual call path. So I did the
> same for wcrtomb with a similar result, 40-45% performance gain.
>
> The below patch is the final result. The basic idea is to call
> __mbtowc and __wctomb instead of _mbtowc_r and _wctomb_r everywhere.
> The simplest solution was IMHO to overload _mbtowc_r and _wctomb_r
> with macros of the same name in local.h, so the change was basically
> to add local.h to the affected source files. If that's not wanted,
> I can replace this with changing the sources to call the __mbtowc and
> __wctomb functions directly, rather than via the macros.
>
> Additionally, the functions mbrtowc and wcrtomb are now implemented
> independently from _mbrtowc_r and _wcrtomb_r, unless
> PREFER_SIZE_OVER_SPEED is defined.
>
> Is that ok to check in?
>
>
> Thanks,
> Corinna
>
>
> * libc/stdlib/local.h (_mbtowc_r): Define as call to __mbtowc.
> (_wctomb_r): Define as call to __wctomb.
> * libc/stdlib/btowc.c: Include local.h.
> * libc/stdlib/mblen.c: Ditto.
> * libc/stdlib/mblen_r.c: Ditto.
> * libc/stdlib/mbrtowc.c: Ditto.
> (mbrtowc): Implement independently from _mbrtowc_r, unless
> PREFER_SIZE_OVER_SPEED is defined.
> * libc/stdlib/mbstowcs_r.c: Include local.h.
> * libc/stdlib/mbtowc.c: Ditto.
> * libc/stdlib/mbtowc_r.c: Undefine _mbtowc_r.
> (__utf8_mbtowc): Drop unnecessary test for ch >= 0.
> * libc/stdlib/wcrtomb.c: Include local.h.
> (wcrtomb): Implement independently from _wcrtomb_r, unless
> PREFER_SIZE_OVER_SPEED is defined.
> * libc/stdlib/wcsnrtombs.c: Include local.h.
> (_wcsnrtombs_r): Call _wctomb_r rather than _wcrtomb_r.
> * libc/stdlib/wcstombs_r.c: Include local.h.
> * libc/stdlib/wctob.c: Ditto.
> * libc/stdlib/wctomb.c: Ditto.
> * libc/stdlib/wctomb_r.c: Undefine _wctomb_r.
> * libc/stdio/vfprintf.c: Include ../stdlib/local.h.
> * libc/stdio/vfscanf.c: Ditto.
>
>
> Index: libc/stdlib/btowc.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/btowc.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 btowc.c
> --- libc/stdlib/btowc.c 29 May 2007 21:26:59 -0000 1.2
> +++ libc/stdlib/btowc.c 6 Nov 2009 19:50:17 -0000
> @@ -3,6 +3,7 @@
> #include <stdio.h>
> #include <reent.h>
> #include <string.h>
> +#include "local.h"
>
> wint_t
> btowc (int c)
> Index: libc/stdlib/local.h
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/local.h,v
> retrieving revision 1.8
> diff -u -p -r1.8 local.h
> --- libc/stdlib/local.h 25 Aug 2009 18:47:24 -0000 1.8
> +++ libc/stdlib/local.h 6 Nov 2009 19:50:17 -0000
> @@ -7,6 +7,12 @@ char * _EXFUN(_gcvt,(struct _reent *, do
>
> char *__locale_charset(_NOARGS);
>
> +/* Overriding calls to _mbtowc_r and _wctomb_r with direct calls to __mbtowc
> + and __wctomb can speed up applications enormously. The extra function call
> + for each invocation can take up a lot of time. */
> +#define _mbtowc_r(p,w,s,n,ps) __mbtowc((p),(w),(s),(n),__locale_charset(),(ps))
> +#define _wctomb_r(p,s,w,ps) __wctomb((p),(s),(w),__locale_charset(),(ps))
> +
> #ifndef __mbstate_t_defined
> #include <wchar.h>
> #endif
> Index: libc/stdlib/mblen.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mblen.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 mblen.c
> --- libc/stdlib/mblen.c 23 Apr 2004 21:44:22 -0000 1.5
> +++ libc/stdlib/mblen.c 6 Nov 2009 19:50:17 -0000
> @@ -46,6 +46,7 @@ effects vary with the locale.
> #include <newlib.h>
> #include <stdlib.h>
> #include <wchar.h>
> +#include "local.h"
>
> int
> _DEFUN (mblen, (s, n),
> Index: libc/stdlib/mblen_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mblen_r.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 mblen_r.c
> --- libc/stdlib/mblen_r.c 23 Apr 2004 21:44:22 -0000 1.4
> +++ libc/stdlib/mblen_r.c 6 Nov 2009 19:50:17 -0000
> @@ -46,6 +46,7 @@ effects vary with the locale.
> #include <newlib.h>
> #include <stdlib.h>
> #include <wchar.h>
> +#include "local.h"
>
> int
> _DEFUN (_mblen_r, (r, s, n, state),
> Index: libc/stdlib/mbrtowc.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mbrtowc.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 mbrtowc.c
> --- libc/stdlib/mbrtowc.c 23 Apr 2004 21:44:22 -0000 1.5
> +++ libc/stdlib/mbrtowc.c 6 Nov 2009 19:50:17 -0000
> @@ -5,6 +5,7 @@
> #include <stdio.h>
> #include <errno.h>
> #include <string.h>
> +#include "local.h"
>
> size_t
> _DEFUN (_mbrtowc_r, (ptr, pwc, s, n, ps),
> @@ -47,6 +48,32 @@ _DEFUN (mbrtowc, (pwc, s, n, ps),
> size_t n _AND
> mbstate_t *ps)
> {
> +#ifdef PREFER_SIZE_OVER_SPEED
> return _mbrtowc_r (_REENT, pwc, s, n, ps);
> +#else
> + int retval = 0;
> +
> +#ifdef _MB_CAPABLE
> + if (ps == NULL)
> + {
> + _REENT_CHECK_MISC(_REENT);
> + ps = &(_REENT_MBRTOWC_STATE(_REENT));
> + }
> +#endif
> +
> + if (s == NULL)
> + retval = _mbtowc_r (_REENT, NULL, "", 1, ps);
> + else
> + retval = _mbtowc_r (_REENT, pwc, s, n, ps);
> +
> + if (retval == -1)
> + {
> + ps->__count = 0;
> + _REENT->_errno = EILSEQ;
> + return (size_t)(-1);
> + }
> + else
> + return (size_t)retval;
> +#endif
> }
> #endif /* !_REENT_ONLY */
> Index: libc/stdlib/mbstowcs_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mbstowcs_r.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 mbstowcs_r.c
> --- libc/stdlib/mbstowcs_r.c 17 Mar 2009 12:16:28 -0000 1.4
> +++ libc/stdlib/mbstowcs_r.c 6 Nov 2009 19:50:17 -0000
> @@ -1,5 +1,6 @@
> #include <stdlib.h>
> #include <wchar.h>
> +#include "local.h"
>
> size_t
> _DEFUN (_mbstowcs_r, (reent, pwcs, s, n, state),
> Index: libc/stdlib/mbtowc.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mbtowc.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 mbtowc.c
> --- libc/stdlib/mbtowc.c 23 Apr 2004 21:44:22 -0000 1.5
> +++ libc/stdlib/mbtowc.c 6 Nov 2009 19:50:17 -0000
> @@ -54,6 +54,7 @@ effects vary with the locale.
> #include <newlib.h>
> #include <stdlib.h>
> #include <wchar.h>
> +#include "local.h"
>
> int
> _DEFUN (mbtowc, (pwc, s, n),
> Index: libc/stdlib/mbtowc_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mbtowc_r.c,v
> retrieving revision 1.17
> diff -u -p -r1.17 mbtowc_r.c
> --- libc/stdlib/mbtowc_r.c 3 Oct 2009 08:51:07 -0000 1.17
> +++ libc/stdlib/mbtowc_r.c 6 Nov 2009 19:50:17 -0000
> @@ -7,6 +7,9 @@
> #include <errno.h>
> #include "local.h"
>
> +/* Disable macro definition from local.h. */
> +#undef _mbtowc_r
> +
> int (*__mbtowc) (struct _reent *, wchar_t *, const char *, size_t,
> const char *, mbstate_t *)
> #ifdef __CYGWIN__
> @@ -221,7 +224,7 @@ _DEFUN (__utf8_mbtowc, (r, pwc, s, n, ch
> return 0; /* s points to the null character */
> }
>
> - if (ch >= 0x0 && ch <= 0x7f)
> + if (ch <= 0x7f)
> {
> /* single-byte sequence */
> state->__count = 0;
> Index: libc/stdlib/wcrtomb.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wcrtomb.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 wcrtomb.c
> --- libc/stdlib/wcrtomb.c 23 Apr 2004 21:44:22 -0000 1.4
> +++ libc/stdlib/wcrtomb.c 6 Nov 2009 19:50:17 -0000
> @@ -4,6 +4,7 @@
> #include <stdlib.h>
> #include <stdio.h>
> #include <errno.h>
> +#include "local.h"
>
> size_t
> _DEFUN (_wcrtomb_r, (ptr, s, wc, ps),
> @@ -45,6 +46,33 @@ _DEFUN (wcrtomb, (s, wc, ps),
> wchar_t wc _AND
> mbstate_t *ps)
> {
> +#ifdef PREFER_SIZE_OVER_SPEED
> return _wcrtomb_r (_REENT, s, wc, ps);
> +#else
> + int retval = 0;
> + char buf[10];
> +
> +#ifdef _MB_CAPABLE
> + if (ps == NULL)
> + {
> + _REENT_CHECK_MISC(_REENT);
> + ps = &(_REENT_WCRTOMB_STATE(_REENT));
> + }
> +#endif
> +
> + if (s == NULL)
> + retval = _wctomb_r (_REENT, buf, L'\0', ps);
> + else
> + retval = _wctomb_r (_REENT, s, wc, ps);
> +
> + if (retval == -1)
> + {
> + ps->__count = 0;
> + _REENT->_errno = EILSEQ;
> + return (size_t)(-1);
> + }
> + else
> + return (size_t)retval;
> +#endif
> }
> #endif /* !_REENT_ONLY */
> Index: libc/stdlib/wcsnrtombs.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wcsnrtombs.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 wcsnrtombs.c
> --- libc/stdlib/wcsnrtombs.c 19 Feb 2009 09:19:42 -0000 1.1
> +++ libc/stdlib/wcsnrtombs.c 6 Nov 2009 19:50:17 -0000
> @@ -99,6 +99,7 @@ PORTABILITY
> #include <stdlib.h>
> #include <stdio.h>
> #include <errno.h>
> +#include "local.h"
>
> size_t
> _DEFUN (_wcsnrtombs_r, (r, dst, src, nwc, len, ps),
> @@ -134,7 +135,7 @@ _DEFUN (_wcsnrtombs_r, (r, dst, src, nwc
> {
> int count = ps->__count;
> wint_t wch = ps->__value.__wch;
> - int bytes = _wcrtomb_r (r, buff, *pwcs, ps);
> + int bytes = _wctomb_r (r, buff, *pwcs, ps);
> if (bytes == -1)
> {
> r->_errno = EILSEQ;
> Index: libc/stdlib/wcstombs_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wcstombs_r.c,v
> retrieving revision 1.3
> diff -u -p -r1.3 wcstombs_r.c
> --- libc/stdlib/wcstombs_r.c 23 Oct 2007 19:50:29 -0000 1.3
> +++ libc/stdlib/wcstombs_r.c 6 Nov 2009 19:50:17 -0000
> @@ -1,5 +1,6 @@
> #include <stdlib.h>
> #include <wchar.h>
> +#include "local.h"
>
> size_t
> _DEFUN (_wcstombs_r, (reent, s, pwcs, n, state),
> Index: libc/stdlib/wctob.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wctob.c,v
> retrieving revision 1.3
> diff -u -p -r1.3 wctob.c
> --- libc/stdlib/wctob.c 29 May 2007 21:26:59 -0000 1.3
> +++ libc/stdlib/wctob.c 6 Nov 2009 19:50:18 -0000
> @@ -3,6 +3,7 @@
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> +#include "local.h"
>
> int
> wctob (wint_t c)
> Index: libc/stdlib/wctomb.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wctomb.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 wctomb.c
> --- libc/stdlib/wctomb.c 2 Mar 2009 23:30:59 -0000 1.4
> +++ libc/stdlib/wctomb.c 6 Nov 2009 19:50:18 -0000
> @@ -49,6 +49,7 @@ effects vary with the locale.
> #include <newlib.h>
> #include <stdlib.h>
> #include <errno.h>
> +#include "local.h"
>
> int
> _DEFUN (wctomb, (s, wchar),
> Index: libc/stdlib/wctomb_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wctomb_r.c,v
> retrieving revision 1.16
> diff -u -p -r1.16 wctomb_r.c
> --- libc/stdlib/wctomb_r.c 3 Oct 2009 08:51:07 -0000 1.16
> +++ libc/stdlib/wctomb_r.c 6 Nov 2009 19:50:18 -0000
> @@ -6,6 +6,9 @@
> #include "mbctype.h"
> #include "local.h"
>
> +/* Disable macro definition from local.h. */
> +#undef _wctomb_r
> +
> int (*__wctomb) (struct _reent *, char *, wchar_t, const char *charset,
> mbstate_t *)
> #ifdef __CYGWIN__
> Index: libc/stdio/vfprintf.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdio/vfprintf.c,v
> retrieving revision 1.75
> diff -u -p -r1.75 vfprintf.c
> --- libc/stdio/vfprintf.c 16 Jun 2009 17:44:20 -0000 1.75
> +++ libc/stdio/vfprintf.c 6 Nov 2009 19:50:18 -0000
> @@ -159,6 +159,7 @@ static char *rcsid = "$Id: vfprintf.c,v
> #include <sys/lock.h>
> #include <stdarg.h>
> #include "local.h"
> +#include "../stdlib/local.h"
> #include "fvwrite.h"
> #include "vfieeefp.h"
>
> Index: libc/stdio/vfscanf.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdio/vfscanf.c,v
> retrieving revision 1.47
> diff -u -p -r1.47 vfscanf.c
> --- libc/stdio/vfscanf.c 11 Mar 2009 11:53:22 -0000 1.47
> +++ libc/stdio/vfscanf.c 6 Nov 2009 19:50:18 -0000
> @@ -122,6 +122,7 @@ Supporting OS subroutines required:
> #include <stdarg.h>
> #include <errno.h>
> #include "local.h"
> +#include "../stdlib/local.h"
>
> #ifdef INTEGER_ONLY
> #define VFSCANF vfiscanf
>
>
> --
> Corinna Vinschen
> Cygwin Project Co-Leader
> Red Hat
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat