On Tue, Jun 07, 2005 at 10:35:48AM -0400, Jairo19@interhosting.us wrote:
> I contacted RH and they say they will not change the behavior of their
> glibc because they prefer "correctness over performance" and they
> believe glibc should be checking for changes to the timezone. Their
> suggestions are: build our own glibc and lose RH support, rewrite
> mktime for our own use (I tried that, but it seemed a little complicated
> and loses the benefit of the library per se), and lastly they suggested
> setting TZ=/etc/localtime (which makes the program run a little faster,
> but not as fast as RH7.3).
I certainly can't reproduce contemporary glibc being any slower than
RH7.3 glibc with TZ=/etc/localtime:
$ LC_ALL=C time ~/7.3-glibc/ld-linux.so.2 --library-path ~/7.3-glibc/ /tmp/mktimetest
1.74user 0.00system 0:01.74elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
$ LC_ALL=C TZ=/etc/localtime time ~/7.3-glibc/ld-linux.so.2 --library-path ~/7.3-glibc/ /tmp/mktimetest
1.66user 0.00system 0:01.66elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
$ LC_ALL=C time /lib/ld-linux.so.2 --library-path /lib /tmp/mktimetest
1.94user 13.73system 0:15.68elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+100minor)pagefaults 0swaps
$ TZ=/etc/localtime time /lib/ld-linux.so.2 --library-path /lib /tmp/mktimetest
1.36user 0.00system 0:01.36elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+100minor)pagefaults 0swaps
~/7.3-glibc/ holds the unpacked /lib dir of glibc-2.2.5-34.i686.rpm; /lib
contains glibc 2.3.5. Everything runs on an FC3 box, but with
TZ=/etc/localtime the loop does not make any syscalls, so the kernel
shouldn't matter much. Here, glibc 2.3.5 is actually faster than 2.2.5
if TZ=/etc/localtime is used.
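If exporting TZ in every caller's environment is inconvenient, the same setting can be applied from inside the program. The following is only a sketch: pin_timezone is a hypothetical helper name, not a glibc function, and it assumes /etc/localtime is the zone file you want. setenv and tzset are the standard POSIX calls.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper (not part of glibc): set TZ once at startup, the
   in-program equivalent of running with TZ=/etc/localtime. */
int pin_timezone(void)
{
    /* overwrite = 0: keep a TZ value the user has already exported. */
    if (setenv("TZ", "/etc/localtime", 0) != 0)
        return -1;              /* setenv failed, e.g. out of memory */
    tzset();                    /* make the C library pick up the new TZ */
    return 0;
}
```

Call it once at the top of main(), before the first mktime or localtime call. Whether it buys the same speedup as in the timings above would need to be measured on your system.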
> It is widely known that FC1,2,3 and RHEL* are slow systems, and I
> always figured it was for the better, then I question, how many of these
Do you have any data to back up your claims?
> little changes in glibc exist in RH's patches that by themselves may
> not be a big performance hit, but put them together and you get a slower
> system?
You can easily check for yourself: each Red Hat glibc src.rpm includes
a tarball with the upstream CVS checkout and a patch. I'm not aware of
any changes from upstream glibc that could noticeably affect performance.
> I need to know the weekday for millions of dates, I guess I could use
> a hash (I would need to find a library for that) for it, but everything
> was fine until I started using RHEL4 with the patched glibc. Any
> suggestions?
I might be missing something obvious, but if all you need is the weekday
for dates stored in the year*10000+month*100+day format, and you need
it that many times, then surely writing your own function will be
incomparably faster than calling the expensive mktime and localtime.
I haven't put any effort into optimizing this; you could get away with
no mktime calls at all, or just one, instead of caching tm_wday of
Jan 1st for each year as below. Still, it is more than 60 times faster:
$ LC_ALL=C time /lib/ld-linux.so.2 --library-path /lib /tmp/mktimetest2
0.02user 0.00system 0:00.02elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+101minor)pagefaults 0swaps
#include <time.h>

/* Slow reference version: asks the C library via mktime/localtime. */
int real_get_dayofweek(int date)
{
    // extract day/yr/mn
    int day = date % 100;
    int month = (date % 10000) / 100;
    int year = date / 10000;
    struct tm tm = { 0 };

    tm.tm_sec = 0; tm.tm_min = 30; tm.tm_hour = 9;
    tm.tm_mday = day;
    tm.tm_mon = month - 1;
    tm.tm_year = year - 1900;
    tm.tm_isdst = -1;   // was left uninitialized; -1 lets mktime decide DST
    time_t time1 = mktime(&tm);
    return localtime(&time1)->tm_wday; // Sun=0, Sat=6
} // real_get_dayofweek
/* Weekday of Jan 1st for each year 1900..2099; filled by
   init_year_to_wday(), which must run once before get_dayofweek(). */
static unsigned char year_to_wday[200];

int get_dayofweek(int date)
{
    // extract day/yr/mn
    int day = date % 100;
    int month = (date % 10000) / 100;
    int year = date / 10000;
    int leapyear = (year & 3) == 0 && ((year % 100) != 0 || (year % 400) == 0);
    /* Days preceding the 1st of each month; index 0 is unused padding. */
    static const unsigned short int mon_yday[2][13] =
    {
        /* Normal years. */
        { 0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 },
        /* Leap years. */
        { 0, 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335 }
    };

    /* Outside the cached range, fall back to the slow path. */
    if (year < 1900 || year >= 2100)
        return real_get_dayofweek (date);
    return (year_to_wday[year - 1900] + mon_yday[leapyear][month] + day - 1) % 7;
}
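void init_year_to_wday (void)
{
    int i;

    /* Cache Jan 1st's weekday for 1900..2099. mktime must be able to
       represent these years; a 32-bit time_t only spans 1901..2038. */
    for (i = 0; i < 200; i++)
        year_to_wday[i] = real_get_dayofweek(i * 10000 + 19000101);
}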
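For the variant with no mktime calls at all, a sketch along the following lines should do. It is not what the timing above measured. arith_get_dayofweek is a hypothetical name, and the month table is the well-known congruence usually attributed to Tomohiko Sakamoto, swapped in here for the per-year cache. It assumes valid Gregorian dates in the same year*10000+month*100+day format.

```c
/* Hypothetical mktime-free variant: pure integer arithmetic on a
   YYYYMMDD date, valid for the proleptic Gregorian calendar. */
int arith_get_dayofweek(int date)
{
    /* Per-month offsets of the table-based congruence (Sakamoto). */
    static const int month_offset[12] =
        { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
    int day = date % 100;
    int month = (date % 10000) / 100;
    int year = date / 10000;

    /* Count Jan and Feb as part of the previous year so that the
       leap day is accounted for only from March onwards. */
    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400
            + month_offset[month - 1] + day) % 7;   /* Sun=0, Sat=6 */
}
```

For example, arith_get_dayofweek(20050607) returns 2 (Tuesday). It needs no init call and no table per year, so it also works outside 1900..2099.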
Jakub