This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] Make getenv O(1)
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: Rich Felker <dalias at aerifal dot cx>, libc-alpha at sourceware dot org
- Date: Fri, 18 Oct 2013 14:40:13 +0200
- Subject: Re: [RFC] Make getenv O(1)
- Authentication-results: sourceware.org; auth=none
- References: <20131014155229 dot GA25159 at domone dot podge> <20131014165411 dot GZ20515 at brightrain dot aerifal dot cx> <20131014172852 dot GA26005 at domone dot podge> <52611F60 dot 3080700 at redhat dot com>
On Fri, Oct 18, 2013 at 01:45:36PM +0200, Florian Weimer wrote:
> On 10/14/2013 07:28 PM, OndÅej BÃlka wrote:
>
> >>In general it's impossible to use anything but linear search because
> >>the application can modify environ[] directly (although it's
> >>questionable whether this is permissible by the standard)
>
> >Linear search applies only when environment variable is not present.
> >When it is present then getting its index in environ and checking that
> >it matches should be O(1).
>
> I think we should aim for a totally different API that also fixes
> the races/memory leaks instead of tweaking the existing
> implementation. Unfortunately, the "secure" API in C11 doesn't
> address our issues.
>
A problem here is with compatibility, what when program uses new
interface but libraries old one.
> Couldn't an earlier entry shadow the entry at the index?
>
Not if it is implemented properly like in following sketch. A
following cache should work if we modify setenv to save environ
capacity at environ[-1].
I made cache self-adjusting. It should handle frequent entries well.
diff --git a/stdlib/getenv.c b/stdlib/getenv.c
index f33c22f..b339cf8 100644
--- a/stdlib/getenv.c
+++ b/stdlib/getenv.c
@@ -22,6 +22,34 @@
#include <string.h>
#include <unistd.h>
+struct env_cache {
+ int idx;
+ size_t len;
+}
+
+struct env_cache envcache[256];
+
+char *
+cache_get (char *s)
+{
+ char *ep;
+ struct env_cache e = envcache[get_hash (s) % 256];
+ if (__environ == lastenv && e.idx > (size_t) __environ[-1])
+ return NULL;
+ ep = __environ[e.idx];
+ if (!strncmp (s, ep, e.len) && ep[e.len] == '=')
+ return ep;
+ return NULL;
+}
+void
+cache_set (char *env, char *s, int idx)
+{
+ int h = get_hash (s) % 256;
+ envcache[h].idx = idx;
+ envcache[h].len = strchr (s, '=') - s;
+}
+
+
/* Return the value of the environment variable NAME. This implementation
is tuned a bit in that it assumes no environment variable has an empty
@@ -33,6 +61,10 @@ char *
getenv (name)
const char *name;
{
+ char *cached = cache_get (name);
+ if (cached)
+ return cached;
+
size_t len = strlen (name);
char **ep;
uint16_t name_start;
@@ -59,7 +91,10 @@ getenv (name)
| (((unsigned char *) *ep)[1] << 8));
#endif
if (name_start == ep_start)
- return &(*ep)[2];
+ {
+ cache_set (name, ep - environ);
+ return &(*ep)[2];
+ }
}
}
else
@@ -84,7 +119,10 @@ getenv (name)
if (name_start == ep_start && !strncmp (*ep + 2, name, len)
&& (*ep)[len + 2] == '=')
- return &(*ep)[len + 3];
+ {
+ cache_set (name, ep - environ);
+ return &(*ep)[len + 3];
+ }
}
}