This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [open-source] Re: Wish for 2002 ...


 Fri, Jan 11, 2002 at 16:55:51, eggert (Paul Eggert) wrote about "Re: [open-source] Re: Wish for 2002 ...": 

> > These functions were designed to be used in *real* world
> 
> They facilitate sloppy code.  They have little or no redeeming value.
> 
> True, there is a lot of sloppy code in the world -- but we don't have
> to encourage more of it.

No. You missed one important point. Good programmer in good programming
environment doesn't use strcpy/strcat or use them quite rarely in limited
cases; he can use memcpy() instead of strcpy() when length and fitting
in buffer are known (consider Linus' example again;)), he can track first
free position in case of concatenating series, and so on.
But with bad programmer, or bad environment, or some external conditions,
generated code falls to bunches of str*cpy()/str*cat(), often without
overrun checking. "Wide programmer masses" (sorry for marxism template)
wait for a long time an instrument which allows to work with strings
providing following requirements:
1) It really and easily insures from buffer overflow.
2) It really and easily helps to detect string truncating in buffer.
3) Its efficiency isn't rather worse than strcpy/strcat one.
4) The code can be easily converted to use it from strcpy/strcat,
in most cases.

Folks want such instrument, and it appeared. No matters than Linus says
their code is ugly; they don't care of its words, they even don't remember
who is Linus. No matters that you, Paul Eggert, say strlcpy/strlcat isn't
really needed and is bad; they ignore you. No matters that code with bunches
of str*cat() thrashes processor cache; computer hardware is now cheap
and becomes cheaper and cheaper too quickly. As twenty years ago their
choice was the worst programming language in the whole world, named "C",
now they want oak-hard security. strlcpy() and strlcat() are this
instrument in the brain-damaged world of nul-terminated strings. It
exists now, it spreads now wide, commercial vendors include it to their
libraries; and it wasn't invented by you or glibc team, but by OpenBSD team.
Let you please look to truth's eyes: they win, and you lost.
That who woke up earliest, takes common shoes.
The pressure to include strlc*() to glibc, such as current one, will remains
permanently while C and Unix lives, and you should include them or
get used to exist with this pressure.

Come to sinful land from your elephant bone towers. Your glibc isn't written
for you yourselves, or for another gurus; it is written for all.

> > It can be even calculated why Paul Eggert printed the same this
> > piece: it is most ugly in the whole code, and it is the code where
> > strlcat() is used only for paranoia, not for real need.
> 
> I made no special attempt to find ugly code.  I simply did a text
> search of OpenSSH_3.0.2p1 (egrep -n 'strlcpy|strlcat' `find * -type f
> -print|sort`, to be precise) and reported the first use of
> strlcpy/strlcat that I found, where the code was not obviously
> questionable.  This just happened to be line 64 of auth-skey.c.
> The obviously questionable uses of strlcpy/strlcat silently truncate
> data, and this silent truncation could cause the code to misbehave.

And your search method had to find the one ugly place which hasn't real need
to use strlcat() even for amateur. This was caused by your method, not
by openssh code.
Places you call questionable are the places really should be considered
for pluses and minuses of strlc*() use and misuse. Not the scholar example
you showed first time.

> For example, the very first use of strlcpy/strlcat (using the same
> search algorithm as above) is in lines 138-139 of auth-krb4.c, as follows:
> 
> 		strlcpy(phost, (char *)krb_get_phost(localhost),
> 		    sizeof(phost));
> 
> Now, phost is of size INST_SZ, which is 40 (on OpenBSD 2.9 at least; I
> assume other krb4 implementations are similar).  So, if the Kerberos
> ticket-granting instance name is 40 bytes or longer, this code
> silently misbehaves.  Possibly this misbehavior can lead to a security
> hole, and possibly not; I haven't checked.

AFAIR both krb4 implementations (MIT & KTH) can be used by openssh has
shortest tickets, strongly. Whether openssh should protect from possibly
longer ticket from krb4 implementation or from manual root actions? This
isn't good example. Better one should be constructed, IMO, considering
strlcpy/strlcat gathering file paths; there are really suspicious places
in the code.

> I have not done an exhaustive search for problems with strlcpy/strlcat
> in OpenSSH. 

They are not problems with strlcpy/strlcat. They are problems with misuse
of means provided by these functions, and, possibly, misuse of dynamic
buffers.

> I've examined only their first use (where strlcpy is used
> in a questionable way, and for all I know could be exploited by an
> attacker), and on their first use that is not questionable (where the
> strlcat calls are technically inferior to the standard alternatives).
> However, my brief impression from briefly looking at the "egrep"
> output is that these two examples are typical.  If so, OpenSSH
> provides evidence against the use of strlcpy/strlcat.

They are hardly nesessary in openssh. They are useful most elsewhere.

 Fri, Jan 11, 2002 at 23:05:47, aoliva (Alexandre Oliva) wrote about "Re: [open-source] Re: Wish for 2002 ...": 

> > requires programmer to keep real buffer size in sight.
> That's the error in the approach.  As written before, the GNU project
> recommends programs to avoid arbitrary limitations such as fixed-size
> buffers.  So strlcat/cpy are contrary to the GNU way.

Not all buffers are required to be of arbitrary size even with this approach.
Not all buffers are allowed to be of arbitrary size even with this approach.
Consider, e.g., buffer of outgoing NNTP command. One must not make
such command longer than 512 bytes, including terminating "\r\n".
There is no need to allocate million-byte string for NNTP command only
to check that it is too long; buffer of fixed size works here better.
Similarly, one must not overflow 1024-bytes path buffer, unless special
approach to reach longer path is used, as libtwonyeight implements.
There are many such examples.

And even when the case is ready for such GNU recommendation, not all programs
are of GNU. Unless you are Richard Stallman or his fanatical follower,
you don't consider glibc as environment only for GNU software.

> > Unless C were muted to support reliable strings, at least similar to
> > C++ std::string class, programmers will use dangerous strcpy/strcat
> > stuff and make errors in this use.
> If that's what programmers will use, what's the point of adding
> strlcpy/cat? :-P :-D

The point is that with such programming style (considered bad by gurus;)),
strlcpy/strlcat really helps to write correct code.

> > that matters it helps against bugs
> Against buffer overflow bugs, no more.  And it will help introduce
> another class of bugs that fixed-size buffers will never be able to
> overcome.

They don't introduce it. They can be used in such way.
strcpy/strcat can be also used in such way. But strlcpy/strlcat provides
an easy way to check overruns, while being kept in the same "bad" common
way, in contrary to strcpy/strcat which doesn't allow it without direct
memchr() or strnlen() call. And when memchr() detected overflow, it
already happened and the only action allowed to good program is abort().

> One approach I like very much is that of stralloc, that I first saw in
> Amanda code.  Concatenating multiple strings into a newly-allocated
> buffer created with the right size is as simple as stralloc(string1,
> string2, ..., NULL);

asprintf() does the same and with parameter checking, with little drawback
of runtime parsing of "%s%s%s%s%s" ;)

> It's an unfortunate thing that, if the NULL is left out, you'll not
> get a hard error at compile time, and if the chunk of code isn't
> exercised very often, you may end up with an exploitable run-time
> crash.  I still find this better than being connected to
> ahost.mysite.com instead of ahost.mysite.com.br just because the
> implementor chose a 24-byte buffer and used strlcat to concat
> "http://"; to the FQDN host name.

I can't agree with your example because I can't imagine even in nightmare
a reason to choose 24 for URL buffer length.;))


/netch


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]