This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: setenv problems


Hi!

In fact, there are other applications that use newlib, and even ports of
newlib to other platforms, that use putenv() incorrectly, for example by
passing pointers to stack buffers, or by passing allocated pointers and
freeing them after the putenv() call. However, I believe that incorrect
use of the API is not a good reason to avoid fixing the implementation
to do the right thing. newlib is supposed to help with porting existing
applications to embedded platforms. If an application depends on
putenv() (or anything else) working correctly, porting it would
involve fixing code and tinkering with application logic. Whereas if
an application, or a port, is developed against newlib and uses the
APIs incorrectly, its developers should expect changes like that.
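
To illustrate the putenv() point, here is a minimal sketch, assuming a
POSIX-conforming putenv() that stores the caller's pointer rather than a
copy. The names demo_entry, demo_register and demo_mutate are made up for
this example and are not part of newlib.

```c
#define _XOPEN_SOURCE 600   /* putenv() is an XSI function */
#include <stdlib.h>
#include <string.h>

/* Static storage stays valid for the life of the process, so it is safe
   to hand to an implementation that keeps the pointer itself. */
static char demo_entry[] = "DEMO_VAR=first";

/* Hypothetical helper: add the static string to the environment. */
int demo_register(void)
{
    return putenv(demo_entry);
}

/* Hypothetical helper: overwrite the value in place. If putenv() stored
   the pointer, getenv("DEMO_VAR") now sees the new value; if it stored a
   copy, the environment is unchanged. */
void demo_mutate(void)
{
    memcpy(demo_entry + strlen("DEMO_VAR="), "later", 5);
}
```

After demo_register() and demo_mutate(), getenv("DEMO_VAR") should give
"later" where the pointer is stored, and "first" where a copy is made, as
in the behavior Craig describes below. The same aliasing is why passing a
stack buffer, or freeing the string after the call, is a bug against a
conforming implementation.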

Thanks,
  Pawel.

On Tue, Sep 23, 2008 at 9:05 AM, Howland Craig D (Craig)
<howland@lgsinnovations.com> wrote:
>
>     While the general topic of problems with setenv() and getenv() is
> being discussed, I think that I might as well mention another related
> problem that I noticed a few months ago.  That is that putenv() does not
> comply with the POSIX definition for it.  Refer to
> http://www.opengroup.org/onlinepubs/009695399/functions/putenv.html
>     Specifically, putenv() makes a copy of the string given to it to
> add to the environment rather than entering the given pointer to the
> environment vector as required.  (The definition does not use those
> exact words, but it is what is needed:  "the string pointed to by string
> shall become part of the environment, so altering the string shall
> change the environment.")  The copy is done indirectly because putenv()
> ends up calling _setenv_r(), which is defined to put copies of the
> strings into the environment.
>     According to the change log, this incorrect behavior has been there
> since the beginning (as is also implied by the copyright notices without
> added comments as to the changes).  A question arises as to whether the
> behavior ought to be changed to match POSIX or if it should be left as
> it is in case people are counting on what it has been doing for years,
> in which case a note could be added to the man information pointing out
> the discrepancy.
>     There is also one more item.  I noticed when looking over the code
> related to the problems reported by Pawel Veselov that started this
> chain that the implementations have a memory leak problem, too.
> _unsetenv_r() deletes pointers from the environment, but it does not
> free the memory associated with the variable being deleted.  At the
> moment this is the right behavior as there is nothing to track whether
> each entry has been malloced or not.  If one were to assume that only
> setenv() were used to create the environment, then free() could be
> used--given the present incorrect implementation of putenv().  But if
> putenv() were to be fixed, then the present not-free approach is the
> only valid one unless additional book-keeping were added.
>     It does seem like the problems that Pawel points out ought to be
> fixed.  The ones that I'm pointing out are trickier, making me wonder if
> the sleeping dog should be left to lie or not.  (As I pointed out above,
> there's a question of the paper definition versus the definition of how
> it has been working.  I have not done any tests to see what actually is
> done by some of the systems that I have access to.)  What approach ought
> to be taken?  (I could supply patches for as many of the problems in
> this chain as needed, if need be.)
>                                Craig Howland
>
> -----Original Message-----
> From: newlib-owner@sourceware.org [mailto:newlib-owner@sourceware.org]
> On Behalf Of Jeff Johnston
> Sent: Monday, September 22, 2008 3:13 PM
> To: Pawel Veselov
> Cc: Joel Sherrill; newlib@sourceware.org
> Subject: Re: setenv problems
>
> Pawel Veselov wrote:
>> On Mon, Sep 22, 2008 at 10:18 AM, Joel Sherrill
>> <joel.sherrill@oarcorp.com> wrote:
>>
>>> Hi,
>>>
>>> It is not directly stated on the getenv() page at opengroup.org but is
>>> in the section on environment variables that '=' is not to appear in
>>> an environment variable name.
>>>
>>> http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html
>>>
>>>> These strings have the form /name/=/value/; /name/s shall not contain
>>>> the character '='. For values to be portable across systems conforming
>>>> to IEEE Std 1003.1-2001, the value shall be composed of characters from
>>>> the portable character set (except NUL and as indicated below). There
>>>> is no meaning associated with the order of strings in the environment.
>>>> If more than one string in a process' environment has the same /name/,
>>>> the consequences are undefined.
>>>>
>>
>> http://www.opengroup.org/onlinepubs/000095399/functions/setenv.html
>> says that setenv() should fail with EINVAL if name contains an equal
>> sign.
>>
>>
> Ok, also considering the linux man pages state the same and the function
> isn't specified by ANSI.  Would you like to try your hand at a patch?
>
> -- Jeff J.
>>> I don't see any requirement on what getenv() should
>>> do if the name string contains an equal. The case where
>>> the first character is '=' could just as easily be interpreted
>>> as an empty name string and thus an error.
>>>
>>
>> getenv() with a name that contains an equal sign would inevitably
>> fail by returning an empty string, because no name can contain an equal
>> sign. As there is no specification of an alternative behavior in such a
>> case, the string can be interpreted literally in all cases (at least
>> that is allowed). However, the only problem with getenv() that I found
>> is that it succeeds when the name contains an equal sign and the
>> character sequence before the equal sign matches a name that exists
>> in the environment, which I believe is improper.
>>
>> Regarding the "first character being an equal sign" issue, the only
>> problem with that is that for some reason setenv won't accept values
>> that start with such a character (well, it will accept them, but will
>> eat up the first, and only the first, equal sign).
>>
>>
>>> As far as I can tell the behavior is undefined.
>>>
>>> Jeff?
>>>
>>> --joel
>>>
>>> Pawel Veselov wrote:
>>>
>>>> Hi,
>>>>
>>>> while looking through the cegcc project, I discovered a few issues
>>>> with the setenv() (_setenv_r) and getenv() (_getenv_r) functions:
>>>>
>>>> 1. In the beginning of the function, the pointer to the value string
>>>> is shifted if the string starts with '='. The comment says that is to
>>>> prevent values from starting with '='. I couldn't find anywhere in the
>>>> definition of setenv() that a value may not start with an equal
>>>> character. Any reason for eating the first equal character up?
>>>>
>>>> 2. The setenv() man pages seem to ask for returning EINVAL in case
>>>> there is an equal character inside the name of the variable. They also
>>>> say that glibc versions do allow environment variable names with
>>>> equal signs in them; however, the current implementation of
>>>> environment variables in newlib doesn't seem to be able to distinguish
>>>> between those. (So, if you set a variable named (a=b) with value (c),
>>>> getenv for (a) will return (b=c).) So I believe newlib's setenv
>>>> should return EINVAL.
>>>>
>>>> 3. The getenv() implementation searches for the variables that have
>>>> the name as passed, unless the name contains an equal sign, in
>>>> which case only the characters before the equal sign are searched
>>>> for. So if you call setenv("foo", "bar", 1) and then
>>>> call getenv("foo=grape"), you will get "foo=bar" as the result of that
>>>> getenv call.
>>>>
>>>> Since cegcc project uses newlib for base library support, I'm
>>>> cross-posting to both lists. Would appreciate any comments on the
>>>> above. I can produce a patch that restricts both functions to POSIX
>>>> behavior.
>>>>
>>>> Thanks,
>>>>  Pawel.
>>>>
>>>>
>>>> --
>>>> With best of best regards
>>>> Pawel S. Veselov
>>>>
>>>>
>>> --
>>> Joel Sherrill, Ph.D.             Director of Research & Development
>>> joel.sherrill@OARcorp.com        On-Line Applications Research
>>> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>>  Support Available             (256) 722-9985
>>>
>>>
>>>
>>>
>>
>> Thanks,
>>   Pawel.
>>
>
>
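
On item 2 above, here is a minimal sketch of the name check, written as a
wrapper around the platform's setenv() rather than as a patch to
_setenv_r(). checked_setenv is a hypothetical name used only for
illustration; the rule it applies (fail with EINVAL when the name is
NULL, empty, or contains '=') is the one from the POSIX setenv() page
cited above.

```c
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper, not newlib code: reject names that POSIX
   disallows, then hand everything else to setenv() untouched. The value
   is passed through as-is, including a leading '='. */
int checked_setenv(const char *name, const char *value, int overwrite)
{
    if (name == NULL || *name == '\0' || strchr(name, '=') != NULL) {
        errno = EINVAL;
        return -1;
    }
    return setenv(name, value, overwrite);
}
```

With this check, a name such as "a=b" is refused up front, so the
ambiguity between (a=b, c) and (a, b=c) never reaches the environment.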

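On item 3, the lookup I would expect matches the whole name and nothing
else. A rough sketch follows, assuming a NULL-terminated vector of
"name=value" strings supplied by the caller. env_lookup is a made-up
helper and does not reflect how _getenv_r() is actually written.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the value of `name` in a
   NULL-terminated vector of "name=value" strings, or NULL if there is no
   entry whose name matches exactly. A name containing '=' never matches. */
const char *env_lookup(char *const envp[], const char *name)
{
    size_t len;

    if (name == NULL || *name == '\0' || strchr(name, '=') != NULL)
        return NULL;

    len = strlen(name);
    for (; *envp != NULL; ++envp) {
        /* The entry must start with the name, followed directly by '='. */
        if (strncmp(*envp, name, len) == 0 && (*envp)[len] == '=')
            return *envp + len + 1;
    }
    return NULL;
}
```

With an entry "foo=bar" in the vector, looking up "foo" gives "bar",
while "foo=grape" and "fo" both give NULL.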


-- 
With best of best regards
Pawel S. Veselov

