This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
This is from Gary Miller....

gwm@us.ibm.com wrote:
>
> Mark,
>
> The distinction between the fprintf family of functions and the fwprintf
> family of functions is that the fprintf functions' output is bytes and the
> fwprintf output is wide characters (wchar_t). The fwprintf wide character
> (wchar_t) data are printed as if having been processed by fputwc(), but the
> fundamental difference between the two families of functions is that one is
> "byte" oriented and the other is "wide character (wchar_t)" oriented.
>
> The metric for the output of the fprintf family of functions for
> padding, precision, etc. should be bytes.
>
> The metric for the output (intermediate wide character sequence) of the
> fwprintf family of functions for padding, precision, etc. should be wide
> characters (wchar_t).
>
> This may seem counterintuitive, but it is the only way to have a rational
> distinction of the functions. One of the things that might seem strange is
> that one can have a differing number of bytes of printed output for the
> fwprintf functions after fputwc() has been applied depending on the
> underlying multibyte codeset of the locale: consider the differences
> between SJIS and EUC-JP.
>
> BTW, because of the differences among codesets (SJIS, EUC-JP, EUC-TW,
> EUC-CN, 8859-x, UTF-8), I believe that it is a bad idea to attempt to apply
> precision operations on character (byte) strings. Precision operations on
> wide character strings should have "consistent" visual output -- the
> underlying data buffers will have varying numbers of bytes due to the
> differences in codesets after the data has been processed by fputwc().
>
> Gary W. Miller           Phone: (512) 838-8297
> IBM 2BCA/903 ZIP 9350    T/L: 678-8297
> 11400 Burnet Road        FAX: (512) 838-0169
> Austin, Texas 78758      Internet: gwm@us.ibm.com
>
> Mark Brown/Austin/IBM@IBMUS on 09-27-2000 11:04:20 PM
>
> Please respond to Mark Brown/Austin/IBM@IBMUS
>
> To: Gary W Miller/Austin/IBM@IBMUS
> cc:
> Subject: [Fwd: reopening fprintf I18N issue]
>
> Gary
>
> FYI
>
> Mark
>
> Ulrich Drepper wrote:
> >
> > Sorry for the late reply, I'm still catching up.
> >
> > > This is an incorrect interpretation. The printf() class of functions
> > > is always _byte_based_; a char is a byte in ISO C. Note that there
> > > is no "l" (ell) qualifier present (SUSv2), thus the argument to %s
> > > is to be a pointer to an array of char (ISO C) -- this is because it
> > > is going to be treated as bytes. Note the "if precision....that many
> > > bytes are written" sentence.
> >
> > I believe you that this is what current implementations do because it
> > is what one expects from a non-locale-aware implementation.
> >
> > > As to what is going on in the test results, let me add something:
> > >
> > > [FAIL] printf([%-6.1s],??????)
> > >   sys[[<SPC><SPC><SPC><SPC><SPC><SPC>]] != exp[[<SPC><SPC><SPC><SPC><SPC>]]
> > >                                                 ^
> > >                                     There should be an
> > >                                     undisplayable single byte
> > >                                     here if you look at the
> > >                                     actual output!
> >
> > There is none in the files I got, and that is good. The problem is
> > that if this byte were there, the entire output would be unusable. I
> > just changed the code to implement it the way you suggest and now I
> > cannot even use iconv anymore. I cannot imagine that this is what
> > people want to use.
> >
> > This leaves in my opinion only two ways out:
> >
> > - just like the test output I have, the byte is simply omitted. This
> >   has the big drawback that now the output precision is not honored in
> >   some cases and string concatenation etc. might fail because junk
> >   characters are included in the string.
> >
> > - do it the way I've implemented it. It always provides a usable output.
> >
> > I do not really know what to do. Writing out garbage bytes seems much
> > worse than diverging from the behavior of other implementations.
> >
> > > [FAIL] printf([%-6.3s],??????)
> > >   sys[[?<SPC><SPC><SPC><SPC>]] != exp[[?<SPC><SPC><SPC>]]
> > >                                         ^
> > >                             a single byte here as well.
> >
> > This byte is not present here either. I guess your Japanese guys are
> > agreeing with me that this additional byte is bad.
> >
> > > Now, onward to the swprintf() issue. Gary thinks the spec here is horribly
> > > muddled, and that both the test and glibc are doing the wrong thing. We are
> > > going to submit an aardvark to Austin Group on this. For what it is worth,
> > > glibc is closer to Gary's expected behavior.
> >
> > Meanwhile, I've got some comments from the original author of the amd1
> > specs. His intentions were a bit different from what I had
> > implemented, and this is what I've changed now. I think my implementation
> > is now in line with what ISO C99 is intended to be.
> >
> > --
> > ---------------.                          ,-.   1325 Chesapeake Terrace
> > Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
> > Red Hat          `--' drepper at redhat.com   `------------------------
>
> --
> Mark S. Brown                          bmark@us.ibm.com
> Senior Technical Staff Member          512.838.3926  T/L678.3926
> IBM RS/6000 AIX System Architecture    Mark Brown/Austin/IBM
> IBM Corporation, Austin, Texas

--
Mark S. Brown                          bmark@us.ibm.com
Senior Technical Staff Member          512.838.3926  T/L678.3926
IBM RS/6000 AIX System Architecture    Mark Brown/Austin/IBM
IBM Corporation, Austin, Texas
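
The byte versus wide-character metric argued over in this message can be illustrated with a short C sketch. This is not code from the thread: the helper names `format_bytes` and `format_wide` are hypothetical, and the sketch assumes an implementation in which the precision of `%s` is counted in bytes (the behavior Mark and Gary describe for the printf family) while the precision of `%ls` in the wide functions is counted in wchar_t elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: format a multibyte (char) string with "%-6.1s".
   Assuming byte-based precision, only the first byte of the string is
   copied, even if that splits a multibyte character, and the field is
   then padded on the right to a width of six bytes. */
int format_bytes(char *out, size_t size, const char *s)
{
    assert(out != NULL && s != NULL);
    return snprintf(out, size, "[%-6.1s]", s);
}

/* Hypothetical helper: format a wide string with L"%-6.1ls".
   In the wide-character family, precision and field width count
   wchar_t elements, so exactly one whole character is copied. */
int format_wide(wchar_t *out, size_t size, const wchar_t *ws)
{
    assert(out != NULL && ws != NULL);
    return swprintf(out, size, L"[%-6.1ls]", ws);
}
```

Calling `format_bytes` on a UTF-8 string whose first character occupies three bytes would leave a lone lead byte in the output, which is the "undisplayable single byte" the test expects and the garbage Ulrich objects to. Calling `format_wide` on the equivalent wide string keeps the character whole; how many bytes it becomes once written to a stream then depends on the locale's codeset, as Gary notes.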