This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: UCS data encoding in localedata


2012/4/13 Petr Baudis <pasky@ucw.cz>:
> Hi!
>
> On Fri, Apr 06, 2012 at 05:37:22AM -0400, Carlos O'Donell wrote:
>> On Fri, Apr 6, 2012 at 4:18 AM, Petr Baudis <pasky@ucw.cz> wrote:
>> > Does anyone know the technical reason for using the explicit <U0000>
>> > UCS encoding in localedata instead of some sane approach like UTF8
>> > encoded data? I can think of only historical reasons due to the lack
>> > of support in tools (OS, editors, VCS, ...) in the past, however I
>> > believe that by now, using UTF8 should be fairly safe.
>>
>> Yes, you are probably correct.
>>
>> The only other problem I can think of is that you'd be adding a
>> circular dependency between a tool that uses this data and at the same
>> time edits the data.
>
> Hmm, I don't quite follow. You should be able to edit UTF8-encoded
> data even in a non-locale aware editor, I think?

Correct.

> I can imagine that working with these patches may be painful; I'm not
> sure how bugzilla and MUAs would deal with all this. A good compromise
> would be to write at least the ASCII data in plain form; I think that
> would not break anything at all? We could start the transition
> gradually, with new entries using this convention.

My MUA was able to parse UTF-8 data just fine, in fact it made
reviewing the recent Chinese/regexp bug pretty easy, and the test for
that bug uses UTF-8 data right in the test. So you already need a
UTF-8 aware editor to edit *that* testcase.

I'm happy with your suggestion, but I don't know what the next step is
or what infrastructure changes need to happen to enable that next
step.

At a minimum we are going to need a wiki page documenting the tools
available for developers to edit these kinds of files and how to
view/edit/generate the characters within them.
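
As a starting point for such a page, here is a minimal Python sketch of
the kind of helper it could describe. This is not an existing glibc
tool; the function names (decode_ucs, encode_ucs) and the ascii_only /
keep_ascii options are hypothetical. It assumes the <UXXXX> notation
uses four hex digits for BMP characters and eight for anything above.
In the "ASCII in plain" mode it only touches ASCII letters and digits,
on the assumption that quotes, angle brackets, and the escape and
comment characters must stay escaped to keep the file parseable.

```python
import re

# Matches the symbolic <UXXXX> / <UXXXXXXXX> notation used in localedata.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def _is_plain_ascii(cp):
    """True for ASCII letters and digits only (an assumed 'safe' subset)."""
    ch = chr(cp)
    return cp < 0x80 and ch.isalnum()


def decode_ucs(text, ascii_only=False):
    """Replace <UXXXX> escapes with the characters they name.

    With ascii_only=True, only ASCII letters and digits are decoded and
    every other escape is left untouched.
    """
    def repl(match):
        cp = int(match.group(1), 16)
        # Leave out-of-range values and surrogates as they are.
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return match.group(0)
        if ascii_only and not _is_plain_ascii(cp):
            return match.group(0)
        return chr(cp)

    return _UCS.sub(repl, text)


def encode_ucs(text, keep_ascii=False):
    """Turn each character of a string value into <UXXXX> notation.

    With keep_ascii=True, ASCII letters and digits are written as-is.
    Pass only the string contents, not a whole locale source line.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if keep_ascii and _is_plain_ascii(cp):
            out.append(ch)
        elif cp > 0xFFFF:
            out.append("<U%08X>" % cp)
        else:
            out.append("<U%04X>" % cp)
    return "".join(out)
```

For example, decode_ucs("<U0041><U010D>") gives the two characters
"A" and U+010D as real text for viewing in a UTF-8 aware editor, while
decode_ucs(..., ascii_only=True) rewrites only the "A" and leaves
<U010D> in place, which corresponds to the gradual compromise
suggested above.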

Cheers,
Carlos.

