This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: LC_ADDRESS and other locale updates [was: sr_ME locale country_isbn and int_prefix fix]
- From: keld at keldix dot com
- To: Chris Leonard <cjlhomeaddress at gmail dot com>
- Cc: Jakub Bogusz <jakub at bogusz dot priv dot pl>, libc-alpha <libc-alpha at sourceware dot org>, libc-locales at sourceware dot org
- Date: Wed, 28 Aug 2013 01:33:28 +0200
- Subject: Re: LC_ADDRESS and other locale updates [was: sr_ME locale country_isbn and int_prefix fix]
- References: <20130519152227 dot GA20733 at stranger dot qboosh dot pl> <CAHdAataN0GXw5yXxrAR_J+eVckFiM0TDSmKsiaStq7yJ-mj=-w at mail dot gmail dot com> <519D893F dot 70901 at redhat dot com> <20130606081419 dot GA7652 at mail> <51B09452 dot 8040009 at redhat dot com> <20130826185611 dot GA11561 at stranger dot qboosh dot pl> <CAHdAata7caee2Yj7qEJbExVh1L1nn630Ji6YhUWCxRvjYH9WOQ at mail dot gmail dot com> <20130827135844 dot GA32497 at mail> <CAHdAatbdvBRmvneN5V0_mb0OL=SwtWXcipb6WFKJbQF6ZXAjrA at mail dot gmail dot com>
A newer locale specification is available as ISO TR 30112; it
describes most of the keywords that we have implemented in libc.
Otherwise, the older ISO TR 14652 also has most of the information.
I think it would be useful to have a full description of the libc
locale keywords, probably based on the latest 30112 text.
Best regards
keld
On Tue, Aug 27, 2013 at 06:32:45PM -0400, Chris Leonard wrote:
> On Tue, Aug 27, 2013 at 9:58 AM, Jakub Bogusz <jakub@bogusz.priv.pl> wrote:
> > On Mon, Aug 26, 2013 at 05:25:31PM -0400, Chris Leonard wrote:
> >> I looked at your first patch
> >>
> >> Fill in country_{car,isbn} for aa_ET.
>
>
> >> I am not 100% sure, but I was under the impression from looking at
> >> other locales that country_isbn was a simple numeric value and not
> >> quoted and converted into Unicode points.
> >>
> >> Am I correct in this?
> >
> > I can't see any consistency in (upstream) glibc.
> > Among 45 locales containing country_isbn field:
> > - 23 have numeric value unquoted
> > - 11 have numeric value in quotes
> > - 11 have value coded in Unicode points
> >
> > Also note that some countries have more than one ISBN prefix (there is
> > a single case in upstream glibc: the es_CR locale). Such a case (I think)
> > cannot be represented as an unquoted numeric value.
>
> Jakub,
>
> I am sincerely trying to help you land these patches, not just
> nit-picking. I've completed a review of country_car (for example),
> which mostly looks fine. When I run what I think is the same grep
> of the locales dir for "country_isbn", I see similar numbers, but
> interpret them differently.
>
> The 20-odd numeric-only locales include numerous European
> lang_country pairs, the most heavily scrutinized locales.
>
> The 11 quoted numerics at least suggest that numeric is the way to go,
> and they are primarily minority langs in their specified countries,
> locales that receive far less review.
>
> The 11 Unicode-converted locales include 4 without quotes (making it 7
> and 4), and those 4 are all Chinese locales (suggesting the possibility
> of a propagating error).
>
> For me, the preponderance of evidence favors unquoted numerics, but as
> I originally stated, I am not 100% sure.
>
> Obviously "reading tea leaves" to determine the proper format is an
> inexact art at best. I have previously bemoaned the lack of a
> definitive specification for locales and this example points out the
> urgent need for one.
>
> I would love to have the input of a locale old-timer like Keld or one
> of the developers to make a call on the country_isbn format question
> that we can move forward with. In the meantime, I will continue
> looking over the patches' other changes with an eye towards speaking in
> favor of committing them once questions are resolved.
>
> cjl
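
The tally discussed in the quoted messages (unquoted numeric, quoted
numeric, and Unicode-point values of country_isbn, with the Unicode
group further split by quoting) could be reproduced mechanically rather
than by eye. What follows is a minimal sketch, not a glibc tool: the
function names classify_isbn and tally are hypothetical, and it assumes
each country_isbn value sits on a single line and that Unicode points
are written in the <UXXXX> notation used in locale sources.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration only; not part of glibc.
_FIELD = re.compile(r'^\s*country_isbn\s+(.*?)\s*$')
_UCS = re.compile(r'(?:<U[0-9A-Fa-f]{4,8}>)+')   # e.g. <U0038><U0036>
_NUM = re.compile(r'[0-9]+')                     # e.g. 86


def classify_isbn(line):
    """Return the style of a country_isbn line, or None for other lines."""
    m = _FIELD.match(line)
    if m is None:
        return None
    value = m.group(1)
    quoted = len(value) >= 2 and value[0] == '"' and value[-1] == '"'
    inner = value[1:-1] if quoted else value
    suffix = "quoted" if quoted else "unquoted"
    if _UCS.fullmatch(inner):
        return "unicode-" + suffix
    if _NUM.fullmatch(inner):
        return "numeric-" + suffix
    # Anything else, for example a value listing several prefixes.
    return "other"


def tally(lines):
    """Count country_isbn styles over an iterable of locale source lines."""
    counts = Counter()
    for line in lines:
        style = classify_isbn(line)
        if style is not None:
            counts[style] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, not taken from any real locale file.
    sample = [
        'country_isbn 86',
        'country_isbn "86"',
        'country_isbn "<U0038><U0036>"',
        'country_isbn <U0037>',
        'country_car "SRB"',
    ]
    for style, n in sorted(tally(sample).items()):
        print(style, n)
```

Feeding every file in the locales directory through tally would give
per-style counts that can be compared directly with the numbers quoted
above, and listing the files in the "other" bucket would surface cases
such as a locale with more than one ISBN prefix.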