This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug localedata/3405] sort order on pt_BR


------- Additional Comments From email_daniel_h at yahoo dot com dot br  2010-06-27 14:19 -------
Hi, everybody. First of all, I apologize for my poor writing skills. English is
not my native language.

The pt_BR sort order seems odd to me. If this behavior is not a bug, I agree
with Keld's suggestion: define a new locale, such as pt_BR@abnt, that uses the
"right" sort order.

Can the sample reorder statement handle lowercase and uppercase properly?
Without the suggested change to the locale definition file, the result of a
sort does not:

LC_ALL=pt_BR LANG=pt_BR LANGUAGE=pt_BR sort a.txt 
gabriela heleda de souza
GABRIELA HELEDA DE SOUZA
gabriela jacoby nos
GABRIELA JACOBY NOS
gábriela jacoby nos
GÁBRIELA JACOBY NOS 
gabriel alcides klim perondi
GABRIEL ALCIDES KLIM PERONDI
gábriel alcides klim perondi
GÁBRIEL ALCIDES KLIM PERONDI
gabriela leticia batista nunes
GABRIELA LETICIA BATISTA NUNES
gabriel alexandre da silva manica
GABRIEL ALEXANDRE DA SILVA MANICA


The expected output:
gabriel alcides klim perondi
gábriel alcides klim perondi
gabriel alexandre da silva manica
gabriela heleda de souza
gabriela leticia batista nunes
gabriela jacoby nos
gábriela jacoby nos 
GABRIEL ALCIDES KLIM PERONDI
GÁBRIEL ALCIDES KLIM PERONDI
GABRIEL ALEXANDRE DA SILVA MANICA
GABRIELA HELEDA DE SOUZA
GABRIELA LETICIA BATISTA NUNES
GABRIELA JACOBY NOS
GÁBRIELA JACOBY NOS 


This is "tricky" because we don't just perform a lexicographic comparison of
each character (a Portuguese-speaking Java user will be happy to know that
String.compareTo is not enough to produce the sorted result that he expects,
for several reasons).
We first sort ignoring accents, then we use the accents as a tiebreaker
between full names that are otherwise equal. In the first step, a = á, but in
the second step, a < á.
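
The two-step comparison described above can be sketched roughly as follows.
This is only an illustration, not the glibc collation algorithm and not a
rendering of the ABNT standard: the helper names strip_accents and
two_step_key are hypothetical, and the sketch assumes the input is all
lowercase, so it says nothing about how lowercase and uppercase should be
ordered relative to each other.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining marks removed (so 'á' becomes 'a')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def two_step_key(text):
    """Hypothetical sort key: unaccented form first, accented form as tiebreaker."""
    composed = unicodedata.normalize("NFC", text)
    # Step 1: compare with accents ignored (a = á).
    # Step 2: only if step 1 ties, compare the original text (a < á by code point).
    return (strip_accents(composed), composed)


names = [
    "gábriela jacoby nos",
    "gabriela jacoby nos",
    "gábriel alcides klim perondi",
    "gabriel alexandre da silva manica",
    "gabriel alcides klim perondi",
    "gabriela heleda de souza",
]

for name in sorted(names, key=two_step_key):
    print(name)
```

For this lowercase subset the accented names land directly after their
unaccented twins, and "gabriel ..." sorts before "gabriela ..." because the
space compares lower than any letter.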


Well, that is all I know. I will try to get a copy of the ABNT standard
NBR 6033:1989 (NB 106) to confirm (or not :-)) these examples.

Thanks.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=3405


