This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
Re: RFA: start of i18n
- From: Tom Tromey <tromey at redhat dot com>
- To: Andrew Cagney <ac131313 at cygnus dot com>
- Cc: gdb-patches at sources dot redhat dot com
- Date: 21 Jun 2002 12:02:22 -0600
- Subject: Re: RFA: start of i18n
- References: <87elf0wnp2.fsf@fleche.redhat.com> <3D1360ED.6030201@cygnus.com>
- Reply-to: tromey at redhat dot com
>>>>> "Andrew" == Andrew Cagney <ac131313@cygnus.com> writes:
Andrew> I guess you're stepping up to take the bat on this one (a nice
Andrew> surprise).
Well, I'm not going to commit to it. But I thought I could make some
useful progress even if I can't do it all.
Andrew> Can you perhaps sketch out exactly what internationalizing
Andrew> GDB is going to entail? So people, including me, can
Andrew> understand what we can expect to be coming our way.
Sure, no problem.
There are many places in gdb that could be made i18n-aware. This
patch only addresses the start of one of them: translating the
messages printed by gdb into the user's native language. The idea
here is that "help foo" would print the help in German, or French, or
Farsi, according to the locale.
In the GNU world this is done with a message catalog tool called
gettext. Gettext works using source markup of the actual text strings
(some tools use tokens or numbers instead; the gettext manual has a nice
rationale for why its approach is superior).
There are basically 3 parts to gettextizing an application:
* The code infrastructure (this patch)
* The build infrastructure. This would be the omnipresent `po'
directory, `.pot' files, and accompanying Makefile hacks. I haven't
done this part for gdb yet. You can look in bfd/, gas/, etc., to see
how this is done. (Actually I suppose there are several plausible
ways to do it, but for gdb it seems easiest to follow what the rest
of the binutils do.)
* Marking all the strings. This is the bulk of the work. The idea
here is that whenever gdb prints a message:
    printf ("Hi\n");
You must wrap the string so that (1) xgettext can extract the string
and put it into the message catalog, and (2) at runtime the string
will be translated:
    printf (_("Hi\n"));
In gdb we'll be using the `gettext' function to do this
translation. But it is typical in GNU programs to define `_' as a
macro instead. This is shorter and doesn't confuse the source as
much. (My patch adds this macro.)
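
For anyone who hasn't seen the idiom, here is a minimal sketch of
what the `_' macro and the startup calls usually look like in a GNU
program. This is not the text of my patch: the helper names
(init_i18n, say_hi), the "gdb-example" domain, and the locale
directory are all made up for illustration. Only gettext, setlocale,
bindtextdomain, and textdomain are real library calls.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Conventional GNU shorthand: _("text") both marks the string so
   xgettext can extract it and translates it at run time.  */
#define _(String) gettext (String)

/* Hypothetical startup helper.  The domain name and catalog
   directory are placeholders, not values taken from gdb.  */
void
init_i18n (void)
{
  /* Pick up the user's locale from the environment.  */
  setlocale (LC_ALL, "");
  /* Tell gettext where the compiled catalogs for this domain live.  */
  bindtextdomain ("gdb-example", "/usr/share/locale");
  /* Make this domain the default for plain gettext () calls.  */
  textdomain ("gdb-example");
}

/* Hypothetical caller: the message is looked up in the catalog, and
   the original English string is used if no translation exists.  */
void
say_hi (void)
{
  printf (_("Hi\n"));
}
```

If no catalog is installed for the current locale, gettext simply
returns the string it was given, so marked code keeps working in
English before any translations exist.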
I haven't investigated this particular issue in gdb in too much
depth. So for instance I don't know how marking will or will not
affect MI.
I do know, from the last time I looked at gettextizing gdb (that
would have been April 98) that parts of gdb aren't i18n-safe. For
instance, add_show_from_set is not. This means that as part of the
work marking strings, some code changes will have to be made.
(Or something else will have to be done; for instance you could
imagine an ugly implementation where we have a translated sed
expression that is run over the `set' help text. I don't know
whether that would actually work though.)
Exactly what changes will need to be made is hard to predict.
If you look at the ChangeLog-9899 for bfd/gas/ld/..., you can see
what I had to do for those. Generally speaking it isn't very much,
just reworking computed printf()s and stuff like that. For
instance, something like this isn't safe:
    printf ("No such %s %s\n", is_var ? "variable" : "macro", name);
That's because in other languages things may need different
orderings, or different endings in different contexts, etc. So here
you'd typically rewrite this as an `if' and two `printf's.
There's a long section on this in the gettext manual that explains
it well.
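
To make the `if' and two `printf's rewrite concrete, here is a rough
sketch of how the unsafe example might end up. The function name
format_missing and the use of snprintf into a caller-supplied buffer
are my own choices for illustration; the same shape works with plain
printf. The point is only that each complete sentence becomes its own
translatable string.

```c
#include <libintl.h>
#include <stdio.h>

#define _(String) gettext (String)

/* Hypothetical helper: format a "not found" message into BUF.
   Each branch carries a whole sentence, so a translator can reorder
   words or change endings freely in either one.  Returns whatever
   snprintf returns.  */
int
format_missing (char *buf, size_t size, int is_var, const char *name)
{
  if (is_var)
    return snprintf (buf, size, _("No such variable %s\n"), name);
  else
    return snprintf (buf, size, _("No such macro %s\n"), name);
}
```

xgettext then sees two full messages, "No such variable %s\n" and
"No such macro %s\n", instead of one template plus two bare nouns
that can't be translated in isolation.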
Maintenance-wise there are also some things to be aware of.
Only the first one is difficult.
* When new strings are added, they must be marked.
This is a coding style change and unfortunately isn't really
automatable.
* You have to regenerate the primary message catalog whenever strings
are added, changed, or deleted.
* You have to periodically send the master catalog to the appropriate
site so that translation teams can work on it. I know some projects
(Gnome) try to do a string freeze before a release so that the
translation teams can catch up.
* You have to periodically incorporate new translation catalogs from
the translation teams into the source repository.
A couple of other ways gdb could be made i18n-aware, plus my comments:
* gdb could be aware of the encoding of strings in the inferior, and
convert to the correct output encoding when printing. In some
situations this might be nice. I don't plan to look at this.
* In theory gdb commands could be translated. However, I think there
is some consensus that in general translating commands and arguments
is too difficult in practice. I don't think anybody is doing this,
so for now I think gdb should not either. (For instance: program
names and command-line arguments aren't translated, Emacs function
names aren't translated.)
Tom