This is the mail archive of the
gdb@sourceware.org
mailing list for the GDB project.
Re: printing wchar_t*
> From: Vladimir Prus <ghost@cs.msu.su>
> Date: Fri, 14 Apr 2006 10:01:57 +0400
>
> > What character set is used by the wide characters in the wchar_t
> > arrays? GDB has some support for a few single-byte character sets,
> > see the node "Character Sets" in the manual.
>
> A relatively safe bet would be to assume it's some zero-terminated character
> set. I plan to assume it's either UTF-16 or UTF-32 in the GUI (the
> conversion code is the same for both encodings), but gdb can just print raw
> values.
We should get our terminology right: UTF-16 is not a character set,
it's an encoding (and a multibyte encoding, btw). As for UTF-32, I
don't think such a beast exists at all.
I think you meant 16-bit Unicode characters (a.k.a. the BMP) and
32-bit Unicode characters, respectively.
> > It's one possibility, the other one being to call a function in the
> > debuggee to produce the string.
>
> And what will such a function return? char* in the local 8-bit encoding? In
> that case, not all wchar_t* variables can be printed.
If you want to display non-ASCII strings, it means you already have
some way of displaying such characters. The function I mentioned
would not return anything, it would actually _display_ the string.
For example, in the command-line version of GDB, if the terminal supports
UTF-8 encoded characters, that function would output a UTF-8 encoding
of the non-ASCII string, and then the terminal will display it with
the correct glyphs.
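
To make the idea concrete, here is a minimal sketch of such a debuggee-side
function. The names utf8_encode and print_wide_utf8 are hypothetical, not
part of GDB or of any library. It also assumes that each wchar_t element
holds one complete Unicode code point; where wchar_t is 16 bits wide,
surrogate pairs would have to be combined first, which this sketch does not
attempt (it prints '?' for them).

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Encode one code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if cp cannot be encoded. */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* lone surrogate, not a character */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                       /* beyond the Unicode range */
}

/* Hypothetical helper meant to be called from the debugger: it writes
   the zero-terminated wide string s to stdout as UTF-8 and returns
   nothing.  Assumes one code point per wchar_t element. */
void print_wide_utf8(const wchar_t *s)
{
    unsigned char buf[4];

    for (; *s != L'\0'; s++) {
        size_t n = utf8_encode((unsigned long)*s, buf);
        if (n == 0)
            fputc('?', stdout);     /* placeholder for unencodable values */
        else
            fwrite(buf, 1, n, stdout);
    }
    fputc('\n', stdout);
    fflush(stdout);                 /* make the text appear immediately */
}
```

If the program being debugged is linked with something like this, then
"call print_wide_utf8(str)" at the GDB prompt, with str being a wchar_t*
variable in the current frame, would send the UTF-8 bytes to the
debuggee's standard output, which is normally the same terminal.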
> > Yet another possibility is to do the
> > conversion in your GUI front end.
>
> That's what I'm going to do, but first I need to get raw data, preferably
> without issuing an MI command for every single character.
A wchar_t string is just an array, and GDB already has a feature to
produce N elements of an array. In the CLI, you say "print *array@20" to
print the first 20 elements of the named array.
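
The element count need not be guessed. As a rough sketch, assuming you can
add a small helper to the program being debugged, the front end could ask
for the length once and then fetch the whole string in a single command.
The names wide_count and sample_message are hypothetical and exist only
for this illustration.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical sample data to inspect from the debugger. */
const wchar_t *sample_message = L"hello";

/* Hypothetical helper: number of wchar_t elements in s, counting the
   terminating zero, i.e. the N to use in "print *s@N". */
size_t wide_count(const wchar_t *s)
{
    return wcslen(s) + 1;
}

/* Possible session (commands only):
     (gdb) print wide_count(sample_message)
     (gdb) print *sample_message@6
   The first command calls the helper in the debuggee; the second uses
   the value it reported, which is 6 for this sample string. */
```

This costs two commands per string instead of one per character.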