This is the mail archive of the archer@sourceware.org mailing list for the Archer project.



Re: Python pretty-printers and non-ASCII strings do not play well together :-(


On Tue, Nov 4, 2008 at 4:58 PM, Tom Tromey <tromey@redhat.com> wrote:
> Tom> What should happen here, though?  The string contains invalid
> Tom> characters for its declared (via set target-charset) encoding.
>
> Paul> As an end-user, I would expect something like
> Paul>   $2 = <"\xef\xcd\xab">
>
> It occurs to me I am not completely certain where this error
> originates.  My theory is that it is the call to PyUnicode_Decode in
> valpy_str.
>
> If so, then we aren't seeing a value representation problem, which is
> what I was worried about.  Instead, I think common_val_print is
> emitting a string which is not actually valid according to
> host_charset.  That seems wrong.
>
> We could work around this in valpy_str, I suppose.  But I'm curious to
> know why this is happening -- why isn't common_val_print printing the
> escape sequences itself?
>
> My guess is that the target and host charsets are the same, and
> charset.c is passing characters through without checking them for
> validity.  I didn't debug it, but when I set host-charset to ASCII (my
> target-charset is ISO-8859-1), I do see the escapes.
>
> Every time I look at this stuff I'm reminded that the gdb charset code
> could use a good scrubbing.  For example, the default host charset
> ought to come from the locale settings.  I have a patch to implement
> this, but there's no point submitting it since it breaks gdb on
> typical Linux systems -- most people use UTF-8 locales, but gdb
> doesn't handle UTF-8.
>
> Maybe we should just install a smart Python printer for 'char *' ;-)

It seems(!) like the right solution is to make gdb Unicode-aware.  It
might mean going with UTF-8 internally and only converting at the
boundaries; I don't know.
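
For illustration only, here is a rough Python sketch of what "converting
at the boundaries" could look like for a single string: decode the raw
target bytes with the declared charset and render any byte that is not
valid in that charset as a \xNN escape, which is roughly the output Paul
asked for above.  The function name escape_invalid is made up for this
example and is not part of gdb or its Python API; it relies only on the
standard "backslashreplace" error handler of bytes.decode (Python 3.5 or
later).

```python
def escape_invalid(raw: bytes, charset: str = "ascii") -> str:
    """Decode raw bytes with the given charset.

    Bytes that are not valid in that charset are kept as literal
    \\xNN escape text instead of raising UnicodeDecodeError.
    escape_invalid is a hypothetical helper, not a gdb function.
    """
    return raw.decode(charset, errors="backslashreplace")


if __name__ == "__main__":
    # The three bytes from the example earlier in the thread, decoded
    # as ASCII; each byte is invalid there and comes out as an escape.
    print('<"%s">' % escape_invalid(b"\xef\xcd\xab", "ascii"))
```

Note that the result depends on the charset: bytes that are invalid in
ASCII may still form valid sequences in another encoding, so the escapes
only appear where the declared charset really rejects the input.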

