This is the mail archive of the kawa@sourceware.org mailing list for the Kawa project.



Encoding and unescaped-data


I'm running into a bit of unexpected behavior. I've been able to solve
most of my encoding problems with help from the list archives, but I'm
not sure yet how to get around this one.

My application loads a UTF-8 HTML file and simply sends it to the
server as unescaped-data. If I send it without wrapping it in
unescaped-data, all special characters are handled correctly, although
the HTML remains escaped, of course.

But all special characters seem to be sent as question marks if I use
unescaped-data.

I've attempted to distill the essence of the problem below. Instead of
loading a file, it builds the string directly from the UTF-8 bytes for
λ (206 187), which exhibits the same behavior.

--------

;; -*- scheme -*-

(define (bytes->string/utf-8 bytes)
  (<string> (<java.lang.String> bytes
                                0
                                bytes:length
                                "UTF-8")))

;; 206 187 (#xCE #xBB) is the UTF-8 encoding of λ (U+03BB).
(let* ((data (string-append "<b>"
                            (bytes->string/utf-8 (<byte[]> 206 187))
                            "</b>")))
  (values-append data ", " (unescaped-data data)))

#\newline

--------

The page source output is:

&lt;b&gt;&#x3BB;&lt;/b&gt;, <b>?</b>

But I expected:

&lt;b&gt;&#x3BB;&lt;/b&gt;, <b>&#x3BB;</b>

or maybe:

&lt;b&gt;&#x3BB;&lt;/b&gt;, <b>λ</b>
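
To check the two pieces in isolation, here is a minimal plain-Java sketch. The class and method names (EncodingDemo, decodeUtf8, roundTrip) are made up for illustration. The idea that the question mark comes from an encoding step using a charset that cannot represent λ is only an assumption, not something I have confirmed for unescaped-data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Decode raw bytes as UTF-8, like the bytes->string/utf-8 helper above.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, 0, bytes.length, StandardCharsets.UTF_8);
    }

    // Encode text with the given charset, then decode the bytes with the same charset.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for characters it cannot represent.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Java bytes are signed, so 206 and 187 need explicit casts.
        String lambda = decodeUtf8(new byte[] {(byte) 206, (byte) 187});

        // The two bytes decode to the single code point U+03BB.
        System.out.println(lambda.codePointAt(0) == 0x3BB);   // prints "true"

        // ISO-8859-1 has no λ, so the round trip loses it.
        System.out.println(
            roundTrip("<b>" + lambda + "</b>", StandardCharsets.ISO_8859_1)); // prints "<b>?</b>"

        // UTF-8 can represent λ, so the round trip is lossless.
        String html = "<b>" + lambda + "</b>";
        System.out.println(roundTrip(html, StandardCharsets.UTF_8).equals(html)); // prints "true"
    }
}
```

If that assumption holds, the string itself is fine and the character is lost when the unescaped text is written out, whereas the escaped path avoids the problem by emitting &#x3BB; in plain ASCII.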



Is there a way to get the desired output?

Thanks,
-- Daniel Terhorst
