This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.


[docbook] Re: whitespace at the beginning and the end of element content


/ Wolfgang Jeltsch <wolfgang@jeltsch.net> was heard to say:
| On Sunday, 31 October 2004 17:45, Norman Walsh wrote:
[...]
| Oh, that's bad news.  How do I format the source code of a longer paragraph 
| then?  This way?
|
|     <para>This is an attempt
|       to format a longer paragraph
|       without getting problems
|       with whitespace.</para>

Well, more like this:

<para>This is an attempt
to format a longer paragraph
without getting problems
with whitespace.</para>

| But this looks ugly, IMO.  The way I formatted the paragraph in my previous 
| mail (see above) seems much more natural to me.

More natural, perhaps, but those extra spaces are in your document.

| And even if I format the paragraph without whitespace after the start tag and 
| before the end tag, how can I be sure that linebreaks and the spaces used for 
| indenting don't appear in the output?

I don't know of any processing system that doesn't treat a newline like
a space (outside of verbatim environments, etc.), so the line breaks are
OK. The indents, though, are going to be in your content.

Now, for HTML, it doesn't matter (runs of extra spaces are collapsed
when HTML is rendered), and the same may be true for FO; I haven't
checked.
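
To make that concrete, here is a minimal sketch in Python using the
standard library's xml.etree.ElementTree. It parses the two paragraph
layouts from this thread and shows what text each one delivers. The
collapse() helper is an illustration only, not part of any DocBook
tool; it approximates HTML-style collapsing, and it also trims leading
and trailing whitespace, which a real renderer handles by its own rules.

```python
import xml.etree.ElementTree as ET

# The indented layout from the original question.
indented = """<para>This is an attempt
      to format a longer paragraph
      without getting problems
      with whitespace.</para>"""

# The flush-left layout suggested in the reply.
flush = """<para>This is an attempt
to format a longer paragraph
without getting problems
with whitespace.</para>"""


def collapse(text):
    """Hypothetical helper: replace every run of whitespace with one space.

    This only approximates what an HTML renderer does with ordinary text.
    """
    return " ".join(text.split())


indented_text = ET.fromstring(indented).text
flush_text = ET.fromstring(flush).text

# The parser passes the indentation through as part of the content.
print(repr(indented_text))
print(repr(flush_text))

# After collapsing, the two layouts are indistinguishable.
print(collapse(indented_text) == collapse(flush_text))
```

The first repr() shows the six indenting spaces after each newline; the
second shows only the newlines. Whether the difference survives into the
output depends on whether the downstream tool collapses whitespace.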

| Section 2.10 of "Extensible Markup Language (XML) 1.0 (Third Edition)" is very 
| vague.  It speaks about white space that is used "to set apart the markup for 
| greater readability".  It says about this kind of whitespace: "Such white 
| space is typically not intended for inclusion in the delivered version of the 
| document."
|
| But who decides which whitespace shall be considered whitespace used to set 
| apart the markup?  Is whitespace appearing immediately after a start tag or 
| immediately before an end tag considered such whitespace or not?  Does the 
| answer to this question depend on the document type?

Yes. In "element content" (elements whose content model allows only
child elements, no text) whitespace is insignificant. In mixed content
(anywhere text can appear) it is significant (to the XML parser).
Different applications have different rules, of course.
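
As a sketch of that distinction: Python's xml.etree.ElementTree is a
non-validating parser that does not consult a content model, so it
reports all whitespace, including the whitespace between child elements.
The application then has to decide what to drop. In the example below,
the ELEMENT_CONTENT set and the strip_element_content_whitespace()
function are hypothetical; the set stands in for what a DTD would tell
a validating parser, and listing only "section" is an assumption made
for this example.

```python
import xml.etree.ElementTree as ET

doc = """<section>
  <title>Whitespace</title>
  <para>Mixed  content <emphasis>keeps</emphasis> its spaces.</para>
</section>"""

# Assumption for this sketch: elements that may contain only child
# elements. A real application would take this from the DTD or schema.
ELEMENT_CONTENT = {"section"}


def strip_element_content_whitespace(elem):
    """Drop whitespace-only text between the children of element-content
    elements; leave text in mixed content untouched."""
    if elem.tag in ELEMENT_CONTENT:
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    for child in elem:
        strip_element_content_whitespace(child)


root = ET.fromstring(doc)
before = ET.tostring(root, encoding="unicode")

strip_element_content_whitespace(root)
after = ET.tostring(root, encoding="unicode")

print(before)
print(after)
```

After the call, the newlines and indents between <section>, <title>,
and <para> are gone, while the double space and the spaces around
<emphasis> inside the paragraph are still there.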

| And what do they mean by the "delivered version"?  Do they mean something 
| like a PDF file or do they mean another XML file which is produced from the 
| original XML file by doing things like stripping whitespace?

That paragraph of the spec is a little informal. I think it means any
interpretation of the document.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com>      | Everything should be made as
http://www.oasis-open.org/docbook/ | simple as possible, but no simpler.
Chair, DocBook Technical Committee |
