This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.
[docbook-apps] Syntax coloring of programlisting
- From: Frans Englich <frans dot englich at telia dot com>
- To: docbook-apps at lists dot oasis-open dot org
- Date: Mon, 8 Nov 2004 18:08:15 +0000
Hello all,
I'm thinking about doing syntax highlighting of example code, typically found
in DocBook's programlisting[1] element. I've never seen this in DocBook, only
in web forums and such. Does anyone have any experience with this, or know of
existing solutions?
I looked around a bit, and the simplest, most portable solution appeared to
me to be to write templates which mark up the code; XSLT 2.0 has pretty
good support for text tokenization compared to 1.0, judging from my shallow
reading of what's new.
The customization layer would add an attribute ("code" or
"programming-language"?) to the programlisting tag, naming the programming
language of its content, so that the appropriate templates could be invoked.
Alternatively, the condition[2] attribute could be used.
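
As a rough illustration of the dispatch idea (in Python rather than XSLT, purely
to keep the sketch short), the language could be looked up from the attribute
and the listing left alone when nothing matches. The attribute name
"programming-language" and the keyword tables are assumptions made for this
example, not part of DocBook:

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword tables; real ones would be far larger.
KEYWORDS = {
    "c": {"int", "return", "if", "else", "while"},
    "python": {"def", "return", "if", "else", "while"},
}


def listing_language(listing):
    """Return the known language named on a programlisting, or None.

    Checks the hypothetical "programming-language" attribute first and
    then the common "condition" attribute. None means "no highlighting",
    i.e. fall back to the current behavior.
    """
    for attr in ("programming-language", "condition"):
        value = listing.get(attr)
        if value and value.lower() in KEYWORDS:
            return value.lower()
    return None


doc = ET.fromstring(
    "<section>"
    '<programlisting programming-language="C">int x;</programlisting>'
    '<programlisting condition="python">def f(): pass</programlisting>'
    "<programlisting>plain text</programlisting>"
    "</section>"
)
languages = [listing_language(p) for p in doc.iter("programlisting")]
```

A listing without a recognized language simply yields None here, which is
where a stylesheet would apply its ordinary programlisting template.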
The template -- with analyze-string, tokenize, and friends -- would then
detect keywords and mark them up as appropriate. This "tokenization" could be
made just as advanced as a compiler's, but I think it would be /sufficient/ to
detect strings and comments (so that their contents are skipped) and then
simply do keyword matching, so that the template doesn't have to be aware of
the language's context.
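
A minimal sketch of that "strings and comments first, then keywords" approach,
written in Python instead of XSLT 2.0 and assuming C-like comment and string
syntax. The keyword set, the span/class output, and the function name are
illustrative assumptions only:

```python
import re
from xml.sax.saxutils import escape

# Illustrative subset of keywords for a C-like language (an assumption).
KEYWORDS = {"int", "return", "if", "else", "while", "for"}

# Comments and strings are listed before words, so keywords that occur
# inside them are consumed as part of the comment or string.
TOKEN = re.compile(
    r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)     # block or line comment
  | (?P<string>"(?:\\.|[^"\\\n])*")     # double-quoted string with escapes
  | (?P<word>[A-Za-z_]\w*)              # identifier or keyword
    """,
    re.VERBOSE | re.DOTALL,
)


def highlight(code):
    """Wrap comments, strings and keywords in XHTML spans."""
    out = []
    pos = 0
    for m in TOKEN.finditer(code):
        out.append(escape(code[pos:m.start()]))  # text between tokens
        text = escape(m.group())
        kind = m.lastgroup
        if kind == "word":
            if m.group() in KEYWORDS:
                out.append('<span class="keyword">%s</span>' % text)
            else:
                out.append(text)  # ordinary identifier, left unmarked
        else:
            out.append('<span class="%s">%s</span>' % (kind, text))
        pos = m.end()
    out.append(escape(code[pos:]))
    return "".join(out)


result = highlight('int x = 1; // if\nchar *s = "return";')
```

In this input only "int" is marked as a keyword; the "if" inside the comment
and the "return" inside the string are left alone, without the code knowing
anything else about the language. Character literals and other corner cases
are not handled.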
The functionality would require XSLT 2.0, but backwards compatibility would
easily be provided by falling back to current behavior.
/What/ to mark up to is a good question. One possibility is DocBook's
semantic tags (varname, function, parameter, etc.), which has advantages and
disadvantages:
* It would require multiple passes. I have no idea how well that would work
with the DocBook XSLs.
* Assuming the point above isn't too much work, it would be simple. It
wouldn't be necessary to write anything extra, since the ordinary DocBook
templates would take over once the DocBook markup was generated.
* On the other hand, it could lead to too little control, since it wouldn't be
possible to distinguish elements in a programlisting from those in paras. For
example, in a programlisting it can be of interest to have a very colorful,
editor-like markup, while it would be annoying if function and variable names
in running text were colored in the same way. However, this could be adjusted
by making the templates for the various DocBook tags detect whether they are
children of a programlisting, and then generate different class attributes
(with the CSS following accordingly).
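
The adjustment described in the last point can be sketched as follows, again
in Python for brevity. The "listing-" class prefix and the function name are
made up for this example; the only idea taken from above is that the same
element gets a different class depending on whether it sits inside a
programlisting:

```python
import xml.etree.ElementTree as ET

SEMANTIC = {"varname", "function", "parameter"}


def css_classes(elem, in_listing=False):
    """Yield (text, class) for each semantic element at or below elem.

    Elements inside a programlisting get a hypothetical "listing-" prefix,
    so the CSS can color them differently from those in running text.
    """
    in_listing = in_listing or elem.tag == "programlisting"
    if elem.tag in SEMANTIC:
        prefix = "listing-" if in_listing else ""
        yield (elem.text or "", prefix + elem.tag)
    for child in elem:
        yield from css_classes(child, in_listing)


doc = ET.fromstring(
    "<section>"
    "<para>Call <function>open</function> first.</para>"
    "<programlisting><function>open</function>"
    "(<varname>path</varname>);</programlisting>"
    "</section>"
)
classes = list(css_classes(doc))
```

Here the function name in the para ends up with class "function", while the
same element in the listing gets "listing-function".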
Another alternative, instead of generating DocBook elements, is to go directly
to XHTML/FO output. This would mean perhaps a little bit more work, and an
explicit separation of variables, functions, and so forth, between
programlisting and other situations (perhaps using class attributes is better,
since then it can be made configurable).
Perhaps the appropriate method is determined by how cumbersome multi-pass is to
achieve.
Another alternative is to bring in programs: KWrite, KDE's text editor, can be
told to export a document with the syntax markup it applies, and an XInclude
fragment identifier could then select the parts from the resulting XHTML
Strict document. Another possibility is to use gcc's XML output extension,
and do XSLTs on top of that. But it brings in dependencies, is ugly, and is way
too complex, IMO.
I discuss this as if it would be implemented in the DocBook XSLs, because that
can't hurt; the functionality is of interest to many people. If anyone has
ideas about or corrections to my rambling thoughts, I would read them with
interest. I personally have no real intention of writing it, except perhaps as
a project to learn XSLT 2.0, whenever that happens.
But then again, perhaps there already exist solutions?
Cheers,
Frans
1.
In my mail, my mention of programlisting includes programlistingco, because
I think it is relatively trivial to write templates which put them under one
roof.
2.
A common attribute, available for free use, such that one doesn't have to
modify Docbook.
http://www.docbook.org/tdg/en/html/ref-elements.html#common.attributes