This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.



[docbook-apps] Syntax coloring of programlisting


Hello all,

I'm thinking about doing syntax highlighting of example code, typically found 
in DocBook's programlisting[1] element. I've never seen this in DocBook, only 
in web forums and such. Does anyone have experience with this, or know of 
existing solutions? 

I looked around a bit, and the simplest, most portable solution appeared to 
me to be writing templates that mark up the code; XSLT 2.0 has pretty 
good support for text tokenization, judging from my shallow reading of what's 
new compared to 1.0.

The customization layer would add an attribute ("code"? "programming-language"?) 
to the programlisting tag, stating which programming language its 
content is written in, so that the appropriate templates could be invoked. 
Alternatively, the condition[2] attribute could be used.
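
To make the attribute idea concrete, here is a minimal sketch in Python 
(not XSLT, only to illustrate the dispatch). The keyword tables and the 
function name keyword_table are hypothetical, and "programming-language" is 
just one of the attribute names floated above, not an existing DocBook 
attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword tables, keyed by the attribute value.
KEYWORD_TABLES = {
    "c": {"int", "return", "if", "else"},
    "python": {"def", "return", "if", "else"},
}

def keyword_table(listing, attribute="programming-language"):
    """Return the keyword set for a programlisting element.

    Returns None when the attribute is missing or names an unknown
    language, so the caller can fall back to plain, unhighlighted output.
    """
    return KEYWORD_TABLES.get(listing.get(attribute))

listing = ET.fromstring(
    '<programlisting programming-language="c">int x;</programlisting>')
table = keyword_table(listing)
```

Passing attribute="condition" would give the variant that reuses the common 
attribute instead of adding a new one.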

The template, using analyze-string, tokenize, and friends, would then 
detect keywords and mark them up as appropriate. This "tokenization" could be 
made as advanced as a compiler's, but I think it would be /sufficient/ to 
detect strings and comments (so that they can be skipped) and then simply do 
keyword matching, so that the template doesn't have to be aware of the 
language's context.
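
As a rough illustration of that "skip strings and comments, then match 
keywords" approach, here is a small Python sketch using regular expressions 
(an XSLT 2.0 version would presumably use analyze-string in a similar way). 
The keyword set, the C-like comment and string syntax, and the function name 
tokenize are assumptions made for the example only.

```python
import re

# Hypothetical keyword set for a C-like language.
KEYWORDS = {"int", "return", "if", "else", "while", "for"}

# Comments and strings are tried before identifiers, so keywords that
# appear inside them are consumed as part of the comment or string.
TOKEN_RE = re.compile(r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<word>[A-Za-z_]\w*)
''', re.VERBOSE | re.DOTALL)

def tokenize(code):
    """Yield (kind, text) pairs that together cover the whole input.

    kind is "comment", "string", "keyword", or "text".
    """
    pos = 0
    for match in TOKEN_RE.finditer(code):
        if match.start() > pos:
            # Anything the pattern skipped (spaces, operators, digits).
            yield ("text", code[pos:match.start()])
        kind = match.lastgroup
        if kind == "word":
            kind = "keyword" if match.group() in KEYWORDS else "text"
        yield (kind, match.group())
        pos = match.end()
    if pos < len(code):
        yield ("text", code[pos:])

tokens = list(tokenize('int x = 1; // return int\nchar *s = "if";'))
```

No knowledge of the language's grammar is needed; the only per-language 
inputs are the keyword list and the comment/string syntax.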

The functionality would require XSLT 2.0, but backwards compatibility could 
easily be provided by falling back to the current behavior.

/What/ to mark up to is a good question. One possibility is DocBook's 
semantic tags (varname, function, parameter, etc.), which has advantages and 
disadvantages:

* It would require multiple passes. I have no idea how well that would work 
with the DocBook XSLs.

* Assuming the point above isn't too much work, it would be simple. It 
wouldn't be necessary to write anything extra, since the ordinary DocBook 
templates would take over once the DocBook markup had been generated.

* On the other hand, it could give too little control, since it wouldn't be 
possible to distinguish elements in a programlisting from those in paras. For 
example, in a programlisting it can be of interest to have very colorful, 
editor-like markup, while it would be annoying if function and variable names 
in running text were colored the same way. However, this could be adjusted by 
making the templates for the various DocBook tags detect whether they are 
inside a programlisting, and then generate different class attributes (with 
the CSS following suit).
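
The last point, choosing a class depending on whether an element sits inside 
a programlisting, could look roughly like this. It is a Python sketch rather 
than a DocBook XSL template, and the "listing-" and "inline-" class prefixes 
and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Elements whose styling should depend on where they occur.
STYLED = {"varname", "function", "parameter"}

def css_class(element_name, inside_listing):
    """Pick a class name; the prefixes are hypothetical."""
    prefix = "listing-" if inside_listing else "inline-"
    return prefix + element_name

def classes(root):
    """Return (tag, text, class) for every styled element, in document order."""
    found = []

    def walk(node, inside):
        # Once inside a programlisting, all descendants count as inside.
        inside = inside or node.tag == "programlisting"
        if node.tag in STYLED:
            found.append((node.tag, node.text, css_class(node.tag, inside)))
        for child in node:
            walk(child, inside)

    walk(root, False)
    return found

doc = ET.fromstring(
    "<section><para>Call <function>main</function>.</para>"
    "<programlisting><function>main</function>()</programlisting></section>")
result = classes(doc)
```

The stylesheet could then color .listing-function loudly and leave 
.inline-function alone.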

Another alternative, instead of generating DocBook elements, is to go directly 
to XHTML/FO output. This would mean perhaps a little more work, and an 
explicit separation of variables, functions, and so forth between 
programlisting and other situations (perhaps using class attributes is 
better, since then it can be made configurable).
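
For the direct-to-XHTML route, a sketch of the output step might look like 
the following Python. It assumes the highlighter has already split the 
listing into (kind, text) pairs; that input shape, the "hl-" class prefix, 
and the function name to_xhtml are hypothetical choices for the example.

```python
from html import escape

def to_xhtml(tokens):
    """Render (kind, text) pairs as a pre element with classed spans.

    Plain "text" tokens are only escaped; every other kind is wrapped in
    a span whose class is derived from the kind, so colors stay in CSS.
    """
    parts = []
    for kind, text in tokens:
        if kind == "text":
            parts.append(escape(text))
        else:
            parts.append('<span class="hl-%s">%s</span>' % (kind, escape(text)))
    return '<pre class="programlisting">' + "".join(parts) + "</pre>"

markup = to_xhtml([
    ("keyword", "if"),
    ("text", " (a < b) "),
    ("comment", "// done"),
])
```

Because only class names are emitted, the colors stay configurable through 
the stylesheet rather than being fixed in the templates.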

Perhaps the appropriate method is determined by how cumbersome multi-pass 
processing is to achieve.

Another alternative is to bring in external programs: KWrite, KDE's text 
editor, can be told to export a document with the syntax markup it applies, 
and an XInclude fragment identifier could then select the parts from the 
resulting XHTML Strict document. Another possibility is to use gcc's XML 
output extension and run XSLTs on top of that. But that brings in 
dependencies, is ugly, and is way too complex, IMO.


I discuss this as if it would be implemented in the DocBook XSLs, because 
that can't hurt; the functionality is of interest to many people. If anyone 
has ideas about, or corrections to, my rambling thoughts, I would read them 
with interest. I personally have no real intention of writing it, except 
perhaps as a project to learn XSLT 2.0, whenever that happens.

But then again, perhaps there already exist solutions?


Cheers,

		Frans


1.
In this mail, my mentions of programlisting include programlistingco, because 
I think it is relatively trivial to write templates that put them under one 
roof.

2.
A common attribute, available for free use, such that one doesn't have to 
modify Docbook.
http://www.docbook.org/tdg/en/html/ref-elements.html#common.attributes

