This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.
[docbook-apps] DocBook XML example projects posted. One formatted with FrameMaker 7.0. One formatted with DocBook XSL.
- From: Steve Whitlatch <swhitlat at getnet dot net>
- To: docbook-apps at lists dot oasis-open dot org
- Date: Fri, 19 Mar 2004 20:23:24 -0700
- Subject: [docbook-apps] DocBook XML example projects posted. One formatted with FrameMaker 7.0. One formatted with DocBook XSL.
- Organization: Steve Whitlatch, Inc.
- Reply-to: swhitlat at getnet dot net
Hello,
I've used the DocBook XML DTD (version 4.2) and a valid DocBook XML
book to create an example project using FrameMaker 7.0. This project
and a DocBook XSL counterpart project using the DocBook XML 4.3RC3 DTD
(including a sample XSL customization layer) are available at:
http://www.getnet.net/~swhitlat/DocBook/docbook_section.html
The FrameMaker example project and the XSL-formatted example project
use the same DocBook XML book, except the XSL-formatted version uses
SVG versions of the vector graphics. The FrameMaker version uses EPS
graphics. Neither project is perfect. In fact both have bugs and
deficiencies. The XSL-xsltproc-FOP formatting project has glaring bugs.
Nonetheless, both projects make good learning material for
beginners. I hope that the bugs in each project can eventually be worked
out and I invite those who have solutions to send me corrections. Over
time, these example projects could become good examples of how to
correctly format DocBook XML books, one with DocBook XSL and open-source
tools and one with structured FrameMaker 7.0.
The FrameMaker project includes the original DocBook XML plus graphics,
a formatting template, the EDD, the rules file, the FrameMaker book file,
all FrameMaker book components, and the structapps.fm file in use on my
system. I cannot provide the custom import/export client or its code due
to Adobe copyrights. However, the custom client code is identical to the
code Adobe ships with its DocBook Starter Kit. So, if you have
FrameMaker 7.0, you already have the custom import/export client
(docbook.dll) and its source code.
The XSL-formatted project contains the XML plus graphics, a DocBook XSL
customization layer, and a tiny script I use to control the production
process.
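For readers who want a starting point, the XSL-xsltproc-FOP production
process can be sketched roughly as below. This is not the script shipped
with the project. The file names are hypothetical, and the FOP launcher
is assumed to be on the PATH as "fop" (older FOP releases ship it as
fop.sh or fop.bat).

```python
import subprocess


def build_commands(xml, xsl, fo, pdf):
    """Return the two command lines: XML -> XSL-FO, then XSL-FO -> PDF."""
    return [
        # xsltproc applies the customization layer and writes the FO file
        ["xsltproc", "--output", fo, xsl, xml],
        # FOP renders the FO file to PDF (launcher name is an assumption)
        ["fop", "-fo", fo, "-pdf", pdf],
    ]


def run_pipeline(xml, xsl, fo, pdf):
    """Run both steps in order, stopping at the first one that fails."""
    for cmd in build_commands(xml, xsl, fo, pdf):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("book.xml", "custom.xsl", "book.fo", "book.pdf")
would run both tools in sequence, provided they are installed.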
Notes from the XSL-formatted project are mixed in with the XSL, where they
are most useful. The XSL can be viewed online at:
http://www.getnet.net/~swhitlat/DocBook/XSL_Project_XSL.html
From the FrameMaker example project's README (long)
********************************************
Suggested use:
FrameMaker's XML DocBook Starter Kit provides files you can experiment
with, but it does not provide any XML to experiment with round-tripping
and it does not provide an example of how you would create a FrameMaker
book from an imported DocBook XML book. This example package provides
example material where the DocBook Starter Kit falls short. If you are
new to structured FrameMaker (I am), you might benefit by examining the
read/write rules, the EDD, or the structapps.fm file. You might experiment
with importing or exporting the XML (check the notes I provide below). Or,
you might try structured authoring in FrameMaker to see what that is
like using an already-defined, working structured application. I think
the most value one can gain by experimenting with this example project
comes through seeing structured FrameMaker's quirks and limitations. In
particular, it is useful to see 1) its DocBook XML import/export
limitations without a custom client other than the one provided by
Adobe; 2) the insane inefficiency of structured FrameMaker's overall
structured application development process (you must re-import the XML
and re-create the FrameMaker book every time you wish to see the effects
of new/changed element-paragraphFormat associations in the EDD); and
3) how FrameMaker-generated lists don't fit in with the round-tripped
XML. To see these things, you need a FrameMaker book produced from a
valid DocBook XML book. That is what this example project provides.
Notes:
I am just a beginner with structured FrameMaker. I have made every
effort to be accurate in the notes below; however, it is possible that
anything stated below could be wrong. In fact, I've said enough in
these notes that it's likely something I said is wrong. Please correct
me if I am wrong. I can be reached at swhitlat@getnet.net.
Creating an EDD from the DocBook XML DTD.
Pretty much just a point and click process. Afterwards, it was
necessary to change the FrameMaker element types for many of the
elements. I mostly just imitated what was done in the FrameMaker
DocBook Starter Kit's EDD. Note that when validating XML within
FrameMaker, FrameMaker does not adhere to exactly the same criteria
for determining validity as do most of the usual XML tools such as
xmllint, nsgmls, etc. When FrameMaker validates a document, it
validates it in FrameMaker's own special way against the document's
EDD. Depending on the modifications one makes to the EDD, a structured
FrameMaker document may not validate against its original DTD upon
export to XML. Thus, I made every effort to keep the EDD's structure
synchronized with the original DTD, which is simply not possible in
every case. For example, FrameMaker does not support nested tables,
so the "entrytbl" element had to be dropped; the "videoobject" element
had to be dropped; etc.
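Before importing, it may help to know whether a book actually uses any
of the elements the EDD had to drop. The following is a minimal sketch
using only the Python standard library. The set of dropped names
contains just the two elements named above and would need extending for
a real EDD.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Elements named above as dropped from the EDD; extend as needed.
DROPPED = {"entrytbl", "videoobject"}


def count_dropped(xml_text, dropped=DROPPED):
    """Count occurrences of each dropped element in a DocBook document."""
    root = ET.fromstring(xml_text)
    # DocBook 4.x elements are not namespaced, so tags are plain names.
    return Counter(el.tag for el in root.iter() if el.tag in dropped)
```

Note that this parser does not read an external DTD, so a document that
relies on entities declared only in the DTD would need a DTD-aware
parser instead.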
Inclusions/Exclusions.
FrameMaker EDDs allow the use of a content model concept not available
to XML or SGML DTDs called inclusions/exclusions. An "inclusion" allows
a child element to appear anywhere within a parent element, including
all of the parent element's child elements. An "exclusion" excludes an
element from appearing anywhere within some parent element, including
all of that element's child elements. I avoided inclusions/exclusions
because I expect they would cause trouble when attempting to
round-trip the XML.
Read/Write Rules.
As I did with the EDD, I imitated most of what I found in the DocBook
Starter Kit's rules file. Rules files do not accommodate
context-sensitive, fine-tuned import/export rules. For example, an
import/export rule that applies to a "title" element applies to all
"title" elements regardless of a "title" element's parents. Thus the
rule applies equally to a "title" element within a "table" element,
a "figure" element, a "book" element, a "chapter" element, etc. When
associating paragraphFormats with elements in the EDD, one can specify
context-sensitive rules, but not in the import/export rules file. To
use context-sensitive import/export rules, one must write a custom
client using the FrameMaker Developer's Kit (FDK) and the FrameMaker
Structured API (free download from Adobe).
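The difference between a name-only rule and a context-sensitive one can
be pictured with two lookup tables. This is only an analogy written in
Python, not FrameMaker rule or EDD syntax, and every format name in it
is made up for illustration.

```python
# Name-only lookup: every "title" gets the same answer, whatever its parent.
NAME_ONLY = {"title": "Title"}

# Context-sensitive lookup: the parent element is part of the key.
BY_CONTEXT = {
    ("book", "title"): "BookTitle",
    ("chapter", "title"): "ChapterTitle",
    ("figure", "title"): "FigureTitle",
    ("table", "title"): "TableTitle",
}


def name_only_format(parent, element):
    """Ignore the parent entirely, as a name-only rule does."""
    return NAME_ONLY.get(element)


def context_format(parent, element):
    """Prefer a (parent, element) match; fall back to the name-only rule."""
    return BY_CONTEXT.get((parent, element), NAME_ONLY.get(element))
```

With the first function, a "title" inside a "figure" and a "title"
inside a "chapter" are indistinguishable. With the second, each context
can get its own treatment.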
Books.
For each main component in a DocBook XML book, such as "chapter"
elements, the FrameMaker EDD allows one to specify a child element
"ValidHighestLevel". Upon import of DocBook XML, each DocBook
element corresponding with an EDD element that has a
"ValidHighestLevel" child element becomes a separate file in the
FrameMaker book. FrameMaker converts each element and all of that
element's child elements into a single FrameMaker file. DocBook "toc"
elements, "index" elements, and other DocBook elements that correspond
with FrameMaker generated lists are created as separate files, but
those files cannot be used as FrameMaker generated lists. FrameMaker
TOCs, Indexes, etc., must be generated manually in the usual FrameMaker
way, and they must be regenerated every time one re-imports the DocBook
XML. That is every time one wants to see the effects of new/changed
element-paragraphFormat associations in the EDD, because re-importing
is the only way to get the EDD formatting instructions to show in the
WYSIWYG windows.
Formatting, element-paragraphFormat associations.
Formatting can be applied in different ways. Elements can be
associated with FrameMaker paragraph formats in the EDD; elements can
inherit formatting from parent elements (specified with complex rules
in the EDD); elements can be individually formatted manually in
FrameMaker; any mix of the preceding can be used. As much as possible,
I chose to apply all formatting through element-paragraphFormat
associations in the EDD. The actual paragraph formats are defined in a
template file specified in the application definition in the
structapps.fm file.
Applying Formats.
To get the element-paragraphFormat associations defined in the EDD
to take effect, I had to continually re-import the XML and wait for
FrameMaker to convert the XML to a FrameMaker document (actually a
group of documents constituting a FrameMaker book). This means I had
to re-create the entire book in FrameMaker each time I wanted to see
what a format change in the EDD would look like in the formatted
document. There. I've said this twice now because it deserves being
said twice. It is a _very_ big negative for FrameMaker.
Generated Lists.
Generated lists are not automatically created upon DocBook XML import.
Creating them takes a bit of time, which would not be so bad if it
needed to be done only once. But that's not the way it is. While
developing import/export rules and customizing element-paragraphFormat
associations in the EDD, I found it necessary to re-import the XML in
order to effect formatting changes in the FrameMaker documents that
make up the book (said three times now). Each time the XML is
re-imported, FrameMaker creates a _new_ book. Any generated lists
(TOC, LOF, LOT, Index, etc.) must be manually re-created and included
in the book. Obviously, it's a good idea to leave generated lists and
anything else that needs manual formatting (such as, perhaps, a title
page) until the structured application is complete. But when is that?
It seems "completion" is always premature, and the structured
application development process is naturally iterative. One is
constrained to FrameMaker's structured application design methodology,
and so there is no way out of this pain when attempting to work with
a DocBook XML book in structured FrameMaker.
Custom clients.
Not all of FrameMaker's XML import/export behavior can be controlled
with just a rules file. I inspected some of the code Adobe provides
in the DocBook XML Starter Kit's custom client and recompiled it.
Adobe provides a Visual Studio project and workspace files along with
the source code for the custom client in the Starter Kit. All I had to
do to get it to compile was to include project/workspace paths to the
FDK header files and struct.lib. Using an Intel 7 compiler, the
resultant file, docbook.dll, was about 60% of the size of the docbook.dll
shipped with FrameMaker 7.0. Each source file contains a big,
restrictive copyright statement at its top. As for the code, it's not
easy to follow. The FrameMaker Developer Kit (FDK) uses an abstraction
layer wherein the usual data types (int, etc.) are redefined and
renamed. I expect this helps Adobe to use a single code base for
FrameMaker while allowing the code to compile on multiple platforms.
But as I'm already not a C programmer, the abstraction layer just made
the code more confusing.
Element order.
Without a custom client, the order in which the elements occur in the
XML is the order in which they appear in the formatted FrameMaker
documents. For example, to get a figure title to appear below the
figure in a formatted, structured FrameMaker document requires
programming a custom client using the FDK and the FrameMaker
Structure Import/Export API. Like I said, it's all downloadable from
the Adobe web site, including literally thousands of pages of
documentation and reference material.
Hiding element text.
To hide the text of some elements, I declared those elements in the EDD
to be FrameMaker "Marker" elements of type "CustomMarker". Then I left
the "CustomMarker" name blank in the EDD.
Whitespace/Empty paragraphs.
I have no explanation for the few seemingly random uninvited
whitespaces that occasionally appear inline. Nor do I know why many
paragraphs appear with empty whitespace between themselves and the
next paragraph, while others do not. It may be something I have done,
or it may be something FrameMaker does, or both. But the XML is valid
upon input and the DocBook XSL stylesheets do not produce the strange,
unwanted whitespace. Whatever the case, correcting the formatting
required quite a bit of manual involvement. Sometimes, an empty
paragraph could be used for vertical spacing, but mostly I removed them.
SVG.
FrameMaker 7.0 rendered all my SVG graphics as low-quality raster
images. So, I converted the SVG graphics to EPS and used those instead.
Indexterms.
Upon exporting the XML out of FrameMaker, all <indexterm> opening tags
get closed. Usually I want something like:
<indexterm><primary>text text text</primary></indexterm>, but FrameMaker
outputs <indexterm/><primary>text text text</primary></indexterm>. Notice
that the opening tag is closed.
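One possible workaround is to repair the exported XML as text before
parsing it, since the broken form is not well-formed. The sketch below
is an assumption about how that might be done, not something the
project includes. It only rewrites a self-closed indexterm tag that is
immediately followed by a "primary" element, so a legitimately empty
indexterm is left alone.

```python
import re

# A self-closed <indexterm .../> followed directly by <primary ...>.
# Group 1 captures any attributes so they survive the rewrite.
_BROKEN_INDEXTERM = re.compile(
    r"<indexterm((?:\s[^<>]*?)?)\s*/>(?=\s*<primary\b)"
)


def repair_indexterms(text):
    """Turn '<indexterm/><primary>' back into '<indexterm><primary>'."""
    return _BROKEN_INDEXTERM.sub(r"<indexterm\1>", text)
```

The function works on the raw file contents, so it would be applied
after export and before any XML parser or validator sees the file.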
Xrefs.
After creating the EDD, I changed the FrameMaker element type for
xref from "Container" to "Object", deleted its content model, and added
an association with a default FrameMaker cross-ref paragraph format,
all of which is, I believe, consistent with the EDD for FrameMaker's
DocBook XML Starter Kit EDD. For import/export rules, I used the same
read/write rule for "xref" taken from the DocBook Starter Kit's rules
file:
*******************
element "xref"
{
is fm cross-reference element "xref";
attribute "linkend" is fm property cross-reference id;
}
******************
However, any default paragraphFormat for an xref is not always
appropriate, and I was unable to develop context-sensitive
element-paragraphFormat association rules for every case. So I changed
some xref formats manually. Also, and this may be because of something
I am doing wrong, when adding an xref element in FrameMaker, the
attribute editor would not allow me to enter a linkend attribute value
for any "xref" elements added manually. An "xref" element with no
"linkend" attribute makes a document invalid (against both the EDD and
the DocBook DTD). Using FrameMaker's Special>Cross-Reference command
automatically entered the xref target element's "ID" attribute value,
but the value was always entered in the FrameMaker "endterm" attribute,
and the "linkend" attribute value was always left blank. All attempts to
change a blank "linkend" value to its appropriate value resulted in
a FrameMaker popup stating that the attribute value should be changed
with the Special>Cross-Reference command. It was circular inanity. I
think the read/write rule for "linkend" in the rules file is correct.
The same problem also occurred whenever I tried to change the "linkend"
attribute value of an "xref" element imported in XML. Imported "xref"
elements worked OK otherwise, but the attributes could not be edited.
Upon examination of the C code in import.c (code in the FrameMaker
DocBook Starter Kit's DocBook XML custom client), which is the source
code responsible for translating DocBook elements and attributes to
FrameMaker elements and attributes, I could find no evidence of
"linkend" attribute values being copied to "endterm" attribute values.
I have no explanation to offer for any of FrameMaker's mysterious and
possibly buggy behavior with respect to xref elements. It appears to be
default buggy behavior that one is expected to override with a custom
client.
Public/System Identifiers.
To get FrameMaker to write a public identifier in output XML, I used
the following read/write rule:
*******
writer external dtd is public "-//OASIS//DTD DocBook XML V4.2//EN"
"/usr/share/docbook-xml42/docbookx.dtd";
*******
FrameMaker 7.0 cannot correctly write out a URL used as a system
identifier. For example,
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
is always changed to
"http:/www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
Note the missing forward slash.
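If a URL system identifier is needed, the damaged scheme separator can
be restored after export. The following is a minimal sketch, assuming
the only damage is the single missing slash shown above. It is best
applied to the DOCTYPE declaration alone rather than to the whole file,
since it would also rewrite any matching text in the document body.

```python
import re

# A URL scheme followed by exactly one slash instead of two.
_ONE_SLASH = re.compile(r"\b(https?|ftp):/(?!/)")


def repair_system_id(text):
    """Restore 'http://' where export left 'http:/'."""
    return _ONE_SLASH.sub(r"\1://", text)
```

Identifiers that already have both slashes pass through unchanged, so
the function is safe to run more than once.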
Upon XML export.
1) Lots of fairly ordinary ASCII characters (hyphens, etc.) were
changed to internal general entities.
2) "imagedata" element "fileref" attributes became external parameter
entities.
3) Each structured file in the FrameMaker book became an external
file referenced by an external parameter entity, but so did the
unstructured files (TOC, LOF, LOT, and Index). The unstructured
files had to be removed from the exported XML tree before
validating.
4) Many (all?) default attribute values not explicit in the input
XML were made explicit in the output XML.
I suppose all import/export behavior could be changed with a
"custom client," but that is like selling someone a piece of software
as "able to do _anything_ you want," and adding "you just have to do
some programming." What is this fabulous piece of software that is so
versatile it can do anything I want? Well, of course, it's a compiler!
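Item 4 in the list above can be checked mechanically by comparing the
attributes written in the input XML with those written in the exported
XML. The sketch below is one way to do that with the Python standard
library. It assumes both documents have the same element structure and
stops at the first point where they diverge.

```python
import xml.etree.ElementTree as ET


def added_attributes(before_xml, after_xml):
    """List (tag, [attribute names]) for attributes present only after export."""
    before = list(ET.fromstring(before_xml).iter())
    after = list(ET.fromstring(after_xml).iter())
    report = []
    # Walk both documents in document order, pairing elements by position.
    for old, new in zip(before, after):
        if old.tag != new.tag:
            break  # structure diverged; stop rather than mis-pair elements
        extra = sorted(set(new.attrib) - set(old.attrib))
        if extra:
            report.append((new.tag, extra))
    return report
```

Because this parser does not load the external DTD, it sees only the
attributes literally written in each file, which is what the comparison
needs.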
Table formatting.
In order to apply a FrameMaker table format to an imported table, the
FrameMaker "table" element in the EDD must be made FrameMaker element
type "Table". However, that causes conflicts with the content models
of child elements. So, the alternative is to leave the FrameMaker
element type for the "table" element at "Container", which is what
FrameMaker uses when creating an EDD from the DocBook XML DTD. The only
drawback is that FrameMaker cannot automatically apply table formats
to imported table elements. It's a minor inconvenience to manually
apply table formats.
Graphics.
I didn't bother to implement any image re-sizing or dpi adjustments
upon import, and I didn't attempt to control the size of the anchored
frames FrameMaker creates for each image upon import. It might help a
little, but I know very well that I almost always need to manually
adjust anchored frame and image sizes anyway.
Performance.
I'm running FrameMaker 7.0 on an IBM IntelliStation, dual CPUs at
733MHz, 1GB of PC 800 RDRAM, 15K SCSI, with a 32MB Nvidia card. The
machine is running Windows 2000 Server with all the patches and
updates. FrameMaker 5.5.6 runs nicely on this machine, as does almost
every program I have ever run on it. Nothing has ever bogged down,
except FrameMaker 7.0. Previous FrameMaker versions do not support
XML, and FrameMaker 7.0 is so slow on this machine it is nearly
unusable. Virtually any action in FrameMaker 7.0 uses 50% of the
maximum CPU cycles this machine can provide. Task Manager shows
FrameMaker using both CPUs, and performance does get a little better
if I shut down every other program, but even then FrameMaker 7.0's
performance is very bad. How bad? It is not uncommon for FrameMaker 7.0
to suck up 50% of max CPU for 30 to 60 seconds in response to just
switching between two open documents inside FrameMaker. Even with
FrameMaker the only program running, and with only one FrameMaker file
open, it still takes about 3 to 5 seconds to iconify or display an
already opened file. Like I said, nearly unusable. I watched a lot of
TV while waiting.
**************************************************************
Conclusion, Compare/Contrast.
*************************************************************
Based just on my experience, and at the risk of making everyone
mad, I must say that neither the FrameMaker solution nor the DocBook XSL
solution with FOP is yet good enough for use in a professional environment.
Or, perhaps I should say that _I_ would find both solutions painful.
Structured FrameMaker is big, slow, expensive, complicated, messy,
deficient, and buggy. All indications are that FrameMaker has not been
actively developed for several years. That is the type of decision upper
management would make upon realizing the impact of the design flaws
described above. Nonetheless, one can pretty much get exact formatting
of DocBook XML in structured FrameMaker if he is determined. And, I am told
that if one is a C magician, he can get FrameMaker to do just about anything
through the FrameMaker Developer's Kit and Structured FrameMaker API.
Structured FrameMaker may be suitable only for large organizations who
require exact formatting capabilities and those organizations that can
recoup the required investments in multiple custom clients and elaborate
work flow designs that compensate for structured FrameMaker's deficiencies.
Unfortunately, because of the flaws in structured FrameMaker's application
design methodology, using it as is will always be labor intensive.
With XSL, one need not be a C magician. But to have any reasonable
degree of control with formatting XML, one needs to be able to write
XSL code. The DocBook XSL stylesheets help a lot. Some parts of an XSL
customization layer are very easy to implement. But XSL is generally
difficult. FOP is buggy and deficient, but it has active support from an
open-source development community. The libxml2 tools work fine for me,
and they also have strong open-source development support. Because
XSL+friends has a future, learning XSL is a good investment. With respect
to the actual processing of the XML, the DocBook XSL solution is very quick
and straightforward. It includes an automated process for formatting TOCs,
LOTs, LOFs, Indexes, etc. Also, it is a joy to escape all of FrameMaker's
import/export problems because XSL uses the original XML. With XSL, no
import/export is required.
Steve Whitlatch