This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.
Re: doc domain vs. problem domain semantics

From: Norman Walsh <ndw at nwalsh dot com>
To: "Matt G." <matt_g_ at hotmail dot com>
Cc: docbook at lists dot oasis-open dot org
Date: Thu, 03 Jan 2002 12:23:12 -0500
Subject: DOCBOOK: Re: doc domain vs. problem domain semantics
References: <F4dMogttO0M8xQGBESO0001708f@hotmail.com>
/ "Matt G." <matt_g_@hotmail.com> was heard to say:
>>From: Norman Walsh <ndw@nwalsh.com>
>>/ "Matt G." <matt_g_@hotmail.com> was heard to whine:
>>I'm not sure I understand what you're trying to say.
[...]
| the two former.  What I was saying is that "tag abuse", as you called
| it, effectively ruins the semantics of the abused tag.  So, a

A few observations:

1. The semantics in question are only "ruined" for the document(s) in
which tag abuse occurs.

2. You can use the role attribute to distinguish your "abusive" uses
from "real" uses and thereby avoid ruining anything irreparably.

3. The DocBook Technical Committee (TC) is actively maintaining
DocBook. If you have a construct for which there is no suitable tag,
and the problem domain you are working in is not too far afield,
chances are the TC will address the issue.

4. If you need a new element and either can't wait for the TC to
consider it, or if the TC rejects your use case for some reason, you
can always add it yourself.
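
As an illustration of point 2, here is a minimal sketch of how a processing
tool might tell "real" uses of an element apart from role-marked ones. The
sample document, the role value "gene", and the helper name split_by_role
are all made up for this example; only the role attribute itself comes from
DocBook.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second <literal> is an "abusive" use that
# announces itself with role="gene" (an invented role value).
DOC = """<para>Run <literal>make</literal> to build; the
<literal role="gene">BRCA1</literal> entry is unrelated.</para>"""


def split_by_role(xml_text, tag):
    """Return (plain, by_role) for all `tag` elements in xml_text.

    plain   -- text of elements that carry no role attribute
    by_role -- dict mapping each role value to the texts that carry it
    """
    plain, by_role = [], {}
    for el in ET.fromstring(xml_text).iter(tag):
        role = el.get("role")
        if role is None:
            plain.append(el.text)
        else:
            by_role.setdefault(role, []).append(el.text)
    return plain, by_role


plain, by_role = split_by_role(DOC, "literal")
print(plain)    # texts of the unmarked <literal> elements
print(by_role)  # texts grouped by role value
```

Because the marked uses can always be separated out again like this, the
original semantics of the element remain recoverable.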

>>The entire design of DocBook is geared to make it possible for
>>you to write customization layers that provide the exact markup
>>that you need.
>
| Right, but do you think DocBook is rich enough to serve as an
| intermediate format for most types of publications, without resorting
| to tag abuse?  In other words, are its document domain semantics
| sufficiently rich to provide all the structural constructs most
| documents need?

1. For computer software and hardware documentation: yes.

2. For the vast majority of "ordinary" documents (the things people
write about outside of any specific community): qualified yes. 

3. For communities other than computer software and hardware
documentation: maybe.

4. For *all* communities outside of computer documentation: not a
chance.

| If not, do you consider this goal to be realistic?  If you do, then
| how far off the mark do you consider DocBook to be?

For 1 and 2, I think it's realistic. For 3 and 4, I don't.

| Where would you draw the line; for what types of publications could
| the document domain semantics of DocBook (or a spiffed up version) be
| used, as an intermediate format?

Intermediate between what and what?

| Textbooks?  Newspapers?  Magazines?
| (The latter two are really collections of documents, of course.)  How
| would you characterize the dichotomy between document structures that
| are (or would be) supported and those that aren't?  For example, it's
| true that some magazines are awfully layout-oriented, but if DocBook
| (or some derivative format) isn't suitable for authoring them, why
| not?  Where does the real problem lie?

I think we can divide the problem in half, at least at this
juncture.

On the one hand, there are publications for which vastly more
presentational information is required (layout-driven magazine
publication, for example). I don't think DocBook should go there. If
you really wanted to keep structure and presentation separate, and
let's say you wanted to use DocBook for the structural part, my "off
the top of my head" solution would be to design a new vocabulary for
describing highly detailed presentational semantics and then point
from that document back into the "semantic" DocBook document.

On the other hand, there are publications that have highly detailed
semantic constructs that aren't used in computer software and hardware
documentation. I wouldn't be at all surprised if the medical
bioinformatics community has semantic constructs that wouldn't fit
cleanly into DocBook. I think if we tried to make DocBook the kitchen
sink of semantic markup, we'd end up with 2000 elements and the whole
enterprise would collapse under its own weight.

My recommendation, if you want to use DocBook in another community,
would be that you find a few other people in that community that share
your interest and design the semantic constructs that you need. Then
make a customization layer of DocBook that discards the things you
don't need and add the things you do. (I'd be happy to participate, at
least as an observer, in such a process.)

>> >>| More importantly (in the
>> >>| short-term) it doesn't even appear to be nested, at all, in
>> >>| the DSSSL print style-sheets (version 1.74b - the latest).
>>
>>I've lost the beginning of this thread, what doesn't appear nested?
>
| variablelists.  They don't nest properly, with DSSSL print
| style-sheets (version 1.74b), using the TeX backend & OpenJade v1.3.
| I suppose I should whip up an example and submit a bug report.

Yes, please. I suspect this is going to turn out to be a JadeTeX bug,
but I could be wrong.

>>| So, is there really no desire to augment it to be better suited
>>| for more general documentation tasks and more easily adaptable
>>| to other sorts of problem domains than HW/SW?
>>
>>There are thousands of things that we could add that would
>>ideally suit the needs of one community or another.  DocBook
>>could be extended to provide structures suitable for medical
>>publishing, for legal publishing, for automotive manufacturing
>>publishing, etc. ad infinitum.
>
| See, that's exactly *not* what I'm talking about.  I'm wondering how
| suitable of a *foundation* (for layering or augmentation) you think
| DocBook is or could be, so *others* could leverage much of the work
| done on DocBook and many of the existing (and future) tools.

Oh. Well. Uhm. Maybe I should have read the whole message carefully
before I started replying. :-) Ignore half of what I just said. :-)

I think DocBook is a pretty good foundation. I expect it could be
refactored a little more cleanly if this was a primary design goal.
Since we (the designers) have only been working in one community, I
expect we haven't kept that separation very cleanly.

| I agree (with you and those people).  Though I disagree with the
| approach of Simplified DocBook (possibly because it's intended to
| solve some problems I'm not concerned with).  I think a more
| appropriate solution would be to partition the elements into a
| document domain group, and a number of different problem
| domain-specific groups (e.g. publishing meta, program sourcecode doc,
| program usage doc, hw/sw concepts, and misc.).  Put them in separate
| schemas, and maybe even namespaces.  Also, document them in separate
| groups.

Looking at this pragmatically, I observe that what you're suggesting
would be *a lot* of work and it wouldn't directly benefit DocBook's
principal community. That isn't a good reason not to
do it, but it does mean that I want to wait until there's at least one
other community that would directly benefit from this exercise.

| However, here's a suggestion: rather than simply structuring it that
| way, internally, why not do one or both of the following:
| 	* Document it that way, rather than just lumping all the
| 	  elements together
| 	* provide a release of the DTD and/or stylesheets without
| 	  any of the HW/SW-specific stuff.

I tell you what. If you take the list of elements in DocBook and
divide them into those two groups: foundational and HW/SW-specific,
post your division to the list, and see if there's any disagreement,
and if we (the readers and posters on the list) can reach a mutual
understanding of where the dividing line is, I'll consider it.

I think you'll find 100 elements in the former category, 100 in the
latter, and about 100 that no one can agree on.

| Huh?  What do you mean by "included fragments"?  You mean like the
| 'fileref' attribute of <imagedata> instances?  That's an example of
| what I think it'd be nice to use a command-line XPath or XQuery tool
| to collect.  I'll probably just end up writing an XSLT script to do
| it, though (obviously, a separate means would be necessary to collect
| entity references, unless XSLT 2.0 includes this info).

I often use tools to extract bits of files or preprocess files to
produce something I can include in my document. For example, this
Makefile rule extracts a fragment of addrbook-old.xml and produces
address.1 which I include in my source document.

  address.1: addrbook-old.xml
             xinclude -d -x "/*/address[1]" $< $@

A tool that notices that mydoc.xml depends on address.1 isn't very
useful (IMHO). And I can't think of any way to encapsulate the rule
above in my document for an automatic tool to extract.
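
For contrast, the simpler case raised in the quoted text, collecting the
fileref attributes of imagedata elements, is easy to automate. The following
is only a sketch: the sample document and the function name image_filerefs
are invented for illustration, and it sees neither entity declarations nor
generated fragments such as address.1.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with two image references.
DOC = """<article>
  <mediaobject><imageobject>
    <imagedata fileref="fig/a.png"/>
  </imageobject></mediaobject>
  <mediaobject><imageobject>
    <imagedata fileref="fig/b.png"/>
  </imageobject></mediaobject>
</article>"""


def image_filerefs(xml_text):
    """Return the fileref values of all imagedata elements, in document order."""
    root = ET.fromstring(xml_text)
    return [el.get("fileref")
            for el in root.iter("imagedata")
            if el.get("fileref") is not None]


print(image_filerefs(DOC))
```

This finds files the document names directly; it says nothing about how a
file like address.1 is produced, which is the part that lives in the Makefile.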

| resolution.  There's no way I want to be forced to maintain a separate
| list of locations for each entity I'm using in my document.

If -I would find it for you, why do you have to maintain it by hand?

Actually, I think I need an example, I'm not sure what you're looking
for.

| Why is OpenJade's '-D' (which works like '-I', for most C
| preprocessors) a bad way to go?  I think it's the best tradeoff
| between control, ease of use, and low maintenance burden, for my
| purposes.  I just wish Xalan supported it.

From a purist point of view, system identifiers in SGML aren't URIs.
In XML they are URIs and URIs are made absolute with respect to the
appropriate base URI in a standard way. So <xxx href="foo.xml"/>
always resolves to <xxx href="scheme:///path/foo.xml"/>. Searching for
foo.xml on a path is a bit of a stretch.
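
The standard resolution step can be sketched with Python's urllib.parse.
The base URI below is a hypothetical example, as is the helper name resolve;
the point is only that a relative reference has exactly one absolute form
for a given base, with no search path involved.

```python
from urllib.parse import urljoin

# Hypothetical base URI of the document that contains the reference.
BASE = "http://example.org/docs/book.xml"


def resolve(reference, base=BASE):
    """Make a (possibly relative) URI reference absolute against base."""
    return urljoin(base, reference)


print(resolve("foo.xml"))            # sibling of book.xml
print(resolve("../shared/foo.xml"))  # one directory up, then shared/
```

A -I or -D style lookup would instead try several directories in turn, so
the same "foo.xml" could name different resources depending on how the tool
was invoked.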

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com>      | Allow children to be happy in
http://www.oasis-open.org/docbook/ | their own way, for what better way
Chair, DocBook Technical Committee | will they ever find?--Dr. Johnson