This is the mail archive of the xsl-list@mulberrytech.com mailing list.



RE: keys and idrefs - XSLT2 request?


DPawson@rnib.org.uk wrote:
> Subject: RE: [xsl] keys and idrefs - XSLT2 request?
> > From: Joerg Pietschmann 
> > IMHO it's unnecessary to incorporate a special handling of
> > IDREFS. At first, it would only work if there is a DTD/Schema
> > and the document is verified against it.
> 
> Without the DTD they are a string, not IDREFS.

That's what I'm saying: you'll have to read a DTD/Schema to get
the benefit of special handling. IMHO, with the huge DTDs/Schemas
now often in use, this may be a rather big overhead for a
comparatively small gain.

> > Last but not least, there is not really a big need for using
> > IDREFS. You can easily design a structure which will carry
> > the same information while being much more amenable to XSL
> > processing 
> 
> So now we restrict the use of XML to that which our tools
> can process? That doesn't sound very logical to me.
> 
You got me a bit wrong. As you certainly know, XML derives
at least part of its success from having shaved off a lot
of the complexity of SGML, which in turn made XML parsers
faster, cheaper, more robust and, in general, a commodity.

I'm proposing to drop even more complexity from the standards
instead of piling it up. The latter will also make the software
more complex, more expensive, late to market and buggy.

Let me recapitulate the important concepts in XML and related
standards and the problems they solve (from my point of view):

1. The concept of having structured data, in the form of a tree
  structure, and a corresponding serialization onto a character
  or byte stream or document.
  This solves the problem of getting structured data from one
  machine to another, as much as TCP/IP allows us to get bytes
  from one machine to another.
  The concept can also be used to persistently store such a
  tree.
  Apart from the serialization, there are also representations of
  the data as an event stream (SAX) or a static tree (DOM).
  We have corresponding APIs to read serialized XML data and quite
  a few parsers implementing those APIs. Surprisingly enough, we
  appear to lack serializers apart from the serializers built into
  XSL processors. This is partly due to historical reasons (SGML
  was mainly intended to be authored manually), and partly because
  output control appears to be an integral part of XSL and there
  is no real pressure to move it into a standard in its own right.
  Indeed, the easiest way to serialize an in-memory XML document
  is to run an XSLT-processor with the identity transformation.
  We should rather enhance the existing SAX and DOM APIs with
  serialization-specific data and methods (controlling method,
  character set, indentation, CDATA, disable-output-escaping etc.)
  so that serializers could be made truly stand-alone.
2. The concept of having a description for the structure of the
  data, possibly adding universally used primitive data types.
  This is solved both by DTDs and XSchema, the latter using the
  same concepts for representing the description as for representing
  the data itself, thereby avoiding the complexity of dealing with
  a second representation concept, as DTDs do.
  The description allows standard software to check an XML document
  against a very commonly used class of rules, the structure definition.
3. The concept of hierarchical addressing of subsets of the data.
  This is the basic XPath ("/foo/bar[@stuff='mumble']"). This
  somewhat resembles what hierarchical databases already
  implemented decades ago. Of course, XML+XPath is much more
  powerful (except maybe for performance).
4. The concept of looking up subsets of the data defined by a rule.
  This is xsl:key plus key(). This is more reminiscent of relational
  databases, for a good reason.
5. The concept of transforming one data structure into another. This
  is what the bulk of XSLT actually does. Such a transformation
  may more or less keep the semantics, or it may change them completely.
6. The concept of assigning various semantics to the data. There are
  enough standards proposing vocabularies; however, some are more
  general than others. There are also many overlapping developments.
The concepts up to 5 are quite universal and build on each other.
That's why we see standard software implementing them. What we don't
need is to enhance them willy-nilly without taking care to
keep the interfaces between them simple and slender.
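
To make concepts 1 and 3 concrete, here is a minimal sketch in
Python using the standard library's xml.etree.ElementTree. The sample
document is made up around the XPath example above, and ElementTree
supports only a subset of XPath, so take this as an illustration
under those assumptions, not as a statement about any particular
XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are made up
# to match the "/foo/bar[@stuff='mumble']" example.
SAMPLE = "<foo><bar stuff='mumble'>one</bar><bar stuff='other'>two</bar></foo>"

# Concept 1: parse the serialized form into an in-memory tree.
root = ET.fromstring(SAMPLE)

# Concept 3: hierarchical addressing. ElementTree paths are relative to
# the element they are evaluated on, so "/foo/bar[...]" becomes
# "./bar[...]" when evaluated on the <foo> root element.
hits = root.findall("./bar[@stuff='mumble']")
selected_text = [el.text for el in hits]

# Concept 1 again: serialize the tree back onto a character stream,
# here without involving an XSLT processor at all.
serialized = ET.tostring(root, encoding="unicode")
```

Here the serializer is part of the tree API, which is roughly the
stand-alone arrangement argued for under point 1.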

It could be argued that the ID/IDREF/IDREFS combo is a datatype, a
reference, therefore it should be included into concept 2 and the
following concepts should get the tools to deal with it. I beg
to disagree. First, the reference concept is also incorporated in
concept 4, which is of course broader than ID/IDREFS. The
difference is where it is defined what actually is a "reference".
There is nothing which would forbid introducing the concept of
keys in XSchema, and an XSLT program should be able to read it
from there. Of course, we still need xsl:key for
- ad-hoc semantics and references not expressed in a schema
- processing XML without an attached schema definition
- technical keys (Muenchian grouping etc.)
Nevertheless, the concept of having keys is independent of the
concept of transformations, and it deserves its own standard and API,
like XPath. If designed properly, we could get rid of ID/IDREFS.
Another point I don't like is that the "white space separated
list of ids" actually defines a language, though admittedly a
very simple one. If it's handled at all, it should be handled at
the parser level, instead of at higher levels during XPath processing.
This introduces the concept of attributes having a list as value,
a novelty I don't think is worth having. It's better to reuse
concepts already used (and needed) elsewhere, and to keep the
software simpler.
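
As a rough illustration of the keys-versus-IDREFS point: without a
DTD the attribute is only a string, yet a key-like index plus a
whitespace split recovers the references. A minimal Python sketch
follows. The element and attribute names (item, see, refs) are
hypothetical, and the dictionary merely plays the role of xsl:key; it
is not how any XSLT processor is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with no DTD: "id" and "refs" are plain strings.
SAMPLE = """
<doc>
  <item id="a">Alpha</item>
  <item id="b">Beta</item>
  <item id="c">Gamma</item>
  <see refs="a  c"/>
</doc>
"""

root = ET.fromstring(SAMPLE)

# Rough analogue of <xsl:key name="by-id" match="item" use="@id"/>:
# an index from key value to the matching element.
by_id = {el.get("id"): el for el in root.iter("item")}

def resolve(refs):
    """Split a whitespace separated list of ids and look each one up.

    Tokens without a matching entry are silently skipped.
    """
    return [by_id[token] for token in refs.split() if token in by_id]

# Resolve the references carried by the <see> element.
targets = [el.text for el in resolve(root.find("see").get("refs"))]
```

The split on whitespace is the "very simple language" mentioned
above; in this sketch it lives in application code rather than in the
parser.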

Oh well, I forgot to mention the namespace concept and what
problem it solves. Anybody care to fit it in? :-)

Regards
J.Pietschmann

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

