This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Defining an 'appropriate' DTD for XSLT


The following question is very general.
It concerns how to define an 'appropriate' DTD, i.e. one that
makes the XML easy to handle with XSLT rules.

More concretely, it concerns whether to use elements
or attributes, i.e. whether to encode some information
as 'content' (1) or as the value of an attribute (2), e.g.

(1) <word>London</word>
(2) <input word="London"/>

Of course, this decision cannot be made without some
more information about the XML that we produce:

The XML describes the analysis of a sentence, i.e.
the sentence is split up into words, and
every word carries syntactic and semantic information.

An example of the description of the word "New York"
is given here (the exact meaning of the elements/attributes is not
important here):

...
<token>
  <snlp category="pnoun" number="singular" gender="neutrum"/>
  <surface.form>New York</surface.form>
  <normalized.form>New_York</normalized.form>
  <domlex attribute="location" subattribute="city" value="New_York"/>
</token>
....


Basically, the task of the XSLT rules will be to check whether
the sentence contains tokens with
specific values/contents and, if so,
to output other specific values/contents.

So, for example, I might want to
- check whether there is a domlex[@attribute='location'] that also
  has @subattribute='city' and, if so,
- output its value attribute (domlex/@value), or, in another case,
- output the content of <surface.form>
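
To make this concrete, here is a minimal sketch of the kind of rule I have
in mind. It assumes XSLT 1.0 and plain-text output, and it follows the
example token above, where 'city' is the value of subattribute. The second
rule, which prints <surface.form> for every other token, is only an
illustration of the "other case" and not a fixed requirement.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit every token, wherever it sits in the document. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//token"/>
  </xsl:template>

  <!-- A token with a domlex marked as location/city:
       output the value attribute of that same domlex. -->
  <xsl:template
      match="token[domlex[@attribute='location'][@subattribute='city']]">
    <xsl:value-of
        select="domlex[@attribute='location'][@subattribute='city']/@value"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Assumed fallback for all other tokens: output the surface form.
       This pattern has lower default priority than the one above. -->
  <xsl:template match="token">
    <xsl:value-of select="surface.form"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that the two output steps look almost the same here: one selects an
attribute (@value), the other an element (surface.form).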


So this is our context. As you can see from the example,
the DTD contains (i) some information encoded as "content"
and (ii) some information encoded as values of attributes.

Usually, though, the XSLT rules don't need to output the information
that is encoded as "content", i.e. we have to prevent this content
from leaking into the XSL output anyway.
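
For reference, the leaking comes from the built-in template rule for text
nodes, which copies them to the output. Attribute values do not leak in
the same way, because the built-in rule for elements applies templates to
child nodes only, and attributes are not children. A minimal sketch of the
usual override, assuming XSLT 1.0 (the explicit rule for <surface.form> is
just an example of content we might still want):

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Override the built-in rule for text nodes: output nothing,
       so no element content leaks by default. -->
  <xsl:template match="text()"/>

  <!-- Content is written only where a rule asks for it explicitly. -->
  <xsl:template match="surface.form">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```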

So, here finally comes my question:
although I know that there are ways to prevent the leaking of content,
I was wondering whether, in this case, the DTD would actually
do better to encode the information as attribute-value pairs.
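
To illustrate the alternative, an attribute-only version of the token
above might look like the following. The two attribute names on <token>
are hypothetical and made up for this sketch:

```xml
<!-- Hypothetical attribute-only encoding of the same token;
     surface.form and normalized.form become attributes of <token>. -->
<token surface.form="New York" normalized.form="New_York">
  <snlp category="pnoun" number="singular" gender="neutrum"/>
  <domlex attribute="location" subattribute="city" value="New_York"/>
</token>
```

With such an encoding, the only text nodes left would be the whitespace
between elements.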


Does anybody have any ideas or experience with this kind of topic?
Thanks in any case.

Cheers,
Sabine




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

