This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: GDB/XMI (XML Machine Interface)


Felix Lee <felix.1@canids.net>:
> Bob Rossi <bob@brasko.net>:
> >    1. Have to write a parser. (regex, recursive descent)
> >       BTW, I guarantee the parser will have to be updated with every
> >       release of GDB.
> so far, I haven't found that xml is any less work than that, and
> it usually feels like a lot more work, but I haven't used xml for
> anything substantial yet, so it may just be unfamiliarity.

here's some elaboration.  this is what I think about xml parsers
today.  please correct me if I'm wrong.

there are two types of xml parsers, stream-based and tree-based.

using an xml stream parser is equivalent to writing a recursive
descent parser.  the stream parser basically just handles the
'tokenization' aspect of parsing xml (which is complicated by
considerations like character encoding, etc.)

to read data with an xml stream parser, you have to write
handlers that match the structure of the data you're parsing,
which is not any simpler than writing a recursive descent parser
for some other tree-like data format.
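
as a rough illustration, here is a minimal sketch using python's
xml.sax module.  the document and its element names (breakpoints,
breakpoint, file, line) are made up for this example and are not
any real gdb output format.  note how the handler has to track
where it is in the document by hand, much as a recursive descent
parser would:

```python
import xml.sax

# Hypothetical document; element names are invented for illustration.
DOC = b"""<breakpoints>
  <breakpoint number="1"><file>main.c</file><line>42</line></breakpoint>
  <breakpoint number="2"><file>util.c</file><line>7</line></breakpoint>
</breakpoints>"""


class BreakpointHandler(xml.sax.ContentHandler):
    """Collects breakpoints; the handler logic mirrors the document structure."""

    def __init__(self):
        super().__init__()
        self.path = []         # stack of currently open element names
        self.text = []         # character data seen since the last start tag
        self.current = None    # breakpoint record being built
        self.breakpoints = []

    def startElement(self, name, attrs):
        self.path.append(name)
        self.text = []
        if self.path == ["breakpoints", "breakpoint"]:
            self.current = {"number": int(attrs["number"])}

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate it.
        self.text.append(content)

    def endElement(self, name):
        if self.path == ["breakpoints", "breakpoint", "file"]:
            self.current["file"] = "".join(self.text).strip()
        elif self.path == ["breakpoints", "breakpoint", "line"]:
            self.current["line"] = int("".join(self.text))
        elif self.path == ["breakpoints", "breakpoint"]:
            self.breakpoints.append(self.current)
            self.current = None
        self.path.pop()


def parse_breakpoints(data):
    handler = BreakpointHandler()
    xml.sax.parseString(data, handler)
    return handler.breakpoints
```

the library takes care of tokenizing and character encoding, but
every path comparison in the handler is knowledge about the data
layout that you still have to write yourself.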

using an xml tree parser is complicated by xml's origin as a
markup language, which introduces issues that aren't particularly
relevant to data representation, but can't easily be ignored.

something like perl's XML::Simple tries to hide the messy details
and give you a natural data structure that corresponds to an xml
document, but there are a few problems that make XML::Simple
unsuitable for data that isn't "simple".

using a more general xml tree parser is harder.  in order to
access the data you want, you either have to walk the document
tree yourself (which is similar to writing a recursive descent
parser) or use XPath expressions to locate items in the tree
(which is similar to using regexps).
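
to make the two options concrete, here is a small sketch with
python's xml.etree.ElementTree, which supports only a limited
subset of XPath.  the document and its element names are
hypothetical, chosen just for this example:

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are invented for illustration.
DOC = """<breakpoints>
  <breakpoint number="1"><file>main.c</file><line>42</line></breakpoint>
  <breakpoint number="2"><file>util.c</file><line>7</line></breakpoint>
</breakpoints>"""


def files_by_walking(root):
    """Walk the tree by hand: one loop per level of nesting."""
    files = []
    for child in root:
        if child.tag == "breakpoint":
            for field in child:
                if field.tag == "file":
                    files.append(field.text)
    return files


def files_by_path(root):
    """Use a path expression: terse, pattern-like, and just as schema-specific."""
    return [elem.text for elem in root.findall("./breakpoint/file")]


def file_of_breakpoint(root, number):
    """Select one breakpoint by attribute value with a path predicate."""
    elem = root.find("./breakpoint[@number='%d']/file" % number)
    return None if elem is None else elem.text
```

both functions return the same list for this document.  the path
version is shorter, but the path string encodes the same
assumptions about nesting that the loops do.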

xml tree parsers also have the disadvantage of needing a lot of
memory.  the estimates are 10x to 30x the size of the xml
document, which puzzles me.  it's not clear to me why you'd need
more than about 2x.  (actually I'd expect more like 0.8x since
xml is redundantly verbose.)

with either stream parsers or tree parsers, if an xml schema
changes, you have to revise your code, unless the change is
careful to make only backward-compatible extensions.
guaranteeing that is hard for nontrivial changes, so people often
screw it up, or they play it safe and define a new schema.  in
either case, old code will often require updating anyway.
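
a small sketch of the difference, again with made-up element
names and python's xml.etree.ElementTree.  V2 is a
backward-compatible extension (it only adds an element), while V3
restructures the data, and a reader written against V1 no longer
finds what it expects:

```python
import xml.etree.ElementTree as ET

# Three hypothetical revisions of the same record (names are invented).
V1 = '<breakpoint number="1"><file>main.c</file><line>42</line></breakpoint>'
# V2 only adds an element, so old readers can ignore it.
V2 = ('<breakpoint number="1"><file>main.c</file><line>42</line>'
      '<enabled>y</enabled></breakpoint>')
# V3 moves existing elements under a new parent, which breaks old readers.
V3 = ('<breakpoint number="1"><location><file>main.c</file>'
      '<line>42</line></location></breakpoint>')


def read_location(doc):
    """Reader written against V1: expects file and line as direct children."""
    bp = ET.fromstring(doc)
    file_elem = bp.find("file")
    line_elem = bp.find("line")
    if file_elem is None or line_elem is None:
        return None  # layout is not what this reader was written for
    return (file_elem.text, int(line_elem.text))
```

read_location works unchanged on V1 and V2 because it ignores
elements it doesn't know about, but it returns None for V3 and
would need to be revised.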
--

