This is the mail archive of the guile@cygnus.com mailing list for the guile project.



Documentation conventions refocusing...


I'm really excited to see the amount of discussion we've had about
generating documentation for scwm and guile.  I'd like to refocus the
conversation so that we can make some real progress.  We all agree that
there's a difference between a reference manual and a tutorial or other
higher-level user guide.

My initial motivation, which the scwm developers and guile developers
seem to share, is to improve the reference documentation of the APIs for
scwm and guile.  There are lots of arguments as to why documenting
primitives should be done as close to the implementation as possible
(ease of update, awareness of documentation, consideration of the API
when writing the primitive, etc.).  There are also many success stories
involving the approach of embedding documentation in the source code.
There is a consensus that this is a reasonable way to produce some
subset of the reference manual (a very important subset, and one that
is best produced early).

So now the technical question we must answer is:  What should that
source code markup look like and what conventions should we use?

There are two kinds of source we must annotate: C primitives and scheme
library procedures.  The former (to me the more important, since the
compiled code is less available to the end user) has an example format
in scwm/binding.c.  It is based on using SCWM_PROC, a macro that uses
guile's slightly less specific SCM_PROC.  An extraction tool (written in
perl for now--there is general agreement that it should instead be
written in guile ultimately) scans and extracts the comments (see scwm's
utilities/dev/extract-docs).
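
As a rough illustration of that scanning step, here is a minimal sketch
in Python.  It is not the actual perl extract-docs script, whose details
are not shown in this message.  It assumes that every documented
primitive is declared with SCWM_PROC(c_name, "scheme-name", req, opt,
rest, (formals)) immediately followed by a /** ... */ comment, as in the
unbind-key example later in this message.  The names PROC_RE,
extract_docs and SAMPLE are hypothetical.

```python
import re

# Assumed layout: SCWM_PROC(...) directly followed by a /** ... */ comment.
PROC_RE = re.compile(
    r'SCWM_PROC\s*\(\s*(\w+)\s*,\s*"([^"]+)"\s*,'  # C name, Scheme name
    r'\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,'         # required, optional, rest
    r'\s*\(([^)]*)\)\s*\)'                           # formal argument list
    r'\s*/\*\*(.*?)\*/',                             # stylized doc comment
    re.DOTALL,
)


def extract_docs(source):
    """Return one dict per documented primitive found in C source text."""
    entries = []
    for m in PROC_RE.finditer(source):
        # "SCM contexts" -> "contexts": keep only the parameter name.
        formals = [arg.split()[-1] for arg in m.group(6).split(",") if arg.strip()]
        # Collapse line breaks and indentation inside the comment.
        doc = " ".join(m.group(7).split())
        entries.append({
            "c_name": m.group(1),
            "scheme_name": m.group(2),
            "required": int(m.group(3)),
            "formals": formals,
            "doc": doc,
        })
    return entries


SAMPLE = '''
SCWM_PROC(unbind_key, "unbind-key", 2, 0, 0,
          (SCM contexts, SCM key))
     /** Remove any bindings attached to KEY in given CONTEXTS.
See also `bind-key'. */
{
  /* an ordinary comment, ignored by the extractor */
}
'''

if __name__ == "__main__":
    for entry in extract_docs(SAMPLE):
        print(entry["scheme_name"], entry["formals"])
        print(entry["doc"])
```

Only comments opened with /** and attached to a SCWM_PROC are picked up,
so ordinary /* ... */ comments in the function body are left alone.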

There are a couple of issues:

1) Should the doc string be a final string argument to the macro, or
should it just be a slightly-stylized comment?

I'm in favor of the latter, since the extraction is extra-linguistic,
and needn't use a language-level mechanism.  Jim made the good point
that a macro argument, like the rest of the macro's expansion, is
subject to normal conditional compilation, so if a feature were compiled
out of the binary, then the documentation would appropriately be elided
as well.  However,
it's easy to run the preprocessor w/o removing comments and still look
for the stylized comment, so I think if differing documentation for
alternate compile-time options is an issue, we can still handle it
either way.

Additionally, using comments permits what I called concept references;
my example was the specification of keystrokes --- the place to best put
that documentation is near the C function that parses key specifiers,
and that documentation could be linked to by all the primitives and
procedures that use a key-specifier argument (ultimately perhaps the
key-specifier should be a regular primitive anyway, so maybe the need
for separate "concept" comments on ordinary C functions will disappear
as some of our abstractions improve -- in this specific case, e.g., as
the new event model is written).

Another big advantage of comments is fewer quoting issues -- the only
special sequence is '*/'.  Strings can cause backslashitis-- ponder
documenting the regular expression engine in guile -- much harder if
you must worry about C's literal string quoting.
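
To make the quoting difference concrete, here is a small Python sketch
that renders the same documentation text both ways.  The helper names
as_c_string_literal and as_c_comment are hypothetical, and the escaping
covers only backslashes, double quotes and newlines, which is enough to
show the effect.

```python
def as_c_string_literal(text):
    """Quote text as a C string literal (backslash, double quote, newline only)."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return '"' + escaped + '"'


def as_c_comment(text):
    """Wrap text in a stylized /** ... */ comment; only '*/' is forbidden."""
    if "*/" in text:
        raise ValueError("documentation text may not contain '*/'")
    return "/** " + text + " */"


if __name__ == "__main__":
    doc = r'The regexp "a\.b" matches the three characters a.b only.'
    print(as_c_comment(doc))         # text appears exactly as written
    print(as_c_string_literal(doc))  # every backslash and quote is escaped
```

The comment form leaves the regular expression readable as typed, while
the string form doubles each backslash and escapes each double quote.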


2) Within the documentation, what should the markup look like?  

In particular, there has been some consensus to generate SGML
(specifically DocBook DTD) from the comments.  This seems like the right
thing to me, as there are others working on the ability to generate
various formats from the DocBook DTD, and a handful of useful formats
can already be produced.

The subquestion is: 

To what extent should conventions be used to reduce the amount of
by-hand tagging in the source comment?  

It was pointed out that JavaDoc gains a bit of leverage by knowing
something about the language.  Similarly, I think our documentation can
benefit from conventions biased for our application.  The goal here is
to make writing the documentation as simple and painless as possible,
anticipating that this will increase the quality and ease-of-maintenance
of the documentation.  Note that these conventions are just in the
embedded comments (or macro argument strings) and are understood by the
extraction tool which then writes pure DocBook SGML as output.

There are two special cases for which some conventions are absolutely
necessary: 1) formal arguments, and 2) references to other
primitives/procedures.  E.g., an all-uppercase identifier in the
documentation should refer to a formal parameter name (a warning can be
given if it does not) and it should be tagged appropriately; similarly,
a legal scheme identifier embedded in between ` and ' can be a reference
to another primitive/procedure.  Thus we can write:

SCWM_PROC(unbind_key, "unbind-key", 2, 0, 0,
          (SCM contexts, SCM key))
     /** Remove any bindings attached to KEY in given CONTEXTS.
CONTEXTS is a list of event-contexts (e.g., <code-example>'(button1
sidebar)</code-example>).  KEY is a string giving the key-specifier (e.g.,
<key-specifier>M-Delete</key-specifier> for META+Delete).
See also `bind-key'. */

Note how these two simple and unambiguous conventions reduce the number
of tags needed in the above example.  Where convention does not
make the logical content obvious, we need to use tags so that we
preserve as much information as possible.  I also suggest using \< to
expand into &lt;, and \& to expand into &amp; (instead of writing &lt;
and &amp; directly, which I think are a lot less readable).  We could
also forbid spaces from following `<' or `&', and if we see a space,
just take that to mean that
we want the literal character (since those characters will almost always
be separated from adjacent identifiers by spaces when they are used for
themselves instead of markup).
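
The conventions above could be expanded by the extraction tool roughly
as follows.  This is only a sketch in Python under several assumptions:
the function name to_docbook is hypothetical, formal parameters are
emitted as DocBook <parameter> elements and references as <function>
elements (the message does not fix which elements to use), and an
"all-uppercase identifier" is taken to mean two or more characters.

```python
import re

FORMAL_RE = re.compile(r"\b[A-Z][A-Z0-9_]+\b")  # all-uppercase word, 2+ chars
REF_RE = re.compile(r"`([^\s`']+)'")            # `scheme-identifier'


def to_docbook(doc, formals):
    """Expand the comment conventions; return (converted_text, warnings)."""
    warnings = []
    known = {name.upper() for name in formals}

    def tag_formal(match):
        word = match.group(0)
        if word in known:
            return "<parameter>" + word.lower() + "</parameter>"
        # Uppercase word that is not a formal: leave it alone, but warn.
        warnings.append("uppercase word %s is not a formal parameter" % word)
        return word

    def tag_ref(match):
        return "<function>" + match.group(1) + "</function>"

    # Explicit escapes: \& and \< stand for the literal characters.
    text = doc.replace("\\&", "&amp;").replace("\\<", "&lt;")
    # A bare & or < followed by whitespace is also taken literally.
    text = re.sub(r"&(?=\s)", "&amp;", text)
    text = re.sub(r"<(?=\s)", "&lt;", text)
    # Convention-based tagging of references and formal parameters.
    text = REF_RE.sub(tag_ref, text)
    text = FORMAL_RE.sub(tag_formal, text)
    return text, warnings


if __name__ == "__main__":
    sample = "Remove bindings attached to KEY in CONTEXTS. See also `bind-key'."
    converted, notes = to_docbook(sample, ["contexts", "key"])
    print(converted)
    print(notes)
```

Explicit tags already present in the comment, such as <key-specifier>,
pass through untouched because their `<' is not followed by a space.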




Though this summary is longer than I'd hoped, I think it is
imperative that we at least make this handful of design decisions very
soon -- and discuss any other design decisions that need to be made
before we can start writing the documentation.

Bottom line:  we need documentation, we need it yesterday, and I'd like
to see us move forward with at least the content of the
documentation ASAP!

Greg J. Badros
gjb@cs.washington.edu
Seattle, WA  USA
http://www.cs.washington.edu/homes/gjb