This is the mail archive of the guile@cygnus.com mailing list for the guile project.



Re: First-class environment proposal


Hi Jim,

Below are my suggestions and comments.  I use unified diff format
since there are some simple typo fixes, too.  My comments are prefixed
by ** at the start of lines.  This looks really good, though there are a
couple of places where I was a bit confused as to what was intended.

Greg

--- guile-env-proposal.orig	Tue Mar 16 08:58:34 1999
+++ guile-env-proposal	Tue Mar 16 09:50:52 1999
@@ -1,22 +1,22 @@
-$Id: guile-env-proposal,v 1.1 1999/03/15 21:57:59 gjb Exp $
+$Id: guile-env-proposal,v 1.1 1999/03/15 21:57:59 gjb Exp gjb $
 
    This is a draft proposal for a new datatype for representing
 top-level environments in Guile.  Upon completion, this proposal will
 be posted to the mailing list `guile@cygnus.com' for discussion,
 revised in light of whatever insights it may produce, and eventually
 implemented.
 
    Note that this is *not* a proposal for a module system; rather, it
-is a proposal for a data structure which encapsulates the ideas one
+is a proposal for a data structure which encapsulates the ideas one encounters
 when writing a module system, and, most importantly, a fixed interface
-which insulates the interpreter from the details of the module system.
+which insulates the interpreter from the details of a module system.
 Using these environments, one could implement any module system one
 pleased, without changing the interpreter.
 
    I hope this text will eventually become a chapter of the Guile
-manual; thus, the description of environments in written in the present
+manual; thus, the description of environments is written in the present
 tense, as if it were already implemented, not in the future tense.
 However, this text does not actually describe the present state of
 Guile.
 
    I'm especially interested in improving the vague, rambling
@@ -49,10 +49,13 @@
 
    When there is a C function which provides the same functionality as a
 primitive, but with a different interface tailored for C's needs, it
 usually has the same name as the primitive's C function, with the suffix
 `_internal'.  Thus, `scm_env_ref_internal' is almost identical to
+
+** I agree that `_internal' is a bad name for exported functions
+
 `scm_env_ref', except that it indicates an unbound variable in a manner
 friendlier to C code.
 
    Copyright 1999 Free Software Foundation, Inc.
 
@@ -67,19 +70,22 @@
 ========================
 
    Guile distinguishes between environments and modules.  A module is a
 unit of code sharing; it has a name, like `(math random)', an
 implementation (e.g., Scheme source code, a dynamically linked library,
-or a set of primitives built into Guile), and finally, an environment
-containing the definitions which the module exports for its users.
+or a set of primitives statically linked into Guile), and finally, an environment
+containing the definitions that the module exports for its users.
 
    An environment, by contrast, is simply an abstract data type
 representing a mapping from symbols onto variables which the Guile
 interpreter uses to look up top-level definitions.  The `eval'
 procedure interprets its first argument, an expression, in the context
 of its second argument, an environment.
 
+** This is a change in `eval', right?  (More like what `eval2' does, it seems)
+** maybe this should be documented, too.
+
    Guile uses environments to implement its module system.  A module
 created by loading Scheme code might be built from several environments.
 In addition to the environment of exported definitions, such a module
 might have an internal top-level environment, containing both exported
 and private definitions, and perhaps environments for imported
@@ -112,10 +118,15 @@
 user does not see the module's private definitions, and the module is
 unaffected by definitions the user makes at the prompt.  The
 `use-modules' form copies the module's public bindings into the user's
 environment.
 
+** This last sentence sounds like mechanism, not semantics; it seems
+** as though this could/should be done with a reference, too
+** (note from later: it is done with references sometimes, so this
+** is misleading).
+
    All Scheme evaluation takes place with respect to some top-level
 environment.  Just as the procedure created by a `lambda' form closes
 over any local scopes surrounding that form, it also closes over the
 surrounding top-level environment.  Thus, since the `string-match'
 procedure is defined in the `(ice-9 regex)' module, it closes over that
@@ -129,12 +140,12 @@
 misleading to extend the concept of a "current top-level environment"
 to the system as a whole.  Each procedure closes over its own top-level
 environment, in which that procedure will find bindings for its free
 variables.  Thus, the top-level environment in force at any given time
 depends on the procedure Guile happens to be executing.  The global
-"current" environment is a figment of the interaction loop's
-imagination.
+"current" environment is simply the top-level environment of the 
+interaction loop, and is in no way special.
 
    Since environments provide all the operations the Guile interpreter
 needs to evaluate code, they effectively insulate the interpreter from
 the details of the module system.  Without changing the interpreter, you
 can implement any module system you like, as long as its efforts produce
@@ -210,10 +221,13 @@
                       '()))
 
  - Libguile macro: int SCM_ENVP (OBJECT)
      Return non-zero iff OBJECT is an environment.
 
+** was not immediately clear to me that this was a C macro
+** maybe we need a better cue that you're talking about the C interface?
+
  - Libguile function: SCM scm_env_ref_internal (SCM ENV, SCM SYMBOL)
      This C function is identical to `env-ref', except that if SYMBOL
      is unbound in ENV, it returns the value `SCM_UNDEFINED', instead
      of signalling an error.
 
@@ -225,12 +239,19 @@
 
      where PREVIOUS is the value returned from the last call to
      `*PROC', or INIT for the first call.  If ENV contains no bindings,
      return INIT.
 
+** you don't mention the DATA argument at all
+** also, perhaps DATA should be the 4th argument instead of the 1st.
+
  - Libguile data type: scm_env_folder SCM (SCM DATA, SCM SYMBOL, SCM
           VALUE, SCM TAIL)
+
+** I don't like this syntax for a data type.  Perhaps show the whole
+** typedef.  It's ugly, but at least we're used to that ugliness.
+
      The type of a folding function to pass to `scm_env_fold_internal'.
 
 Changing Environments
 ---------------------
 
@@ -249,10 +270,17 @@
 simply because a binding cannot be changed via these functions does
 *not* imply that it is constant.  Mechanisms outside the scope of this
 section (say, re-loading a module's source code) may change a binding
 or value which is immutable via these functions.
 
+** Not completely clear what those functions would do; I thought these 
+** primitives were the only way to manipulate environments.  Are there
+** some file-related environment constructors/manipulators that can also
+** create and mutate environments?  That seems hard, given that those
+** things are going to be a part of the real (i.e., user-visible) module 
+** system.
+
  - Primitive: env-define ENV SYMBOL VALUE
      Bind SYMBOL to a new location containing VALUE in ENV.  If SYMBOL
      is already bound to another location in ENV, that binding is
      replaced.  The new binding and location are both mutable.  The
      return value is unspecified.
@@ -273,27 +301,35 @@
 
      If SYMBOL is not bound in ENV, signal an `env:unbound' error.  If
      ENV binds SYMBOL to an immutable location, signal an
      `env:immutable-location' error.
 
+** So this is what:  "(eval '(set! SYMBOL VALUE) ENV)" would do?
+
 Caching Environment Lookups
 ---------------------------
 
    Some applications refer to variables' values so frequently that the
 overhead of `env-ref' and `env-set!' is unacceptable.  For example,
 variable reference speed is a critical factor in the performance of the
 Guile interpreter itself.  If an application can tolerate some
-additional complexity, the `env-cell' function described here can
+additional complexity, the `env-cell' primitive described here can
 provide very efficient access to variable values.
 
    In Guile, most variables are represented by pairs; the CDR of the
 pair holds the variable's value.  Thus, a variable reference corresponds
+
+** What does the CAR hold?  Don't omit this detail.
+
 to taking the CDR of one of these pairs, and setting a variable
 corresponds to a `set-cdr!' operation.  A pair used to represent a
 variable's value in this manner is called a "value cell".  Value cells
 represent the "locations" to which environments bind symbols.
 
+** Perhaps move the primitives' specifications up to here (from below) 
+** before further discussion
+
    The `env-cell' function returns the value cell bound to a symbol.
 For example, an interpreter might make the call `(env-cell ENV SYMBOL
 #f)' to find the value cell which ENV binds to SYMBOL, and then use
 `cdr' and `set-cdr!' to reference and assign to that variable, instead
 of calling `env-ref' or ENV-SET! for each variable reference.
@@ -331,10 +367,13 @@
      Programs should therefore make separate calls to `env-cell' to
      obtain value cells for reference and for assignment.  It is
      incorrect for a program to call `env-cell' once to obtain a value
      cell, and then use that cell for both reference and mutation.
 
+** Why can't I call it once w/ FOR-WRITE == #t and then use it for
+** both reference and mutation?
+
  - Primitive: env-cell ENV SYMBOL FOR-WRITE
      Return the value cell which ENV binds to SYMBOL, or `#f' if the
      binding does not live in a value cell.
 
      The argument FOR-WRITE indicates whether the caller intends to
@@ -352,10 +391,14 @@
           int for_write)
      This C function is identical to `env-cell', except that if SYMBOL
      is unbound in ENV, it returns the value `SCM_UNDEFINED', instead
      of signalling an error.
 
+** Will it still signal env:immutable-location?  Seems bad to have to
+** deal with guile signals from C code.
+** also, same issue with _internal suffix.
+
    [[After we have some experience using this, we may find that we want
 to be able to explicitly ask questions like, "Is this variable mutable?"
 without the annoyance of error handling.  But maybe this is fine.]]
 
 Observing Changes to Environments
@@ -381,10 +424,16 @@
      This function returns an object, TOKEN, which you can pass to
      `env-unobserve' to remove PROC from the set of procedures
      observing ENV.  The type and value of TOKEN is unspecified.
 
  - Primitive: env-unobserve TOKEN
+
+** Perhaps env-unobserve should still take the ENV argument as the
+** first arg?  Though not necessary (TOKEN could include a ref to the ENV)
+** still is nice for consistency, plus the dispatch table has a SELF
+** argument there.
+
      Cancel the observation request which returned the value TOKEN.
      The return value is unspecified.
 
      If a call `(env-observe ENV PROC)' returns TOKEN, then the call
      `(env-unobserve TOKEN)' will cause PROC to no longer be called
@@ -393,18 +442,20 @@
    There are some limitations on observation:
    * These procedures do not allow you to observe specific bindings; you
      can only observe an entire environment.
 
    * These procedures observe bindings, not locations.  There is no way
-     to receive notification when a location's value changes, using
+     to receive notification when a location's value changes using
      these procedures.
 
    * These procedures do not promise to call the observing procedure
      for each individual binding change.  However, if multiple bindings
      do change between calls to the observing procedure, those changes
      will appear atomic to the entire system.
 
+** Clarify "appear atomic".  I'm not sure what this means.
+
    * Since a single environment may have several procedures observing
      it, a correct design obviously may not assume that nothing else in
      the system has yet observed a given change.
 
    When writing observing procedures, pay close attention to garbage
@@ -434,10 +485,13 @@
      reference ENV retains to PROC is a weak reference.  This means
      that, if there are no other live, non-weak references to PROC, it
      will be garbage-collected, and dropped from ENV's list of
      observing procedures.
 
+** Can you give an example of what else might reference PROC to prevent
+** it from being collected?
+
    It is also possible to write code that observes an environment in C.
 The `scm_env_observe_internal' function registers a C function to
 observe an environment.  The typedef `scm_env_observer' is the type a C
 observer function must have.
 
@@ -455,10 +509,12 @@
  - Libguile data type: scm_env_observer void (SCM ENV, SCM DATA)
      The type for observing functions written in C.  A function meant
      to be passed to `scm_env_internal_observe' should have the type
      `scm_env_observer'.
 
+** Again, I do not like this format for presenting data type declarations.
+
    Note that, like all other primitives, `env-observe' is also
 available from C, under the name `scm_env_observe'.
 
 Environment Errors
 ------------------
@@ -507,11 +563,11 @@
  - Primitive: make-finite-env
      Create a new finite environment, containing no bindings.  All
      bindings and locations in the new environment are mutable.
 
  - Primitive: finite-env? OBJECT
-     Return `#t' if OBJECT is a finite environment, or #F otherwise.
+     Return `#t' if OBJECT is a finite environment, or `#f' otherwise.
 
    In Guile, each module of interpreted Scheme code uses a finite
 environment to hold the definitions made in that module.
 
 Eval Environments
@@ -538,11 +594,11 @@
      Note that EVAL incorporates LOCAL and IMPORTED *by reference* --
      if, after creating EVAL, the program changes the bindings of LOCAL
      or IMPORTED, those changes will be visible in EVAL.
 
      Since most Scheme evaluation takes place in EVAL environments,
-     they transparenty cache the bindings received from LOCAL and
+     they transparently cache the bindings received from LOCAL and
      IMPORTED.  Thus, the first time the program looks up a symbol in
      EVAL, EVAL may make calls to LOCAL or IMPORTED to find their
      bindings, but subsequent references to that symbol will be as fast
      as references to bindings in finite environments.
 
@@ -567,19 +623,34 @@
      Return a new environment IMP whose bindings are the union of the
      bindings from the environments in IMPORTS; IMPORTS must be a list
      of environments.  That is, IMP binds SYMBOL to LOCATION iff some
      element of IMPORTS does.
 
+** should this just be "only if"?  I.e., "IMP binds the symbol" implies
+** "some element of IMPORTS does" (but if some element of IMPORTS does
+** and another conflicts, we don't know whether IMP binds SYMBOL to LOCATION).
+
      If two different elements of IMPORTS have a binding for the same
+
+** re: "different": are repeated environments in IMPORTS ignored?
+
      symbol, apply CONFLICT-PROC to the two environments.  If the
+
+** CONFLICT-PROC needs three args, right?  The two environments and the
+** conflicting symbol.
+
      bindings of any of the IMPORTS ever changes, check for conflicts
      again.
 
-     All bindings in IMP are immutable.  If you apply`env-define' or
+** Seems expensive if a single binding has changed; may not be an issue 
+** since the bindings of environments in IMPORTS are probably done in batch
+** but those changes aren't well outlined here.
+
+     All bindings in IMP are immutable.  If you apply `env-define' or
      `env-undefine' to IMP, Guile will signal an
      `env:immutable-binding' error.  However, notice that the set of
-     bindings in IMP may still change, if one of its imported
+     bindings in IMP may still change if one of its imported
      environments changes.
 
  - Primitive: import-env? OBJECT
      Return `#t' if OBJECT is an import environment, or `#f' otherwise.
 
@@ -590,18 +661,23 @@
  - Primitive: import-env-set-imports! ENV IMPORTS
      Change ENV's list of imported environments to IMPORTS, and check
      for conflicts.
 
    I'm not at all sure about the way CONFLICT-PROC works.  I think
+
+** As mentioned above, it should at least get the SYMBOL as well as 
+** the args.  An example here would be useful and perhaps help
+** clarify your thinking, too.
+
 module systems should warn you if it seems you're likely to get the
 wrong binding, but exactly how and when those warnings should be
 generated, I don't know.
 
 Export Environments
 -------------------
 
-   An export environment restricts an environment a specified set of
+   An export environment restricts an environment to a specified subset of
 bindings.
 
  - Primitive: make-export-env PRIVATE SIGNATURE
      Return a new environment EXP containing only those bindings in
      PRIVATE whose symbols are present in SIGNATURE.  The PRIVATE
@@ -613,10 +689,13 @@
      SIGNATURE is a list specifying which of the bindings in PRIVATE
      should be visible in EXP.  Each element of SIGNATURE should be a
      list of the form:
           (SYMBOL ATTRIBUTE ...)
 
+** What are the dots?  Extensibility of possible attributes?  Right
+** now this seems like a weird interface to specify one boolean value.
+
      where each ATTRIBUTE is one of the following:
     the symbol `mutable-location'
           EXP should treat the location bound to SYMBOL as mutable.
           That is, EXP will pass calls to ENV-SET! or `env-cell'
           directly through to PRIVATE.
@@ -634,28 +713,33 @@
      It is an error for an element of SIGNATURE to specify both
      `mutable-location' and `immutable-location'.  If neither is
      specified, `immutable-location' is assumed.
 
      As a special case, if an element of SIGNATURE is a lone symbol
-     SYM, it is equivalent to an element of the form `(SYM)'.
+     SYM, it is equivalent to an element of the form `(SYM)' 
+     and thus corresponds to an immutable-location.
 
-     All bindings in EXP are immutable.  If you apply`env-define' or
+     All bindings in EXP are immutable.  If you apply `env-define' or
      `env-undefine' to EXP, Guile will signal an
      `env:immutable-binding' error.  However, notice that the set of
      bindings in EXP may still change, if the bindings in PRIVATE
      change.
 
  - Primitive: export-env? OBJECT
      Return `#t' if OBJECT is an export environment, or `#f' otherwise.
 
  - Primitive: export-env-private ENV
- - Primitive: export-env-set-private! ENV
+ - Primitive: export-env-set-private! ENV PRIVATE
  - Primitive: export-env-signature ENV
- - Primitive: export-env-set-signature! ENV
+ - Primitive: export-env-set-signature! ENV SIGNATURE
      Accessors and mutators for the private environment and signature of
      ENV; ENV must be an export environment.
 
+** forgot the 2nd args for the mutators, above.
+** Also, these alter the behaviour as if they were the args passed to
+** the constructor, right?
+
 Implementing Environments
 =========================
 
    This section describes how to implement new environment types in
 Guile.
@@ -672,10 +756,13 @@
 
    An environment object is a smob whose CDR is a pointer to a pointer
 to a `struct env_funcs':
      struct env_funcs {
        SCM  (*ref) (SCM self, SCM symbol);
+
+** is bound_p skipped intentionally?  Seems necessary
+
        SCM  (*fold) (SCM self, scm_env_folder *proc, SCM data, SCM init);
        void (*define) (SCM self, SCM symbol, SCM value);
        void (*undefine) (SCM self, SCM symbol);
        void (*set) (SCM self, SCM symbol, SCM value);
        SCM  (*cell) (SCM self, SCM symbol, int for_write);
@@ -684,10 +771,14 @@
        SCM  (*mark) (SCM self);
        scm_sizet (*free) (SCM self);
        int  (*print) (SCM self, SCM port, scm_print_state *pstate);
      };
 
+** what about the internal fns?  It might be useful to have a dispatch table
+** struct for these, too.  Or are the internal (C) functions just wrappers
+** around the primitive versions (seems weird that they'd be less efficient)?
+
    You can use the following macro to access an environment's function
 table:
 
  - Libguile macro: struct env_funcs *SCM_ENV_FUNCS (ENV)
      Return a pointer to the `struct env_func' for the environment ENV.
@@ -704,10 +795,13 @@
      *Note Examining Environments::.
 
      Note that the `ref' element of a `struct env_funcs' may be zero if
      a `cell' function is provided.
 
+** and all bindings live in value cells?  Seems like this condition would
+** be necessary, too.
+
 `SCM fold (SCM self, scm_env_folder *proc, SCM data, SCM init);'
      This function must have the effect described above for the C call:
           scm_env_fold_internal (SELF, PROC, DATA, INIT)
      *Note Examining Environments::.
 
@@ -730,24 +824,33 @@
      *Note Changing Environments::.
 
      Note that the `set' element of a `struct env_funcs' may be zero if
      a `cell' function is provided.
 
-`SCM cell (SCM self, SCM symbol, int for_write);'
+** As above, do we not also need the condition that all bindings live 
+** in value cells?
+
+
+`SCM cell (SCM self, SCM symbol, SCM for_write_p);'
      This function must have the effect described above for the C call:
-          scm_env_cell_internal (SELF, SYMBOL)
+          scm_env_cell_internal (SELF, SYMBOL, FOR_WRITE_P)
      *Note Caching Environment Lookups::.
 
 `SCM observe (SCM self, scm_env_observer *proc, SCM data, int weak_p);'
      This function must have the effect described above for the C call:
-          scm_env_observe_internal (ENV, PROC, DATA, WEAK_P)
+          scm_env_observe_internal (SELF, PROC, DATA, WEAK_P)
      *Note Observing Changes to Environments::.
 
 `void unobserve (SCM self, SCM token);'
      Cancel the request to observe SELF that returned TOKEN.  *Note
      Observing Changes to Environments::.
 
+** Changed your style from "This function must have the effect..."
+** Also note that here unobserve has the ENV so I think env-unobserve
+** should take two args, the first the ENV, even if it could be made
+** redundant
+
 `SCM mark (SCM self);'
      Set the garbage collection mark all Scheme cells referred to by
      SELF.  Assume that SELF itself is already marked.  Return a final
      object to be marked recursively.
 
@@ -758,10 +861,15 @@
 
 `SCM print (SCM self, SCM port, scm_print_state *pstate);'
      Print an external representation of SELF on PORT, passing PSTATE
      to any recursive calls to the object printer.
 
+** Why can't an ENV be a SMOB with a larger v-table;  i.e., why can't
+** the mark/free/print function pointers go at the beginning of this
+** table and have the env-specific functions be additional entries
+** in that table unique to the "ENV SMOB"?
+
 Environment Data
 ----------------
 
    When you implement a new environment type, you will likely want to
 associate some data of your own design with each environment object.
@@ -770,22 +878,31 @@
 of an environment smob point to your structure, as long as your
 structure's first element is a pointer to a `struct env_funcs'.  Then,
 your code can use the macro below to retrieve a pointer to the
 structure, and cast it to the appropriate type.
 
+** Not sure what's going on here--- is the CDR of a SMOB always
+** a pointer to a pointer to the env_funcs?  The "you can have the CDR"
+** language is confusing me.
+
  - Libguile macro: struct env_funcs **SCM_ENV_DATA (ENV)
      Return the CDR of ENV, as a pointer to a pointer to an `env_funcs'
      structure.
 
 Environment Example
 -------------------
 
    [[perhaps a simple environment based on association lists]]
 
+** This would be really useful to complete!
+
+
 Switching to Environments
 =========================
 
+** Ok, now we're back at the meta-level
+
    Here's what we'd need to do to today's Guile to install the system
 described above.  This work would probably be done on a branch, because
 it involves crippling Guile while a lot of work gets done.  Also, it
 could change the default set of bindings available pretty drastically,
 so the next minor release should not contain these changes.
@@ -836,25 +953,27 @@
 
    The material here is just a sketch.  Don't take it too seriously.
 The point is that environments allow us to experiment without getting
 tangled up with the interpreter.
 
+** and hopefully at little/no performance cost.
+
 Modules of Guile Primitives
 ===========================
 
 Modules of Interpreted Scheme Code
 ==================================
 
    If a module is implemented by interpreted Scheme code, Guile
 represents it using several environments:
 
 the "local" environment
-     This environment holds all the definitions made locally by the
+     This environment holds all the bindings made locally by the
      module, both public and private.
 
 the "import" environment
-     This environment holds all the definitions this module imports from
+     This environment holds all the bindings this module imports from
      other modules.
 
 the "evaluation" environment
      This is the environment in which the module's code is actually
      evaluated, and the one closed over by the module's procedures, both
@@ -869,11 +988,18 @@
 
    Each of these environments is implemented using a separate
 environment type.  Some of these types, like the evaluation and import
 environments, actually just compute their bindings by consulting other
 environments; they have no bindings in their own right.  They implement
+
+** Is this really true for import environments?  Conflicts complicate
+** this issue.
+
 operations like `env-ref' and `env-define' by passing them through to
+
+** Will an import environment support env-define at all?
+
 the environments from which they are derived.  For example, the
 evaluation environment will pass definitions through to the local
 environment, and search for references and assignments first in the
 local environment, and then in the import environment.
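
To make the comments on "Implementing Environments" and the empty
"Environment Example" section a bit more concrete, here is a minimal
sketch of an association-list environment behind a dispatch table.  It
is plain C and does not use libguile at all: `val', `ENV_UNBOUND',
`struct alist_env', `make_alist_env', `env_define' and the two-entry
`struct env_funcs' are hypothetical stand-ins (for SCM values,
`SCM_UNDEFINED' and the full table in the proposal), and symbols are
ordinary C strings.  Treat it as an illustration of the layout, not as
the proposed implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SCM: any pointer can serve as a value. */
typedef const void *val;

/* Plays the role of SCM_UNDEFINED: returned when a symbol is unbound. */
static const char env_unbound_marker = 0;
#define ENV_UNBOUND ((val) &env_unbound_marker)

/* Reduced dispatch table; the proposal's table has many more entries. */
struct env_funcs {
  val (*ref) (void *self, const char *symbol);
  int (*define) (void *self, const char *symbol, val value);
};

struct binding {
  const char *symbol;   /* not copied: the caller keeps it alive */
  val value;
  struct binding *next;
};

/* The first member is the function table pointer, so a pointer to the
   environment can also be read as a pointer to a pointer to env_funcs. */
struct alist_env {
  const struct env_funcs *funcs;
  struct binding *bindings;
};

static struct binding *alist_lookup (struct alist_env *e, const char *symbol)
{
  for (struct binding *b = e->bindings; b != NULL; b = b->next)
    if (strcmp (b->symbol, symbol) == 0)
      return b;
  return NULL;
}

static val alist_ref (void *self, const char *symbol)
{
  struct binding *b = alist_lookup (self, symbol);
  return b ? b->value : ENV_UNBOUND;
}

/* Simplification: an existing binding's node is reused, whereas the
   proposal says env-define binds SYMBOL to a new location. */
static int alist_define (void *self, const char *symbol, val value)
{
  struct alist_env *e = self;
  struct binding *b = alist_lookup (e, symbol);
  if (b == NULL) {
    b = malloc (sizeof *b);
    if (b == NULL)
      return -1;
    b->symbol = symbol;
    b->next = e->bindings;
    e->bindings = b;
  }
  b->value = value;
  return 0;
}

static const struct env_funcs alist_funcs = { alist_ref, alist_define };

struct alist_env *make_alist_env (void)
{
  struct alist_env *e = malloc (sizeof *e);
  if (e != NULL) {
    e->funcs = &alist_funcs;
    e->bindings = NULL;
  }
  return e;
}

/* Generic entry points: they know nothing about association lists and
   only follow the function table stored at the start of the object. */
val env_ref_internal (void *env, const char *symbol)
{
  const struct env_funcs *f = *(const struct env_funcs **) env;
  assert (f != NULL && f->ref != NULL);
  return f->ref (env, symbol);
}

int env_define (void *env, const char *symbol, val value)
{
  const struct env_funcs *f = *(const struct env_funcs **) env;
  assert (f != NULL && f->define != NULL);
  return f->define (env, symbol, value);
}
```

A caller would create an environment with `make_alist_env ()', add
bindings through `env_define', and compare the result of
`env_ref_internal' against `ENV_UNBOUND' rather than catching an
error, which is the C-friendly behaviour the `_internal' functions are
meant to provide.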