This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Scheme style auto-resizing hashtable


Jay Glascoe <jglascoe@jay.giss.nasa.gov> writes:

 > > STk has hashtable->list
 > 
 > forgive me, I still haven't downloaded STk.  I'm interested in knowing
 > what procedures STk provides for its hashtables.  (I was looking at CLtL
 > last night and to my surprise, it didn't have that many hashtable
 > procedures: get, set!, del!, hash->alist and friends, and some sort of
 > iterator thingy.)


It's about the same.  The docs are in Latex, but here's the hashtable
section.

\subsection{Hash tables}

\label{hashtables}
A hash table consists of zero or more entries, each consisting of a key and a
value. Given the key for an entry, the hashing function can very quickly
locate the entry, and hence the corresponding value. There may be at most one
entry in a hash table with a particular key, but many entries may have the
same value.

{\stk} hash tables grow gracefully as the number of entries increases, so that
there are always less than three entries per hash bucket, on average.  This
allows for fast lookups regardless of the number of entries in a table.

\vskip3mm
\begin{note}
Hash table manipulation procedures are built upon the efficient Tcl hash
table package.
\end{note}

\begin{entry}{
\proto{make-hash-table}{}{ procedure}
\proto{make-hash-table}{ comparison}{ procedure}
\proto{make-hash-table}{ comparison hash}{ procedure}}
\saut
\ide{Make-hash-table} admits three different forms. 
The most general form admit two arguments. The first argument is a
comparison function  which determine how keys are compared; the second
argument is a function which computes a hash code for an object and
returns the hash code as a non negative integer. Objets with the same
hash code are stored in an A-list registered in the bucket
corresponding to the key.

If omitted, 
\begin{itemize}
\item {\tt hash} defaults to the \ide{hash-table-hash}
procedure. 
\item {\tt comparison } defaults to the \ide{eq?} procedure
\end{itemize}

Consequently, 
\begin{scheme}
(define h (make-hash-table))
\end{scheme}
is equivalent to
\begin{scheme}
(define h (make-hash-table eq? hash-table-hash))
\end{scheme}
Another interesting example is 
\begin{scheme}
(define h (make-hash-table string-ci=? string-length))
\end{scheme}
which defines a new hash table which uses {\tt string-ci=?} for
comparing keys. Here, we use the string-length as a (very simple) 
hashing function. Of course, a function which gives a key depending 
of the characters composing the string gives a better repartition
and should probably enhance performances. For instance, the following
call to {\tt make-hash-table} should return a more efficient, even if
not perfect, hash table:
\begin{scheme}
(make-hash-table 
    string-ci=? 
    (lambda (s)
      (let ((len (string-length s)))
        (do ((h 0)  (i 0 (+ i 1)))
            ((= i len) h)
          (set! h (+ h (char->integer 
                         (char-downcase (string-ref s i)))))))))
\end{scheme}

\begin{note}
Hash tables with a comparison function equal to \ide{eq?} or {\tt
string=?} are handled in an more efficient way (in fact, they
don't use the \ide{hash-table-hash} fucntion to speed up hash table
retrievals).
\end{note}
\end{entry}
 
\begin{entry}{
\proto{hash-table?}{ obj}{ procedure}}
\saut
Returns {\schtrue} if \var{obj} is a hash table, returns {\schfalse} otherwise.
\end{entry}

\begin{entry}{
\proto{hash-table-hash}{ obj}{ procedure}}
\saut
\ide{hash-table-hash} computes a hash code for an object and returns
the hash code as a non negative integer. A property of {\tt
hash-table-hash} is that 
\begin{quote}
{\tt (equal? x y)} implies {\tt (equal? (hash-table-hash x) (hash-table-hash y)}
\end{quote}
as the the Common Lisp {\tt sxhash} function from which this procedure is
modeled.
\end{entry}

\begin{entry}{
\proto{hash-table-put!}{ hash key value}{ procedure}}
\saut
\ide{Hash-table-put!} enters an association between \var{key} and \var{value} in the
\var{hash} table. The value returned by \ide{hash-table-put!} is undefined.
\end{entry}

\begin{entry}{
\proto{hash-table-get}{ hash key}{ procedure}
\proto{hash-table-get}{ hash key default}{ procedure}}
\saut
\ide{Hash-table-get} returns the value associated with \var{key} in
the given \var{hash} table. If no value has been associated with \var{key} in
\var{hash}, the specified \var{default} is returned if given; otherwise an
error is raised.
\begin{scheme}
(define h1 (make-hash-table))
(hash-table-put! h1 'foo (list 1 2 3))
(hash-table-get  h1 'foo) \lev (1 2 3)
(hash-table-get  h1 'bar 'absent) \lev absent
(hash-table-get  h1 'bar) \lev \scherro                  (cons key value)))) \lev ((0 . "0") (3 . "3") (2 . "2") (1 . "1") (4 . "4"))
\end{scheme}
\end{entry}

\begin{entry}{
\proto{hash-table->list}{ hash}{ procedure}}
\saut %
\ide{hash-table->list} returns an ``association list'' built from the
entries in \var{hash}.  Each entry in \var{hash} will be represented
as a pair whose \var{car} is the entry's key and whose \var{cdr} is
its value.
\begin{note}
The order of pairs in the resulting list is unspecified.
\end{note}
\begin{scheme}
(let ((h (make-hash-table)))
  (dotimes (i 5) 
    (hash-table-put! h i (number->string i)))
  (hash-table->list h)) \lev ((0 . "0") (3 . "3") (2 . "2") (1 . "1") (4 . "4"))
\end{scheme}
\end{entry}

\begin{entry}{
\proto{hash-table-stats}{ hash}{procedure}}
\saut %
\ide{Hash-table-stats} returns a string with overall information about
\var{hash}, such as the number of entries it contains, the number of
buckets in its hash array, and the utilization of the buckets.
\end{entry}

-- 
Harvey J. Stein
BFM Financial Research
hjstein@bfr.co.il