This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



engineering an opposable thumb for guile?


Sirs,

I think the following missing guile feature is SO IMPORTANT that its lack
should be considered a bug of the highest order.  Portability is good, but
practical engineering use requires that the following "opposable thumb" be
added to guile.  Its lack certainly stops my guile usage in its tracks.  :-(

Thank you so much for your consideration!



SUMMARY:

I believe Guile needs flexible versions of perl's "pack" and "unpack"
functions.  Built in the spirit of Scheme, of course.

CONTEXT:

My work is in the messy world of practical seat-of-the-pants electrical
engineering.  Often I cannot adapt problems to myself but must adapt myself to
those problems.

The current problem, practically solvable only with perl, consists of reading
and writing 4-byte or 8-byte binary floating point numbers from a file.  Here
is some perl code which does this:

     perl -e 'undef $/; $f = <>; print join( "\n", unpack "f*", $f ), "\n"' file-of-4-byte-binary-numbers
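
For comparison, here is a minimal sketch of the same job in Python, whose standard "struct" module follows the pack/unpack model. It is only an illustration of the kind of facility being asked for, not a proposed Guile interface; the helper name read_floats and its parameters are hypothetical.

```python
import struct


def read_floats(data, fmt_char="f", byte_order="="):
    """Decode a byte string holding back-to-back binary floats.

    fmt_char is "f" (4-byte) or "d" (8-byte); byte_order is a struct
    prefix: "=" native order, "<" little-endian, ">" big-endian.
    """
    size = struct.calcsize(byte_order + fmt_char)
    if len(data) % size != 0:
        raise ValueError("input length is not a multiple of the float size")
    count = len(data) // size
    # A repeat count in the format plays the role of perl's "f*".
    return list(struct.unpack(f"{byte_order}{count}{fmt_char}", data))
```

Reading a whole file would then be a matter of opening it in binary mode and passing the result of read() to read_floats.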

Guile appears to be missing a similar type of "opposable thumb".  :-(

PROPOSED SOLUTION:

Scheme should be able to generate clean and efficient code.  I don't believe
that this is only possible in C and perl.

- Should support general "endianness".
- Should support *all* IEEE floats that the hardware platform supports
  (4-byte, 8-byte, extended, etc).
- Should support at least the perl-like functionality marked below ...
- ... but should do it with a clean Scheme design.
- Should support *all* hardware integer values.  But saying "long" is not
  enough because one machine's "long" is 4 bytes while another is 8 bytes.  A
  lot of issues need to be considered.  And don't forget "long long".
- Don't forget the sizes of various pointers, including function pointers.
- Compiler alignment issues need to be considered because of padding.  This is
  the "padding you get when you specify XYZ option."
- Explicit padding (mentioned below with the "x" template format character).
- Perhaps should issue "warning: machine dependent code!" messages when used.
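
As one existing point of reference for the wish list above, Python's "struct" module lets the caller choose byte order, fixed versus native sizes, and explicit padding. The sketch below only shows how those three concerns can be kept separate; it is not a design for Guile, and the variable names are made up for the example.

```python
import struct

# Explicit byte order with fixed ("standard") sizes and no implicit padding:
# "h" is always 2 bytes and "d" always 8 bytes under "<" or ">".
big = struct.pack(">hd", 1, 2.0)     # big-endian
little = struct.pack("<hd", 1, 2.0)  # little-endian

# Explicit padding: "6x" inserts six null bytes, like perl's "x" repeated.
padded = struct.pack("<h6xd", 1, 2.0)

# Machine-dependent versus fixed sizes for "long":
native_long = struct.calcsize("@l")    # 4 or 8, depending on the platform
standard_long = struct.calcsize("<l")  # always 4
standard_long_long = struct.calcsize("<q")  # always 8
```

The "@" prefix is the machine-dependent mode (native sizes and alignment), which is where a "machine dependent code" warning would apply.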

... I don't know.  Bigger brains than mine need to think about this.

:-)


ADDITIONAL DOCUMENTATION:

The following documentation is from the "perlfunc" manpage.

pack TEMPLATE,LIST
     Takes an array or list of values and packs it into a binary
     structure, returning the string containing the structure.  The
     TEMPLATE is a sequence of characters that give the order and type of
     values, as follows:

          A     An ascii string, will be space padded.
          a     An ascii string, will be null padded.
          b     A bit string (ascending bit order, like vec()).
          B     A bit string (descending bit order).
          h     A hex string (low nybble first).
          H     A hex string (high nybble first).

          c     A signed char value.
          C     An unsigned char value.
          s     A signed short value.
          S     An unsigned short value.
          i     A signed integer value.
          I     An unsigned integer value.
          l     A signed long value.
          L     An unsigned long value.

          n     A short in "network" order.
          N     A long in "network" order.
          v     A short in "VAX" (little-endian) order.
          V     A long in "VAX" (little-endian) order.

          f     A single-precision float in the native format.
          d     A double-precision float in the native format.

          p     A pointer to a null-terminated string.
          P     A pointer to a structure (fixed-length string).

          u     A uuencoded string.

          x     A null byte.
          X     Back up a byte.
          @     Null fill to absolute position.

     Each letter may optionally be followed by a number which gives a
     repeat count.  With all types except "a", "A", "b", "B", "h", "H",
     and "P", the pack function will gobble up that many values from the
     LIST.  A * for the repeat count means to use however many items are
     left.  The "a" and "A" types gobble just one value, but pack it as a
     string of length count, padding with nulls or spaces as necessary.
     (When unpacking, "A" strips trailing spaces and nulls, but "a" does
     not.)  Likewise, the "b" and "B" fields pack a string that many bits
     long.  The "h" and "H" fields pack a string that many nybbles long.
     The "P" packs a pointer to a structure of the size indicated by the
     length.  Real numbers (floats and doubles) are in the native machine
     format only; due to the multiplicity of floating formats around, and
     the lack of a standard "network" representation, no facility for
     interchange has been made.  This means that packed floating point
     data written on one machine may not be readable on another - even if
     both use IEEE floating point arithmetic (as the endian-ness of the
     memory representation is not part of the IEEE spec).  Note that Perl
     uses doubles internally for all numeric calculation, and converting
     from double into float and thence back to double again will lose
     precision (i.e.  `unpack("f", pack("f", $foo))' will not in general
     equal $foo).
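
The precision remark above can be checked with a short sketch. It uses Python rather than perl, on the assumption that Python floats are doubles internally as well; the helper name through_single is hypothetical.

```python
import struct


def through_single(x):
    """Pack a double as a 4-byte float and unpack it again."""
    return struct.unpack("=f", struct.pack("=f", x))[0]


exact = through_single(0.5)  # 0.5 is exactly representable in single precision
lossy = through_single(0.1)  # 0.1 is not, so the low-order bits are rounded away
survives = (exact == 0.5)
changed = (lossy != 0.1)
```

Values that need more significand bits than a single-precision float provides come back slightly different, which is the behaviour the manpage warns about.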

     Examples:

          $foo = pack("cccc",65,66,67,68);
          # foo eq "ABCD"
          $foo = pack("c4",65,66,67,68);
          # same thing

          $foo = pack("ccxxcc",65,66,67,68);
          # foo eq "AB\0\0CD"

          $foo = pack("s2",1,2);
          # "\1\0\2\0" on little-endian
          # "\0\1\0\2" on big-endian

          $foo = pack("a4","abcd","x","y","z");
          # "abcd"

          $foo = pack("aaaa","abcd","x","y","z");
          # "axyz"

          $foo = pack("a14","abcdefg");
          # "abcdefg\0\0\0\0\0\0\0"

          $foo = pack("i9pl", gmtime);
          # a real struct tm (on my system anyway)

          sub bintodec {
                unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
          }

     The same template may generally also be used in the unpack function.



unpack TEMPLATE,EXPR
     Unpack does the reverse of pack: it takes a string representing a
     structure and expands it out into a list value, returning the array
     value.  (In a scalar context, it merely returns the first value
     produced.)  The TEMPLATE has the same format as in the pack function.
     Here's a subroutine that does substring:

          sub substr {
                local($what,$where,$howmuch) = @_;
                unpack("x$where a$howmuch", $what);
          }

     and then there's

          sub ordinal { unpack("c",$_[0]); } # same as ord()

     In addition, you may prefix a field with a %<number> to indicate that
     you want a <number>-bit checksum of the items instead of the items
     themselves.  Default is a 16-bit checksum.  For example, the
     following computes the same number as the System V sum program:

          while (<>) {
                $checksum += unpack("%16C*", $_);
          }
          $checksum %= 65536;

     The following efficiently counts the number of set bits in a bit
     vector:

          $setbits = unpack("%32b*", $selectmask);

Thanks!

-- 
Daniel Ortmann, IBM Circuit Technology, Rochester, MN 55901-7829
ortmann@vnet.ibm.com or ortmann@us.ibm.com and 507.253.6795 (external)
ortmann@rchland.ibm.com and tieline 8.553.6795 (internal)
ortmann@isl.net and 507.288.7732 (home)

"The answers are so simple, and we all know where to look,
but it's easier just to avoid the question." -- Kansas
