This is the mail archive of the xsl-list@mulberrytech.com mailing list .


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Re: . in for


Hi Dimitre,

> 1. 'mapping operator' must mean "a map function, which is used in an
> infix notation".
>
> So instead of writing:
>
>  map fun list
>
> we now write:
>
>  list `map` fun      (let's ignore that this must be the other way around).
>
> However, the above defined mapping operator stil operates with
> ***any function*** defined on lists.
>
> At the same time the mapping operator as you define it will operate
> only on valid XPath expressions. This is quite limited compared to
> the full mapping operator, therefore it would be best if the name of
> the 'limited mapping operator' properly reflected this difference in
> scope.

Absolutely. I suggested a simple mapping operator as a replacement for
the more complex for expression that is currently defined in XPath
2.0. I fully recognise the fact that a simple mapping operator gains
in simplicity, but loses in completeness - it cannot achieve
everything that the for expression can achieve (it can't do joins).
I'm sure that you're right that it also cannot achieve everything that
a full mapping operator would achieve (perhaps you can give some
examples?).

(As I don't have your expertise, I cannot tell whether the for
expression as currently defined can do everything a full mapping
operator can do - perhaps you could comment on that?)

My thinking in suggesting a simple mapping operator was that it would
meet over 80% of the practical requirements for a mapping operator
that is used in XPath; the remainder can be handled in XSLT.

> 2. The example you provide of "piping" is actually a functional
> composition of two map functions. It can be re-written like this:
>
>
>    (map (if (position() mod 2) then . + 50 else .)) ¡Ñ  (map (. * 2)) $coordinates 
>
> Here I use the special symbol ⊙ for the functional composition operator.
>
> So, the piping can be expressed as:
>
>      (map f1)
>   ¡Ñ (map f2)
>  ............               
>   ¡Ñ (map fN)   $sequence
>
> It can easily be proven that the above is equal to:
>
> map (f1 ¡Ñ f2 ...  ¡Ñ fN) $sequence
>
> While your mapping operator will perform a series of mappings, each
> producing an intermediate sequence and may require too much memory,
> the last function applies the map function only once. The
> composition of all functions is applied on every element of
> $sequence and the resulting sequence is produced. No additional
> memory for intermediate sequences is necessary.

That's a *very* interesting point :) As I understand it, you're saying
that it would be more efficient if the two-step mapping was carried
out in one step, and showing that you can compose the two steps:

 $coordinates -> (. * 2)
              -> if (position() mod 2) then . + 50 else .

into one:

 $coordinates -> if (position() mod 2) then (. * 2) + 50 else (. * 2)

Presumably the same observation applies to the equivalent for
expression?

  for $c in (for $d in $coordinates return $d * 2)
  return if (position() mod 2) then $c + 50 else $c

This raises two questions/observations for me.

The first question arises because of the fact that in XPath 2.0
sequences cannot contain sequences. This means that the result of a
mapping operation does not necessarily have the same cardinality as
the source of the mapping operation. As an example, say that the
coordinates that we're dealing with are actually x,y,z coordinates,
but we're only interested in the x,y coordinates. We could get rid of
the z coordinates in one step, as follows:

 $coordinates -> if (position() mod 3) then . else ()
              -> (. * 2)
              -> if (position() mod 2) then . + 50 else .

Which I think in your syntax is:

  (map (if (position() mod 2) then . + 50 else .)) ¡Ñ
  (map (. * 2)) ¡Ñ
  (map (if (position() mod 3) then . else ())) $coordinates

Using the same algorithm as before to compose these steps, I think the
expression is:

  if (position() mod 2)
  then ((if (position() mod 3) then . else ()) * 2) + 50
  else ((if position() mod 3) then . else ()) * 2)

But this doesn't work - it adds 50 to the even coordinates from the
complete x,y,z coordinate list and operates on them. So if you had a
line defined as:

  (0, 0, 0, 50, 50, 50)

then you would get (empty sequences inserted to demonstrate the
mapping):

  (50, 0, (), 100, 150, ())  =>  (50, 0, 100, 150)

whereas the expected result from the individual steps was:

  (50, 0, 150, 100)

Now I might have gone wrong in understanding how functions are
composed, which is why I ask the question - can you do function
composition in XPath 2.0, given that the cardinality of the sequence
can change when a map operation is used and that the position of an
item in the sequence should be accessible within the function?


My second question is, whether the above issue is manageable or not,
whether implementations could be clever enough to spot where the use
of simple mapping operators is equivalent to functional composition,
in order to optimise these expressions? Perhaps an implementation
could recognise that given:

  $coordinates -> (. * 2)
               -> if (position() mod 2) then . + 50 else .

it could compose the two steps on the right hand side of the first ->
into a single operation, and use that to give added efficiency?

I also wonder if this can apply to steps in path expressions as well,
and whether any XSLT processors currently take advantage of that. You
could imagine that as well as keeping a record of the nodes reachable
along a particular axis for each node, they might also keep
information about the composition of those axes - when they see:

 row[following-sibling::row/@cust = @cust]

they might optimise by providing a quick access to each row's
following sibling rows' 'cust' attributes when they initially
construct the in-memory representation of the node tree. Mike, if
you're listening, does Saxon do that? Have you considered it?

> This shows that it is better to have a map() function and a
> composition operator for expressions (in case XPath 2.0 will not
> fully support higher-order functions).

I think that you also need to have functions/expressions as objects in
their own right to get a map() function to work as a function, don't
you? As I see it, without functions/expressions as objects, we're
stuck with using operators or expressions and unfortunately can't
express the map() function in functional syntax.

> An operator for functional composition has an added benefit that it
> can be used with ***any functions***, not only with functions that
> operate on sequences. Thus, it can be used to express an arbitrary
> sequence of processing (piping) applied on a value of any datatype.

I agree - the ability to apply a series of operations to values,
expressing those operations as individual steps rather than a single
complex operation is really very powerful.

For example, start with an element, get its string value (default to
'FOO' if it's empty), normalize it, change it to upper case, and check
whether the result equals "FOO". As a composed function this would be:

  upper-case(normalize-space(if-empty(., 'FOO'))) = 'FOO'

Expressed as a series of steps this would be:

  if-empty(., 'FOO') -> normalize-space(.)
                     -> upper-case(.)
                     -> (. = 'FOO')

Fortunately, with the simple mapping operator this works, because in
XPath 2.0 individual values are treated exactly the same as
single-item sequences.

For interest, the equivalent for expression is as follows:

  for $a in (for $b in (for $c in if-empty(., 'FOO')
                        return normalize-space($c))
             return upper-case($b))
  return ($a = 'FOO')


I *think* that if you view the series of operations on the right hand
side of the simple mapping operator as being functionally composed,
then you have the functional composition operator that you're looking
for?

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]