This is the mail archive of the cygwin mailing list for the Cygwin project.

Re: Updated: sed-4.1.3-1

On Mon, 31 Jan 2005, Luke Kendall wrote:

> On 29 Jan, Corinna Vinschen wrote:
> >  * regex addresses do not use leftmost-longest matching.  In other words,
> >    /.\+/ only looks for a single character, and does not try to find as
> >    many of them as possible like it used to do.
> Interesting: does that mean every existing script that relied on the old
> behaviour must change?  I'm glad I stuck with the old "/..*/" notation
> when I wanted one or more repetitions!

I believe you are confused here.  Yes, every script that *relied* on the
old behavior will have to change, but the number of those is vastly
smaller than you seem to think.  Very few scripts actually rely on this;
the only ones that will behave differently are scripts like

	sed -e '/^\(.\+\)/s//---\1---/'

where the regex address pattern is saved and used in the subsequent
replacement (and is not anchored on the right side).  The above script
will turn "abcde" into "---a---bcde" with the new behavior, and
"---abcde---" with the old one.  Note that the pattern has to be
unanchored on the right for the behavior to change; the behavior of

	sed -e '/^\(.\+\)$/s//---\1---/'

should stay the same.  BTW, the latter script *is* the way to fix the
former (the two were equivalent under the old behavior).
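A quick way to check the anchored form, as a sketch only: it assumes GNU
sed (the \+ operator is a GNU extension) and a POSIX shell, and the
variable names are just for illustration.  The second command spells the
regex out in the s command, so it does not depend on how the address is
matched at all.

```shell
# Assumes GNU sed; \+ is a GNU extension to basic regular expressions.
# Anchored on both sides, the address can only match the whole line, so
# the empty regex in s// has nothing shorter to pick.
anchored=$(printf 'abcde\n' | sed -e '/^\(.\+\)$/s//---\1---/')

# Spelling the regex out in the s command avoids reusing the address.
explicit=$(printf 'abcde\n' | sed -e 's/^\(.\+\)$/---\1---/')

printf '%s\n%s\n' "$anchored" "$explicit"
```

Both lines should read "---abcde---".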

> So \+ now works the opposite of * (\+ = shortest, * = longest)?  And .\+
> is now a synonym for a single "."?  So, why would you use .\+?

No, .\+ still means "one or more".  It's just when you say

	sed -e '/^abc.\+/d'

to delete all lines that start with "abc" followed by at least one more
character, sed no longer has to go through the whole line to determine
that the line matches (as it used to).  Note that the above was a pretty
silly way of writing this anyway, as '/^abc./d' would have sufficed.
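To see that the two addresses select the same lines, here is a small
sketch, again assuming GNU sed; the sample input and the variable names
are made up for illustration.

```shell
# Assumes GNU sed; the input lines are hypothetical test data.
input=$(printf 'abc\nabcd\nabcdef\nxabc\nab\n')

# Delete lines that start with "abc" plus one or more further characters.
long=$(printf '%s\n' "$input" | sed -e '/^abc.\+/d')

# Same selection with the shorter address: one further character is enough.
short=$(printf '%s\n' "$input" | sed -e '/^abc./d')

# Print the surviving lines, then compare the two results.
printf '%s\n' "$short"
[ "$long" = "$short" ] && echo "same result"
```

A bare "abc" line survives in both cases, since the address requires at
least one character after "abc".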

> Ah, I see, it's a way of matching zero or one occurrences.  I would have
> thought a new symbol would have made more sense for the new semantics,
> so as to preserve backward compatibility.
> Probably I've misunderstood.

I believe so.  Unless I, too, am totally confused.
      |\      _,,,---,,_
ZZZzz /,`.-'`'    -.  ;-;;,_
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"The Sun will pass between the Earth and the Moon tonight for a total
Lunar eclipse..." -- WCBS Radio Newsbrief, Oct 27 2004, 12:01 pm EDT