This is the mail archive of the cygwin at cygwin dot com
mailing list for the Cygwin project.
Re: Updated: sed-4.1.3-1
- From: Igor Pechtchanski <pechtcha at cs dot nyu dot edu>
- To: Luke Kendall <luke dot kendall at cisra dot canon dot com dot au>
- Cc: cygwin at cygwin dot com
- Date: Sun, 30 Jan 2005 22:26:55 -0500 (EST)
- Subject: Re: Updated: sed-4.1.3-1
- References: <20050130223445.729458570A@pessard.research.canon.com.au>
- Reply-to: cygwin at cygwin dot com
On Mon, 31 Jan 2005, Luke Kendall wrote:
> On 29 Jan, Corinna Vinschen wrote:
> > * regex addresses do not use leftmost-longest matching. In other words,
> > /.\+/ only looks for a single character, and does not try to find as
> > many of them as possible like it used to do.
> Interesting: does that mean every existing script that relied on the old
> behaviour must change? I'm glad I stuck with the old "/..*/" notation
> when I wanted one or more repetitions!
I believe you are confused here. Yes, every script that *relied* on the
old behavior will have to change, but the number of those is vastly
smaller than you seem to think. Very few scripts actually rely on this;
the only ones that will behave differently are scripts like
sed -e '/^\(.\+\)/s//---\1---/'
where the regex address pattern is saved and used in the subsequent
replacement (and is not anchored on the right side). The above script
will turn "abcde" into "---a---bcde" with the new behavior, and
"---abcde---" with the old one. Note that the pattern has to be
unanchored on the right for the behavior to change; the behavior of
sed -e '/^\(.\+\)$/s//---\1---/'
should stay the same. BTW, the latter script *is* the way to fix the
former (the two were equivalent under the old behavior).
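As a quick check of the anchored form, here is a minimal sketch. It assumes
GNU sed (\+ is a GNU extension to basic regular expressions), and the
variable name is only illustrative. Because the trailing $ leaves the group
exactly one way to match, the result should not depend on how hard sed
tries to extend the match:

```shell
# Assumes GNU sed; "anchored" is just an illustrative variable name.
# The empty regex in s// reuses the address regex /^\(.\+\)$/, and the
# trailing $ forces the group to cover the whole line.
anchored=$(printf 'abcde\n' | sed -e '/^\(.\+\)$/s//---\1---/')
echo "$anchored"
```

With the right-hand anchor in place this should print ---abcde---.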
> So \+ now works the opposite of * (\+ = shortest, * = longest)? And .\+
> is now a synonym for a single "."? So, why would you use .\+?
No, .\+ still means "one or more". It's just that when you say
sed -e '/^abc.\+/d'
to delete all lines that start with "abc", sed will no longer have to go
through the whole line to determine that it starts with "abc" (as it used
to). Note that the above was a pretty silly way of writing this anyway,
as '/^abc./d' would have sufficed.
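To illustrate that the two addresses select the same lines, here is a small
sketch. It again assumes GNU sed, and the sample input is made up for the
purpose. Note that both forms need at least one character after "abc", so a
line that is exactly "abc" is kept by either of them:

```shell
# Assumes GNU sed for \+; the sample lines below are invented for illustration.
input=$(printf 'abcdef\nabc\nxabcdef\nabcd\n')

# Address with \+ : "abc" at the start, followed by one or more characters.
with_plus=$(printf '%s\n' "$input" | sed -e '/^abc.\+/d')

# Shorter address: "abc" at the start, followed by any one character.
with_dot=$(printf '%s\n' "$input" | sed -e '/^abc./d')

echo "$with_plus"
echo "$with_dot"
```

Both commands should leave only the lines "abc" and "xabcdef", since an
address only decides whether a line matches, not how much of it matches.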
> Ah, I see, it's a way of matching zero or one occurrences. I would have
> thought a new symbol would have made more sense for the new semantics,
> so as to preserve backward compatibility.
> Probably I've misunderstood.
I believe so. Unless I, too, am totally confused.
Igor Pechtchanski, Ph.D.
"The Sun will pass between the Earth and the Moon tonight for a total
Lunar eclipse..." -- WCBS Radio Newsbrief, Oct 27 2004, 12:01 pm EDT