This is the mail archive of the
glibc-bugs-regex@sources.redhat.com
mailing list for the glibc project.
[Bug regex/429] regex hangs on backreferences
- From: "paolo dot bonzini at lu dot unisi dot ch" <sourceware-bugzilla at sources dot redhat dot com>
- To: glibc-bugs-regex at sources dot redhat dot com
- Date: 9 Nov 2004 15:51:44 -0000
- Subject: [Bug regex/429] regex hangs on backreferences
- References: <20041007110031.429.jakub@redhat.com>
- Reply-to: sourceware-bugzilla at sources dot redhat dot com
------- Additional Comments From paolo dot bonzini at lu dot unisi dot ch 2004-11-09 15:51 -------
Subject: Re: regex hangs on backreferences
> But even with your patch bug-regex11.c with s/#if 0/#if 1/ eats certainly
> more than 10 minutes of CPU time (until I killed it).
Yes. I managed to run a 7-backreference version, which took a few minutes
before my patch. True, it does not make the full testcase feasible yet, but
I'll work on it.
> Either there is a better algorithm for many backreferences, or we should
> consider using NFA for patterns where DFA with backtracing is known to
take
> too long.
I think you can do some kind of caching to cut the number of invocations of
calc_dst_limits_pos. Its implementation is naive. Complexity is
exponential in N because every epsilon closure visits N backreferences
without any remote hope of succeeding, because no OP_{OPEN,CLOSE}_SUBEXP is
on the epsilon closure.
I have to figure out exactly how the backref cache enters the game, because
adding something ad hoc in regcomp.c does not seem the right way to fix it.
And also, I want to understand which cases are common both in practice (to
avoid slowing down the common case) and in the worst case: I'd like
factor.sed and dc.sed to be sped up by 10% while fixing this bug. I
certainly hope to bring it down to O(N) in the number of backreferences,
albeit with a pretty big constant in front of it.
Paolo
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=429
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.