Re: Interesting paper on regex NFA matching

cross@spitfire.i.gajendra.net
Fri, 26 Jan 2024 04:15:30 -0000

          From comp.compilers

Related articles
Interesting paper on regex NFA matching johnl@taugh.com (John R Levine) (2024-01-25)
Re: Interesting paper on regex NFA matching cross@spitfire.i.gajendra.net (2024-01-26)
Re: Interesting paper on regex NFA matching 433-929-6894@kylheku.com (Kaz Kylheku) (2024-01-26)
Re: Interesting paper on regex NFA matching christopher.f.clark@compiler-resources.com (Christopher F Clark) (2024-01-27)
Re: Interesting paper on regex NFA matching 433-929-6894@kylheku.com (Kaz Kylheku) (2024-01-27)
Re: Interesting paper on regex NFA matching christopher.f.clark@compiler-resources.com (Christopher F Clark) (2024-01-29)
Re: Interesting paper on regex NFA matching 433-929-6894@kylheku.com (Kaz Kylheku) (2024-01-29)
Re: Interesting paper on regex NFA matching christopher.f.clark@compiler-resources.com (Christopher F Clark) (2024-02-01)
[1 later articles]
| List of all articles for this month |

From: cross@spitfire.i.gajendra.net
Newsgroups: comp.compilers
Date: Fri, 26 Jan 2024 04:15:30 -0000
Organization: Compilers Central
References: 24-01-003
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="99463"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, NFA, performance
Posted-Date: 26 Jan 2024 10:17:58 EST

In article 24-01-003, John R Levine <johnl@taugh.com> wrote:
>The easiest way to implement regular expressions is to turn them into
>NFAs, but that has well known problems that if you feed hostile input
>to the NFAs, they can take exponential time and it's a way to DDoS the
>server. If you translate the NFA to a DFA it runs in linear time, but
>if the regex is ambiguous, as many are, the DFA may match something
>different from the NFA.


NFAs won't take exponential time if implemented carefully,
though they may be superlinear. Still, we're talking quadratic,
not exponential.


Backtracking implementations can take exponential time, but
those aren't all NFAs.


Converting an NFA to a DFA could be exponential, however.


https://swtch.com/~rsc/regexp/regexp1.html


>This paper on the Arxiv preprint server proposes some rather complex
>tweaks to NFA regex matching to fix many performance issues while giving
>the same answers as before.
>
>https://arxiv.org/abs/2401.12639


If I'm reading the abstract right, they modify a backtracking
implementation so that they can efficiently match some extended
regex's. Note however, that such so-called "extended" regexs
are not really regular expressions in the formal sense; that is,
they do not describe members of the regular languages, but
rather languages beyond those expressible as a DFA. Regardless
this work is interesting, since AFAIK the speedup for this
class of regex to linear time is novel. Thanks for passing it
on.


- Dan C.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.