Regular expressions speedup

Cleo Saulnier <cleos@nb.sympatico-dot-ca.remove>
5 Aug 2005 19:07:34 -0400

          From comp.compilers


From: Cleo Saulnier <cleos@nb.sympatico-dot-ca.remove>
Newsgroups: comp.compilers
Date: 5 Aug 2005 19:07:34 -0400
Organization: Aliant Internet
Keywords: DFA, lex, question, comment
Posted-Date: 05 Aug 2005 19:07:34 EDT

I wrote my own regular expression parser, CSRegEx, for C++ (all OSes), which
is now on SourceForge as public domain. It supports backreferences
and Unicode. I wrote it for use in my LR(1) parser and
will release that too. The regular expression parser converts every
pattern into a binary format (still stored as a string). The matching
algorithm is non-recursive and backtracking. Are there any tips on how
to speed up the matching process? For REs that aren't anchored to the
start, I was thinking that computing the *FIRST* set of chars (as in LL
and LR parsers) would perhaps simplify the initial scanning process.
Are there any other obvious things that can be done on a backtracking
[If you really care about speed, why not turn the NFA into a DFA so
you don't have to do multiple states and backtracking? -John]
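
The FIRST-set idea in the question can be sketched roughly as follows. This
is only an illustration and not CSRegEx code: the names FirstSet,
first_set_of and search are hypothetical, the FIRST set is assumed to be
computed already (here it is simply built from a list of bytes), and the
backtracking matcher is passed in as a callback. It also assumes 8-bit
characters and a pattern that cannot match the empty string; a nullable
pattern can match at any position, so the filter would have to be skipped.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical representation: one bit per byte value that may begin a match.
using FirstSet = std::bitset<256>;

// Stand-in for walking a compiled pattern: build the set from the bytes
// that are known to be able to start a match.
FirstSet first_set_of(const std::string& chars) {
    FirstSet s;
    for (unsigned char c : chars) s.set(c);
    return s;
}

// Return the leftmost index where match_at(text, i) succeeds, or npos.
// The (expensive) backtracking matcher is only called at positions whose
// byte is in the FIRST set; every other position is rejected with a single
// table lookup.
template <class MatchAt>
std::size_t search(const std::string& text, const FirstSet& first,
                   MatchAt match_at) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (!first.test(static_cast<unsigned char>(text[i]))) continue;
        if (match_at(text, i)) return i;
    }
    return std::string::npos;
}
```

For a pattern such as cat|dog the FIRST set would be {c, d}, so on the text
"a cold dog" the matcher would only be tried at the three positions holding
a 'c' or a 'd' instead of at all ten. When the FIRST set has a single member,
the loop could be replaced by a call to memchr.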
