Re: How make multifinished DFA for merged regexps?

Kaz Kylheku <493-878-3164@kylheku.com>
Sat, 21 Dec 2019 04:04:43 +0000 (UTC)

          From comp.compilers

Related articles
How make multifinished DFA for merged regexps? borucki.andrzej@gmail.com (Andy) (2019-12-19)
Re: How make multifinished DFA for merged regexps? 493-878-3164@kylheku.com (Kaz Kylheku) (2019-12-20)
How make multifinished DFA for merged regexps? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2019-12-20)
Re: How make multifinished DFA for merged regexps? borucki.andrzej@gmail.com (Andy) (2019-12-20)
Re: How make multifinished DFA for merged regexps? 493-878-3164@kylheku.com (Kaz Kylheku) (2019-12-21)
How make multifinished DFA for merged regexps? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2019-12-23)
Re: How make multifinished DFA for merged regexps? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2019-12-24)
Re: How make multifinished DFA for merged regexps? matt.timmermans@gmail.com (Matt Timmermans) (2019-12-23)
Re: How make multifinished DFA for merged regexps? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2019-12-24)
Re: How make multifinished DFA for merged regexps? rockbrentwood@gmail.com (2019-12-29)
| List of all articles for this month |

From: Kaz Kylheku <493-878-3164@kylheku.com>
Newsgroups: comp.compilers
Date: Sat, 21 Dec 2019 04:04:43 +0000 (UTC)
Organization: Aioe.org NNTP Server
References: 19-12-005 19-12-010
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="81140"; mail-complaints-to="abuse@iecc.com"
Keywords: DFA
Posted-Date: 20 Dec 2019 23:07:43 EST

On 2019-12-21, Andy <borucki.andrzej@gmail.com> wrote:
> Greedy algorithms match longest regexp. For example operators "+" and "++",
> int numbers "123" and float numbers "123.456e3".
> On '.' will finish state of number, but we will inside automata for float
> number. But can be errors: after '.' will 'a'. We must backtrack to last
> finished state?


Yes, in general, to recognize the longest prefix of the input which
matches a regex, we must "backtrack". But this is nothing expensive like
recursive backtracking in pattern matching or logic programming.


Basically when we hit the situation that the input is exhausted, or
there are no transitions possible on the next character, we must jump
back to the position where the automaton most recently found itself in
an acceptance state. That's it; there is no further history: just one
item to remember. Whenever the machine is in an acceptance state after
consuming a symbol, then it records the position, overwriting the
previously recorded position.


The token is then formed up to that acceptance position, and material
after that is pushed back into the input; it's likely the start of
another token.


--
TXR Programming Lanuage: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.