Re: How make multifinished DFA for merged regexps?

Kaz Kylheku <493-878-3164@kylheku.com>
Fri, 20 Dec 2019 05:54:24 +0000 (UTC)

          From comp.compilers

Related articles
How make multifinished DFA for merged regexps? borucki.andrzej@gmail.com (Andy) (2019-12-19)
Re: How make multifinished DFA for merged regexps? 493-878-3164@kylheku.com (Kaz Kylheku) (2019-12-20)
How make multifinished DFA for merged regexps? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2019-12-20)
Re: How make multifinished DFA for merged regexps? borucki.andrzej@gmail.com (Andy) (2019-12-20)
Re: How make multifinished DFA for merged regexps? 493-878-3164@kylheku.com (Kaz Kylheku) (2019-12-21)
How make multifinished DFA for merged regexps? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2019-12-23)
Re: How make multifinished DFA for merged regexps? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2019-12-24)
Re: How make multifinished DFA for merged regexps? matt.timmermans@gmail.com (Matt Timmermans) (2019-12-23)
[2 later articles]
| List of all articles for this month |

From: Kaz Kylheku <493-878-3164@kylheku.com>
Newsgroups: comp.compilers
Date: Fri, 20 Dec 2019 05:54:24 +0000 (UTC)
Organization: Aioe.org NNTP Server
References: 19-12-005
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="51125"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, DFA
Posted-Date: 20 Dec 2019 10:53:28 EST

On 2019-12-20, Andy <borucki.andrzej@gmail.com> wrote:
> I can create DFA direct from regexp.
> But for language lexer I must have DFA for couple regexp.


These are just combined as if with |, in some way that retains the
association between acceptance states an rules.


If we go NFA to DFA, what we can do is compile the individual regexes to
NFA, and then combine them into a big NFA. Each of the little baby
NFA's has its own NFA acceptance state(s), We associate those state with the
original regex rule.


When we go to DFA, we are now looking at subsets of NFA states. A state
of the DFA is a set of the states of the NFA.


Thus a DFA state can potentially contain one or more NFA acceptance
states. Since those NFA states are each associated with a lexer rules,
that means that, transitively, a DFA acceptance state is associated with
one or more rules.


If a DFA acceptance state is associated with more than one rule, we can
trim away all but the topmost rule. When that state occurs (that token
is matched) we trigger the first rule, ignoring the others.
That's how it is in Lex: if two or more rules match the same input, the
first rule appearing in the Lex program dominates.


Furthermore, after trimming away the superfluous rules, we can identify
all rules that are then no longer attached to the DFA and warn the
programmer about them.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.