Re: Regular expressions in lexing and parsing

Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Tue, 21 May 2019 07:14:14 +0200

          From comp.compilers


From: Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Newsgroups: comp.compilers
Date: Tue, 21 May 2019 07:14:14 +0200
Organization: Compilers Central
References: 19-05-092
Keywords: lex
Posted-Date: 21 May 2019 22:46:33 EDT

On 17.05.2019 at 15:18, Ed Davis wrote:


> So don't write lexers and parsers with regular expressions as the starting
> point. Your code will be faster, cleaner, and much easier to understand and to
> maintain.


I think that regular expressions are primarily useful in *formal* token
grammars. Unlike for parsers, I find that *complete* formal grammars for
the tokens are usually missing: grammars that cover whitespace, comments,
strings and other special language elements, and how these fit together.


Here regular expressions are easier to understand and maintain than DFA
or NFA code for all those token types. The equivalent lexer
*implementation* is, in the end, up to the compiler writer.
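
As a minimal sketch of what such a complete token grammar might look like, the
table below lists every token class of a small, made-up C-like language as a
regular expression, including whitespace and comments. All rule names and
patterns are assumptions chosen for illustration, not taken from any language
standard. The driver loop is only one possible implementation of the grammar.
Note that Python's `re` tries alternatives in order rather than picking the
longest match, so the order of the rules matters here.

```python
import re

# Hypothetical token grammar for a small C-like language. Every rule name
# and pattern is an illustrative assumption, not part of any standard.
# Order matters: Python's alternation takes the first rule that matches.
TOKEN_GRAMMAR = [
    ("WHITESPACE", r"[ \t\r\n]+"),
    ("COMMENT",    r"/\*(?:[^*]|\*(?!/))*\*/|//[^\n]*"),
    ("STRING",     r'"(?:[^"\\\n]|\\.)*"'),
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("IDENT",      r"[A-Za-z_]\w*"),
    ("OP",         r"[-+*/=<>!]=?|[(){};,]"),
]

# Combine the rules into one pattern with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_GRAMMAR))

# Token classes that are recognized but not passed on to the parser.
SKIP = {"WHITESPACE", "COMMENT"}


def tokenize(text):
    """Return a list of (token_class, lexeme) pairs for text."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup not in SKIP:
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

The point of the table is that it can be read, reviewed and changed on its
own. Whether it is then compiled into a DFA, interpreted as above, or
translated by hand into a switch-based scanner is a separate decision.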


OTOH, I wonder what a formal (regex?) FORTRAN lexer grammar could look like.


DoDi
[I've written Fortran 77 parsers. The language was quite context sensitive,
particularly when deciding whether a statement was an assignment or something else, e.g.


  FORMAT(I4) = 42
  DO 10 I = 1.23 (oops)


Modern Fortran no longer ignores spaces, which makes lexing a lot easier. -John]
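
The two examples in the note above can be told apart with a pre-scan of the
whole statement before any tokens are produced. The following is a rough
sketch of that idea, not the moderator's actual parser. The function names are
hypothetical, and it assumes a statement with no character or Hollerith
constants and no optional comma after the DO label. It only separates DO
statements, FORMAT statements and assignments; anything else with a top-level
`=` (a logical IF, for instance) would need further checks.

```python
import re

# "DO", a statement label, a variable name, then "=", with blanks already removed.
DO_HEAD = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=")


def squeeze(stmt):
    """Drop all blanks, as Fortran 77 does outside character constants.

    Assumption: the statement contains no character or Hollerith constants.
    """
    return "".join(stmt.split()).upper()


def find_top_level(s, ch, start=0):
    """Index of the first ch outside parentheses at or after start, or -1."""
    depth = 0
    for i in range(start, len(s)):
        c = s[i]
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif c == ch and depth == 0:
            return i
    return -1


def classify(stmt):
    """Classify a statement as 'DO', 'ASSIGNMENT', 'FORMAT' or 'OTHER'."""
    s = squeeze(stmt)
    eq = find_top_level(s, "=")
    if eq < 0:
        # No "=" outside parentheses, so this cannot be an assignment.
        return "FORMAT" if s.startswith("FORMAT(") else "OTHER"
    m = DO_HEAD.match(s)
    # A DO loop needs a comma outside parentheses after the "=".
    if m and m.end() == eq + 1 and find_top_level(s, ",", eq + 1) >= 0:
        return "DO"
    return "ASSIGNMENT"


print(classify("FORMAT(I4) = 42"))  # ASSIGNMENT (to an array element named FORMAT)
print(classify("DO 10 I = 1.23"))   # ASSIGNMENT (to a variable named DO10I)
print(classify("DO 10 I = 1,23"))   # DO
```

The classification depends on a comma that may appear arbitrarily far to the
right of the `DO`, which is why a plain left-to-right regular-expression lexer
does not fit this language well.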


