Re: Languages that are hard to parse

glen herrmannsfeldt <gah@ugcs.caltech.edu>
20 May 2005 16:04:11 -0400

          From comp.compilers


From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Newsgroups: comp.compilers
Date: 20 May 2005 16:04:11 -0400
Organization: Compilers Central
References: 05-05-119 05-05-125 05-05-147 05-05-155 05-05-166 05-05-175
Keywords: parse, design, comment
Posted-Date: 20 May 2005 16:04:11 EDT

Peter Flass wrote:


(snip)
(snip regarding how easy Fortran and PL/I are to parse.)


> Not really as much of a problem as it's said to be. Naturally a
> top-down or recursive-descent parser is impossible, but otherwise it's
> quite clean.


Well, it seems to me that these days the question is how easy a
language is to parse using lex/yacc. As I understand it, neither
Fortran nor PL/I is easy to do with lex/yacc.


>>[Fortran is actually pretty easy to parse. It's the lexer that's
>>hard since spaces don't matter (or didn't until recent versions of
>>Fortran.) I gather that PL/I isn't all that hard to parse once you
>>do some hackery (not unlike Fortran's) to decide whether a statement
>>is an assignment or starts with a keyword. It's just bulky. -John]
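
A minimal sketch of the kind of hackery mentioned above, for fixed-form
Fortran where blanks are insignificant. It is not taken from any
particular compiler. The function names (strip_blanks, classify) and the
three result labels are assumptions made for illustration. The idea is to
delete blanks outside character constants, then look for an "=" outside
parentheses: with no comma after it the statement is an assignment, and
with a comma it can only be a DO loop. Hollerith constants and substring
references are deliberately not handled.

```python
def strip_blanks(stmt: str) -> str:
    """Drop blanks outside quoted character constants (Hollerith is ignored)."""
    out = []
    in_quote = False
    for ch in stmt:
        if ch == "'":
            in_quote = not in_quote
        if ch != " " or in_quote:
            out.append(ch)
    return "".join(out)


def classify(stmt: str) -> str:
    """Return 'assignment', 'do-loop' or 'keyword' for one statement."""
    s = strip_blanks(stmt).upper()
    depth = 0            # current parenthesis nesting
    in_quote = False
    equals_at = -1       # index of the first '=' outside parentheses
    comma_after = False  # comma outside parentheses after that '='
    first_close = -1     # index where the first parenthesised group closes
    for i, ch in enumerate(s):
        if ch == "'":
            in_quote = not in_quote
        elif in_quote:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0 and first_close < 0:
                first_close = i
        elif depth == 0 and ch == "=" and equals_at < 0:
            equals_at = i
        elif depth == 0 and ch == "," and equals_at >= 0:
            comma_after = True
    if equals_at < 0:
        return "keyword"
    # IF(cond)stmt: the '=' belongs to the embedded statement, unless it
    # directly follows the closing parenthesis (assignment to an array IF).
    if s.startswith("IF(") and first_close >= 0 and equals_at != first_close + 1:
        return "keyword"
    if comma_after:
        return "do-loop" if s.startswith("DO") else "keyword"
    return "assignment"


if __name__ == "__main__":
    for line in ("DO 10 I = 1,10", "DO 10 I = 1.10", "IF (X .GT. 0) Y = 1"):
        print(f"{line!r:28} {classify(line)}")
```

The first two sample lines differ only in a comma versus a period, which
is why the decision cannot be made from the leading characters alone:
the second one assigns 1.10 to a variable named DO10I.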


I believe there is a group working on a GNU, or at least GNU-licensed,
PL/I compiler. For some time they were writing and debugging just the
parser, and asking for sample programs to test it on.


It is hard to remember today how small computers used to be. The
original PL/I compiler was designed to run in 44K, though not to run
fast in a region that small. With some space set aside for I/O
buffers, that doesn't leave much room.


There are some people now trying to get gcc to run on S/370, and it
seems that there are problems getting it to fit into 12M or so, the
largest region they can get under MVS.


-- glen


[I've written a Fortran 77 compiler front end that ran on PDP-11 Unix
with 64K instructions and data, using the C compiler's back end and
assembler. As I recall, it wasn't anywhere close to the maximum size.
The lexer was hand-crafted, but the parser was ordinary yacc. PL/I is
easy to tokenize and straightforward to parse once you deal with the
issue of what's a reserved word. That requires some lookahead, but
nothing too bad. Re GCC's size, that's not the parser, that's their
deliberate design decision that memory is free so they slurp
everything into core and build large data structures all the
time. -John]

