recursive-descent error recovery

cullvax!drw@EDDIE.MIT.EDU (Dale Worley)
17 Aug 87 15:45:35 GMT

          From comp.compilers

Related articles
recursive-descent error recovery decvax!utzoo!henry (1987-08-13)
Re: recursive-descent error recovery chuck@amdahl.amdahl.com (1987-08-15)
Re: recursive-descent error recovery decvax!utzoo!henry (1987-08-17)
Re: recursive-descent error recovery harvard!seismo!cs.rochester.edu!scott (1987-08-16)
recursive-descent error recovery cullvax!drw@EDDIE.MIT.EDU (1987-08-17)
Re: recursive-descent error recovery decvax!utzoo!henry (1987-08-22)
| List of all articles for this month |

From: cullvax!drw@EDDIE.MIT.EDU (Dale Worley)
Newsgroups: comp.compilers
Date: 17 Aug 87 15:45:35 GMT
Organization: Cullinet Software, Westwood, MA, USA

OK, he wants flames...


chuck@amdahl.amdahl.com (Charles Simmons) writes:
> A while ago I suggested that one cannot write a reasonable
> compiler using YACC. ... I claim it is impossible to write a reasonable
> compiler using YACC. ... ... I further claim it is impossible to produce
> reasonable error messages using YACC.


I've been working on a production compiler that uses Yacc for parsing,
and I've given a bit of thought to this. I claim that it is indeed
possible to produce good syntax error messages. The strategy is to
print out several things:


1. a message that a syntax error has been found
2. a pointer to the token which caused the parser to barf
(This is extremely important -- in practice this pointer is
almost all a programmer needs. Compare this to the usual
style: "line 37: invalid declaraction" when line 37 contains
a nasty mess of structs and unions.)
3. a description of the token that was found
(In case it tokenizes differently than the programmer expected.)
4. a listing of the tokens that are valid in this position
(This gets long in some cases (expression operand expected,
operator expected, and statement verb expected), so some
special case code is needed to condense these forms.)
This functions as a quick refresher of the grammar for the
programmer. This isn't so important for languages like C with
a small grammar, but its very useful for languages like ours that
have over 50 different statements. (uck!)
5. a summary of where the parse stands now:
statement-list ; l-value = expression + (


All of this stuff is immediately accessible from Yacc's tables.


I'll admit that all of this (except 5) can be done in recursive
descent, but usually people who code recursive descent don't bother.
For example, in one popular C compiler, the construction


struct { int a, reel /* <- note misspelling */ b } ;


produces the message "closing brace expected", because when the parser
sees 'reel' (which the lexer returns as 'identifier' rather than 'type
name'), it says "Aha! He forgot to close off his structure
declaraction." I'd much rather the compiler just tell me that I made
an error, and not try to guess what I meant.


As far as size and speed, I doubt that a recursive-descent parser for
a complex grammar (say, 750 Yacc states) would be anywhere near as
small, because the number of read-and-test code segments in an RD
parser has to be about as large as the number of Yacc states. There
are about 4 int's in Yacc's tables for each state, which is a lot more
compact than the code you're going to generate to read-token-and-test.


As far as needing an external stack for looping constructs, give me a
break! Yacc maintains a stack, use it!


/* generating the code for the expressions in the order they are
  * written requires more goto's than necessary, but otherwise I have
  * to write code to read and store them and then emit them later */
for_statement : FOR '(' expression ';'
$remember_loc expression $goto_on_false $goto ';'
$remember_loc expression $goto ')'
$remember_loc statement $goto
{
/* if test is false, exit for */
backpatch($7, current_object_location);
/* if test is true, go to body */
backpatch($8, $14);
/* after step expression, go to test */
backpatch($12, $5);
/* after body, go to step expression */
backpatch($16, $10);
} ;
$remember_loc : { $$ = current_object_location } ;
$goto : { emit a goto statement, $$ = location where it's
destination location will be filled in later by backpatch() };
$goto_on_false : { emit a conditional goto, $$ = destination location } ;


--
Dale Worley Cullinet Software ARPA: cullvax!drw@eddie.mit.edu
UUCP: ...!seismo!harvard!mit-eddie!cullvax!drw
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.