|Object-oriented parsers email@example.com (1991-08-29)|
|Re: Object-oriented parsers firstname.lastname@example.org (1991-08-29)|
|Re: Object-oriented parsers email@example.com (1991-08-30)|
|Re: Object-oriented parsers firstname.lastname@example.org (1991-08-30)|
|Object-oriented parsers email@example.com (1991-09-02)|
|Re: Object-oriented parsers firstname.lastname@example.org (1991-09-03)|
|From:||email@example.com (Steve Boswell)|
|Date:||29 Aug 91 05:00:38 GMT|
This is being posted to comp.compilers (because it deals with parsing),
comp.object (because it deals with OOP), and comp.lang.eiffel (because it
shows a real practical use for a language extension I suggested).
>From the articles I've seen, it doesn't look like anyone really has a good
idea of what an object-oriented parser would be like. Let me suggest this
line of thinking.
Lexical/syntactic/semantic style parsing, while affording division of labor,
is really only suited to procedural programming. bison++ (and yacc++ too, I
assume) is not really object-oriented, since it uses a class for data hiding
and encapsulation only, rather than the more important dynamic binding
properties. A completely different strategy is needed. Here is an example
of a lexical class under my paradigm.
READ_ONLY WRITABLE FUNCTION PROCEDURE
\ / | /
\ / | /
V V V V
(VALUE could inherit from CONSTANT_BASE, and READ_ONLY could inherit from
EXPRESSION_BASE also. just to give you an idea of what a final inheritance
network might look like.) Thus, once an identifier is lexed, the symbol
table can be consulted and the proper descendent of IDENTIFER_BASE can be
created and attached to IDENTIFER (or returned by IDENTIFIER if IDENTIFIER is
a pure parsing class.) IDENTIFIER is allowed to know about all the
descendents of IDENTIFIER_BASE. Then dynamic binding on the resulting types
can be used to weed through the possibilities in whatever rule is being
parsed. Here is a (much simpler) example of a rule that would do this.
ASSIGNMENT IF SWITCH PROCEDURE_CALL
\ \ | /
\ \ | /
\ \ | /
V V V V
The parsing method of STATEMENT would have to tell whether an initial
IDENTIFIER starts an ASSIGNMENT or PROCEDURE_CALL. If it's a WRITABLE, then
ASSIGNMENT is parsed; and if it's a PROCEDURE, then PROCEDURE_CALL is parsed.
In CLOS (which looks like the only suitable language for this) it'd be
something like this. Assume that get-next-token is a function that returns
the next token and terminal is the base class of all terminals (lex tokens)
in the language.
(defmethod parse ((this statement))
(parse this (get-next-token)))
(defmethod parse ((this statement) (target writable))
...parse an assignment statement)
(defmethod parse ((this statement) (name procedure_call))
...parse a procedure call)
(defmethod parse ((this statement) (token if_token))
...parse an if statement)
(defmethod parse ((this statement) (token switch_token))
...parse a switch statement)
(defmethod parse ((this statement) (token terminal))
...signal a parse error)
So by dynamic binding, the correct parsing method would be called.
For those of you on comp.lang.eiffel, this sort of structure could be faked
together by putting all the parse methods in STATEMENT_BASE and having the
user program explicitly calculate the dynamic binding, but then why bother to
design your parser like this at all? The language extension I suggested
would lend itself to this quickly.
Return to the
Search the comp.compilers archives again.