17 Jul 2001 23:21:54 -0400

Related articles |
---|

Generalized operator-precedence parsing joachim_d@gmx.de (Joachim Durchholz) (2001-06-28) |

Generalized operator-precedence parsing joachim_d@gmx.de (Joachim Durchholz) (2001-07-06) |

Re: Generalized operator-precedence parsing gtoomey@usa.net (Gregory Toomey) (2001-07-17) |

Re: Generalized operator-precedence parsing David.Chase@naturalbridge.com (David Chase) (2001-07-17) |

Re: Generalized operator-precedence parsing dmitry@elros.cbb-automation.de (2001-07-17) |

Re: Generalized operator-precedence parsing peter@abbnm.com (2001-07-17) |

Re: Generalized operator-precedence parsing slimick@jcs.upb.pitt.edu (2001-07-18) |

Re: Generalized operator-precedence parsing mickunas@cs.uiuc.edu (Dennis Mickunas) (2001-07-23) |

From: | David Chase <David.Chase@naturalbridge.com> |

Newsgroups: | comp.compilers |

Date: | 17 Jul 2001 23:21:54 -0400 |

Organization: | Compilers Central |

References: | 01-07-053 |

Keywords: | parse |

Posted-Date: | 17 Jul 2001 23:21:54 EDT |

*> Does anybody know about generalizations of operator-precedence parsing?*

*> Any information (preferrably available on the Internet) would be*

*> appreciated.*

I saw your first post, meant to reply, but never quite got there. I'm

not 100% sure what you mean by "generalization". I thought your

syntax was a nice way to talk about prefix, postfix, infix, and

"surround-fix" expressions. I wrote an operator-precedence parser

(for FP84, in BCPL) that allowed the on-the-fly injection of new

operators, but your syntax for adding new operators is nicer.

I did find the (15 year old) source code listing for an OPP for

Fortran expressions. Sort of an interesting historical artifact,

though it isn't the finished product (in particular, it still has the

"all undeclared identifiers are undimensioned integers" stub in it).

Anyhow, I think your guesses about what to do with "surround-fix" are

roughly correct. The precedence of a closing parentheses is really

only interesting when there is an opening parentheses to match with

it, and at that point, it is as low as necessary to drive the enclosed

expressions to their evaluated form. There's obvious other

constraints that apply -- obviously, "[ ... )" (right here is where I

hit control-E to move to the end-of-line in Emacs, only I was

composing mail in Eudora. If I could disable that key, I would.) is

ill-formed, and should be rejected.

The generalization, something along the lines of

IF exp THEN exp ELSE exp

I think has the the THEN first playing the role of ")", but afterwards

the IF ... THEN acts as an opening parentheses to the trailing ELSE,

followed finally by the expression. If you let users define such

syntax, you have to consider whether you let them also say "IF exp

THEN exp ..." (you can imagine this in a language with "no value" as

an outcome) and then ensure that

IF exp THEN IF exp THEN exp ELSE exp

parses "properly", which (as I recall) is

(IF exp THEN (IF exp THEN exp ELSE exp ) )

but I think this would work that way if you simply followed the

treatment of "IF ... THEN" as an opening parentheses for a trailing

ELSE -- it would bind to the closer one.

(Yes, you could have a real party with an operator precedence parser.)

I think you also have to be a little careful. I am not a parsing

expert, but what I have heard is that OPPs can accept larger-than-

intended languages. I don't recall the example, and (given that

parsing atrocities that we've seen since I didn't learn enough about

parsing) it's not clear that anyone cares. I do have one opinion that

I think is worth listening to, which is that C (and hence, C++, and

Java) have too many prefix operators. In particular, casting should

not be a prefix operator, it should be some sort of a suffix operator,

so that you can read an expression from left to right and not worry

about how the parentheses group or what the relative precedence of

casting and other things is.

I worry a bit, also, that you might find yourself wanting to (ahem)

associate productions and what-not with your operator specifications,

so that you might say something like

. + . = plus(.,.)

so then you'll want to start identifying those non-terminals

<expr_1> + <expr_2> := plus(expr_1,expr_2)

and then you'll probably want to start attaching other information to

them, like whether they are expressions or statements, along the lines

of

WHILE <expr_1> DO <statement_1> OD := choose_your_semantic_poison

so that you can forbid (assuming you think this is a good

idea, which I don't always) things like

x := 1 + DO expression UNTIL done OD

(It would make more sense to forbid this if it is a zero-trip loop).

David Chase

--

David.Chase@NaturalBridge.com

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.