Re: Parsing infix notation

"Jonathan Eifrig" <eifrig@acm.org>
12 Sep 1997 21:14:09 -0400

From comp.compilers

Related articles
Parsing infix notation schairer@dfki.de (Axel Schairer) (1997-09-07)
*Re: Parsing infix notation eifrig@acm.org (Jonathan Eifrig)* (1997-09-12)**
Re: Parsing infix notation nr@viper.cs.virginia.edu (Norman Ramsey) (1997-09-12)

| List of all articles for this month |

From:	"Jonathan Eifrig" <eifrig@acm.org>
Newsgroups:	comp.lang.dylan,comp.lang.lisp,comp.compilers
Date:	12 Sep 1997 21:14:09 -0400
Organization:	PSINet
References:	<comp.lang.dylan.199708151441.PAA09067@gairsay.aiai.ed.ac.uk> <3082226718813462@naggum.no> <86lo1ecxxo.fsf@g.pet.cam.a <862035xpyn.fsf@g.pet.cam.ac.uk> <5unec0$j8j@tools.bbnplanet.com> 97-09-026
Keywords:	parse

Barry Margolin wrote:
> Gareth McCaughan <gjm11@dpmms.cam.ac.uk> wrote:
> > - infix languages are less easily parsed;
>
> This is an issue for quick and dirty code, but it's not an issue for
> implementors -- compiler technology to parse infix notation is quite
> mature.

Axel Schairer <schairer@dfki.de> wrote
> This might be true. It is true, of course, if you know all your infix
> operators when you build your parser/parse tables. But I do not know
> how to handle the situation where you
>
> - have user-defined infixes _and_
> - you want/need to use tools like bison/yacc/zebu ...

Standard ML (sort of) falls into this category: you have
user-definable "infix-able" identifiers with user-defined precedence
levels.

Actually, it's a bit worse than that, since function application
doesn't require parentheses around the argument. Thus,

ID1 ID2 ID3

can be parsed as either (ID1 ID2) ID3 (if ID2 is *not* an infix
operator) or ID2 (ID1,ID3) (if it is defined as a binary infix
operator).

The SML/NJ compiler handles this situation by not attempting to handle
the problem using a fixed set of parsing rules. Instead, sequences of
"ID-like" subexpressions (basically, expressions separated by
identifers) are glommed together into list, and fed into a separate
operator-precendence style parser that uses the current symbol table
to figure out which identifers are infix and at what precedence. This
scheme seems to work pretty well in practice.

Indeed, it seems the biggest problem with user-defined infix
identifers is coming up with some syntax for defining the parsing
behavior of the identifiers!

- Jonathan Eifrig (eifrig@acm.org)
--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Parsing infix notation

"Jonathan Eifrig" <eifrig@acm.org>12 Sep 1997 21:14:09 -0400

"Jonathan Eifrig" <eifrig@acm.org>
12 Sep 1997 21:14:09 -0400