Re: Arithmetic expressions w. recursive descent parser?

torbenm@app-2.diku.dk (=?iso-8859-1?q?Torben_=C6gidius_Mogensen?=)
28 Jan 2006 15:13:25 -0500

          From comp.compilers

Related articles
Arithmetic expressions w. recursive descent parser? bmildh@linuxmail.org (2006-01-26)
Re: Arithmetic expressions w. recursive descent parser? Juergen.Kahrs@vr-web.de (=?ISO-8859-1?Q?J=FCrgen_Kahrs?=) (2006-01-28)
Re: Arithmetic expressions w. recursive descent parser? torbenm@app-2.diku.dk (2006-01-28)
Re: Arithmetic expressions w. recursive descent parser? NOSPAMrszeflerNOSPAM@murator.com.pl (RobertSzefler) (2006-01-28)
| List of all articles for this month |

From: torbenm@app-2.diku.dk (=?iso-8859-1?q?Torben_=C6gidius_Mogensen?=)
Newsgroups: comp.compilers
Date: 28 Jan 2006 15:13:25 -0500
Organization: Department of Computer Science, University of Copenhagen
References: 06-01-078
Keywords: arithmetic, parse
Posted-Date: 28 Jan 2006 15:13:25 EST

bmildh@linuxmail.org writes:


> I'm trying to make a simple java program to parse arithmetic
> expressions. To keep it simple, only integers are used in the
> expression, together with * / + - ( ) as terminals.
> I want the parser to be a predictive recursive descent parser, with 1
> lookahead.
>
> The problem is that I cant get the implementation right with +,- in
> combination w. numbers starting w. + or -, eg. -45, +3, etc.
> I would appreciate any help, pointers, tips, on how to get the + and -
> operator correctly implemented with numbers that could start with + or
> - .
>
> Typical expressions that I want to parse:
> (12--21)
> 5+4-1-2
>
> A made a grammar but it was left-recursive, but I removed it and this
> is the resulting grammar:
>
> E -> TX | T
> X -> +TX | +T | -TX | -T
> T -> FY | F
> Y -> *FY | /FY | /F | *F
> F -> number
> F -> (E)


This doesn't seem like the correct grammar for what you describe.
Initially, we have an ambiguous grammar:


  E -> E+E | E-E | E*E | E/E | +E | -E | number | (E)


Next, we have to decide on the precedence of the operators. The usual
decision is:


  +, -: lowest priority, left associative when infix, highest priority
              when prefix.


  *, /: medium priority, left associative.


Note that + and - have different priorities when prefix or infix.


We now rewrite the grammar to get the precedences right:


  E -> E+T | E-T | T
  T -> T*F | T/F | F
  F -> +F | -F | number | (E)


and eliminate left-recursion:


  E -> TE'
  E' -> +TE' | -TE' | \epsilon
  T -> FT'
  T' -> *FT' | /FT' | \epsilon
  F -> +F | -F | number | (E)


Starting from here, you should have no problem writing a recursive
descent parser.


                Torben


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.