Re: Editing/storing syntax trees (Joseph H Allen)
Tue, 27 Jun 1995 14:53:59 GMT

          From comp.compilers

Related articles
[3 earlier articles]
Re: Editing/storing syntax trees (1995-06-23)
Re: Editing/storing syntax trees (1995-06-23)
Re: Editing/storing syntax trees (1995-06-23)
Re: Editing/storing syntax trees (Frode Odegard) (1995-06-24)
Re: Editing/storing syntax trees (1995-06-24)
Re: Editing/storing syntax trees (1995-06-24)
Re: Editing/storing syntax trees (1995-06-27)
Re: Editing/storing syntax trees (1995-06-27)
Re: Editing/storing syntax trees (Stefan Monnier) (1995-06-27)
Re: Editing/storing syntax trees (Stefan Monnier) (1995-06-27)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Joseph H Allen)
Keywords: syntax, design
Organization: The World Public Access UNIX, Brookline, MA
References: 95-04-013 95-06-025
Date: Tue, 27 Jun 1995 14:53:59 GMT
Status: RO

I once made a tiny computer algebra system which had an interesting equation
editor. As you typed in an equation, the editor would show both the fortran
formatted equation in ASCII and a possibly partially typesetted math
equivalent. So as you entered an equation you would might get:

fortran: 3*sqrt(_

math: ______
                    3 \ / ?

fortran: 3*sqrt(5/x)_
                                / 5
math: 3 \ / -----
                          \/ x

It would parse the entered fortran into a parse-tree and then "unparse" it
into math.

Once you had entered an equation, you could edit it in a number of ways.
You could do simple symbolic replacement, like replace each 'x' with
sin(5*y). Also, you could replace the entire equation, run the simplifier
on it or have it factored.

Neatest of all was being able to select part of the math equation with the
mouse. It worked like this- when you receive a mouse click event, the
editor would unparse the tree into math again. This involves a recursive
proceedure which preevaluates a bounding box for each screen element for
each sub-tree. When a bounding box for a particular subtree is found with
the mouse position in it, that subtree is repainted in a different color to
indicate that it is selected. Also, a pointer to the pointer in the
parse-tree which refered to the selected part is stored so that part of the
tree can be replaced.

Once you have a part of the equation selected, you could do any of the edit
operations on just that selected part.

Anyway, an editor for programming languages could be made using similar
principles. Your program could be displayed in fortran, lisp, forth,
mathematics notation and a graphical tree or even flow-chart all at the same
time. Whichever happens to be the most convenient form each part of your
program could be used the actual editing. Alternatively, typesetting
annotations could be stored along with the code so that only the most
natural format for each part of the program is displayed.

In article 95-06-025, Henry Baker <> wrote:
> (Preston Briggs) wrote:

>> > ... Wouldn't it be so much easier to store your source as a
>> > ... syntax-tree ?

>> >In theory I agree, ...

>> I disagree. ASCII source is actually a fine representation, small and
>> convenient. Syntax trees, on disk, are bulky and inconvenient.

>ASCII is a terrible representation, because it has to be reparsed on
>every reading.

This is not that big of a deal- modern parsers are very fast. Just look at
Turbo-Pascal. Also ASCII has the advantage of being completely free-form-
any editor will work on it. I know from the experience of editing
bazillions of proprietary database file formats that this is a desirable
property. Also what happens when your proprietary parse-tree source gets
slightly corrupted from a disk failure or system crash?

>> If you want general access to a syntax tree, say for use by several
>> different tools, write a single scanner-parser combination that builds
>> a tree form from the source form; but please don't multiply your editors.

>Yes, this is usually called 'Lisp'. :-)

I've found that indented ASCII to be a useful intermediate file format for
parse trees. It's much easier to comprehend than Lisp, Forth or functional
notation. For example, suppose you have:

      fact(x) {
          if(x==1) return 1;
          else return x*fact(x-1);

In Lisp this might be:
(defun fact (x) (ifelse (== x 1) (return 1) (return (* x (call fact (- x 1))))))

In indented ASCII it would look like:

defun fact
ifelse == x
return 1
return * x
call - x

The arguments or body of each list are indented, but all but the first
argument are on a different line. It is just as easy to parse as lisp or
forth, but I find it much easier to read and edit.
/* ( */ /* Joseph H. Allen */

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.