Re: Dragon Book question

rkrayhawk@aol.com (RKRayhawk)
23 Mar 2000 22:40:41 -0500

          From comp.compilers

Related articles
Dragon Book question lojedaortiz@interlink.com.ar (Nicolás) (2000-03-23)
Re: Dragon Book question rkrayhawk@aol.com (2000-03-23)
Re: Dragon Book question thp@roam-thp2.cs.ucr.edu (Tom Payne) (2000-03-23)
Re: Dragon Book question cgbatema@undergrad.math.uwaterloo.ca (2000-04-01)

From: rkrayhawk@aol.com (RKRayhawk)
Newsgroups: comp.compilers
Date: 23 Mar 2000 22:40:41 -0500
Organization: AOL http://www.aol.com
References: 00-03-095
Keywords: parse

Nicolás, lojedaortiz@interlink.com.ar
asked about
<<
... using syntax-directed diagrams
...
Can I assume that the actions can be "executed" in a postorder traversal
of the parse tree, instead of when I am actually parsing the source
files?
>>


Both approaches are valid. The parse is like shuffling a deck of cards: you
can take the cards one at a time as they come off the shuffle, or wait until
the whole deck is shuffled.


At least as far as the basic validity of the sequence of 'actions' you
'execute' in response to the input goes, in a simple translation-type
application you can apply your transforms as soon as you have gathered in
each unit of work to which a transform (an action) applies.


In this simple translation mode the trade-offs are these: if you store
everything first, you need a lot of internal storage to hold the entire set
of symbols before you start taking actions, whereas if you can respond to
some symbols and discard them early, you conserve resources. Also, with a
store-all-first approach you must regather the pieces needed for each
action, whereas the respond-now mode has all the pieces in hand the first
time you need them.


This becomes a little clearer if you use the tools (please take no offence
if you are more experienced than I assume). The tools have stack-like
functioning built into them: they scan and push things onto a working
stack. Each pop point presents you with the set of items you would want to
act upon right then, and you generally would not want to lose such an
opportunity.
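
To make that pop point concrete, here is a minimal sketch in C (my own
illustration, not taken from any particular tool): a tiny postfix evaluator
that acts the moment each operator's operands are popped, much as a
yacc-style reduce action would.

  /* Act at the pop point: evaluate "3 4 + 2 *" immediately,
     never storing more than the pending operands. */
  #include <stdio.h>
  #include <ctype.h>

  int main(void) {
      const char *input = "3 4 + 2 *";   /* hypothetical input: (3+4)*2 */
      int stack[32], top = 0;

      for (const char *p = input; *p; p++) {
          if (isdigit((unsigned char)*p)) {
              stack[top++] = *p - '0';      /* shift: push the operand   */
          } else if (*p == '+' || *p == '*') {
              int b = stack[--top];         /* pop point: all the pieces */
              int a = stack[--top];         /* are in hand right now     */
              stack[top++] = (*p == '+') ? a + b : a * b;   /* act now   */
          }
      }
      printf("%d\n", stack[0]);             /* prints 14 */
      return 0;
  }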


In advanced applications you cannot always just detect a unit of work and
'do it' right away. Advanced applications might be those requiring
optimizations across unit-of-work boundaries, so you have to hold onto
things for a while, rearrange sequences, or embellish your internally
stored structure, and only later take action.


Parser tools detect symbol sequences, usually in a manner that can cope with
recursive patterns in the input. The parser tools present you with
opportunities to take actions upon each detected pattern (at each level in the
hierarchy you depict with your grammar rules). This set of opportunities
reflects chances to exploit the syntax enforced in the parsed information
environment.


If your coding requirements do not exceed a set of syntax conventions, then
you can take all actions right then, when your tools detect the input
pattern, rule by rule. In this simple case you also have the option of
waiting until the end and taking a bunch of actions against a structure
recorded in intermediate form or in an AST; that would be functionally the
same, but you have to re-invent the wheel the parser tool spun for you as
you re-gather the pieces a bundle at a time.
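
For contrast, here is a minimal sketch of that wait-until-the-end option on
the same toy expression: build a little AST during the parse and take the
actions later, in a postorder traversal. The node and function names are
mine, purely for illustration.

  /* Store it all first: build the tree for (3+4)*2, then act later. */
  #include <stdio.h>
  #include <stdlib.h>

  typedef struct Node {
      char op;                   /* '+', '*', or 0 for a leaf  */
      int value;                 /* meaningful only for leaves */
      struct Node *left, *right;
  } Node;

  static Node *leaf(int v) {
      Node *n = calloc(1, sizeof *n);
      n->value = v;
      return n;
  }

  static Node *binop(char op, Node *l, Node *r) {
      Node *n = calloc(1, sizeof *n);
      n->op = op; n->left = l; n->right = r;
      return n;
  }

  /* The postorder traversal: children first, then the node's action,
     re-gathering the pieces the parser once had in hand for free. */
  static int eval(const Node *n) {
      if (n->op == 0) return n->value;
      int a = eval(n->left), b = eval(n->right);
      return n->op == '+' ? a + b : a * b;
  }

  int main(void) {
      Node *tree = binop('*', binop('+', leaf(3), leaf(4)), leaf(2));
      printf("%d\n", eval(tree));   /* prints 14: same answer, later */
      return 0;
  }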


If your coding requirements go beyond a set of concepts to be applied to
detected syntax, you need to hold onto your bundled units of work for a
while. Obviously you can wait until the end and scan the intermediate
representation on file or in an AST, or you _might_ be able to find levels
between the lowest unit of work and the whole file at which to apply your
requirements.


For example, let's just say that your lowest level was statements. If you
only had to find them and 'do' them, then you could, in fact, have actions
in the grammar rules that 'do it' immediately upon detecting a valid
statement.


If you were required to optimize, say (just for a simple example) by
elevating loop-invariant code, as in manipulating this original
      b = 0
      for 1 to 10
            a = 100
            b = b + a
            print b
      next


into something like
      b = 0
      a = 100   * this got elevated for efficiency
      for 1 to 10
            b = b + a
            print b
      next


Then you must retain the representation of the code for a duration that
exceeds the detection of something as simple as a statement. The 'analyzer'
will have to feed on gatherings at least as large as the for loop.
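
Here is a minimal sketch of such an analyzer feeding on the loop body
above. It uses a deliberately crude invariance test (every variable on a
statement's right-hand side must be assigned nowhere in the loop); a real
compiler would also check dominance, single assignment, and uses before
definition.

  /* Crude loop-invariance test over the example's loop body. */
  #include <stdio.h>
  #include <string.h>

  typedef struct {
      const char *text;   /* the statement as written         */
      char lhs;           /* variable assigned, or 0 for none */
      const char *uses;   /* variables read on the right side */
  } Stmt;

  int main(void) {
      /* The body of the "for 1 to 10" loop from the example. */
      Stmt body[] = {
          { "a = 100",   'a', ""   },
          { "b = b + a", 'b', "ba" },
          { "print b",    0,  "b"  },
      };
      int n = sizeof body / sizeof body[0];

      /* Pass 1: note every variable assigned inside the loop. */
      char assigned[16] = "";
      for (int i = 0; i < n; i++)
          if (body[i].lhs) {
              size_t len = strlen(assigned);
              assigned[len] = body[i].lhs;
              assigned[len + 1] = '\0';
          }

      /* Pass 2: an assignment is hoistable if none of the variables
         it reads is assigned anywhere in the loop. */
      for (int i = 0; i < n; i++) {
          int invariant = (body[i].lhs != 0);
          for (const char *u = body[i].uses; invariant && *u; u++)
              if (strchr(assigned, *u)) invariant = 0;
          printf("%-12s %s\n", body[i].text,
                 invariant ? "<- hoist above the loop" : "");
      }
      return 0;
  }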


It is a fact that optimization can be done with an AST; an AST is just a
storage method. It stores a representation of your symbols, their sequence,
and their relationships, usually in memory (but not necessarily so). The
term 'intermediate code', I would guess, almost always used to mean a form
written out to a file, a sequential one; nowadays intermediate code can
certainly be held in memory as well.


So an optimization process can be applied to a gathering of more than one
unit of work, which might be stored in core or on disk, in a sequential
form or in a hierarchical form (perhaps an AST).
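
For illustration, here is what the b = b + a statement from the example
might look like in that sequential flavor of intermediate code, as a flat
list of three-address 'quads' rather than a tree; the operation names are
made up for this sketch.

  /* b = b + a flattened into quads; such a list could just as
     easily live in a sequential disk file as in memory. */
  #include <stdio.h>

  typedef struct {
      const char *op;      /* operation name          */
      const char *a, *b;   /* source operands, if any */
      const char *dst;     /* destination             */
  } Quad;

  int main(void) {
      Quad code[] = {
          { "load",  "b",  "",   "t1" },
          { "load",  "a",  "",   "t2" },
          { "add",   "t1", "t2", "t3" },
          { "store", "t3", "",   "b"  },
      };
      for (int i = 0; i < 4; i++)
          printf("%-5s %-3s %-3s -> %s\n",
                 code[i].op, code[i].a, code[i].b, code[i].dst);
      return 0;
  }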


But the real point of drawing all of this out is that requirements that go
beyond mere syntax detection do not necessarily lead to retaining the whole
program. You just need to retain the portion that you will analyze (for
example, for the purpose of optimizing).


You could hold onto for loops and while loops, but once they are optimized
you could emit the 'final' code and release the resources used to represent
the collections of units of work.


So the answer to your question has three parts. Simple requirements can
lead to quick release of the internal representation of each unit of work.
Mild requirements just beyond syntax detection can lead to retaining
considerable units of work, but releases can still occur once you hit the
boundary of the analysis working set.


In either of these first two cases, the designer would have the option,
although not the requirement, of retaining the entire program in storage
(memory, or an intermediate image on disk).


The third situation would be a set of requirements that has the entire
program as its scope. For example, let's say you had some kind of file that
specified, for http handling, that whenever a *.jpg or *.gif graphics file
was to be transmitted it had to be compressed on the way out and
decompressed on the way back in. And let's say this would involve an
internal action to bring in the necessary routines, but they are bloated,
and you only want the one you need, and none at all if none is needed.
Well, you may have to see the whole file to know what to do.


You cannot respond to a command like
    COMPRESSION = INOUT
until the whole file has been scanned.


So in this type of situation, you might be motivated to hold the entire
file. Yet most of us can walk and chew gum at the same time, so you can
find ways to hold onto just what you need, dispense with pieces as soon as
you see they are done with, and hold onto the rest until things are further
clarified.
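
Here is a minimal sketch of the hold-the-whole-file idea done as two passes
instead; the COMPRESSION = INOUT directive is from the example above, while
the file name and the line-by-line details are hypothetical.

  /* Pass 1 scans the whole file just for the directive, since it may
     legally appear anywhere; pass 2 then acts with full knowledge. */
  #include <stdio.h>
  #include <string.h>

  int main(void) {
      const char *path = "handlers.conf";   /* hypothetical input file */
      char line[256];
      int compress = 0;

      FILE *f = fopen(path, "r");
      if (!f) { perror(path); return 1; }

      /* Pass 1: look for the whole-file directive. */
      while (fgets(line, sizeof line, f))
          if (strstr(line, "COMPRESSION = INOUT"))
              compress = 1;

      /* Pass 2: process every line, bringing in the bloated
         compression routines only if the directive was seen. */
      rewind(f);
      while (fgets(line, sizeof line, f)) {
          /* ... emit handlers here ... */
      }
      fclose(f);

      printf("compression routines %s\n",
             compress ? "loaded" : "skipped");
      return 0;
  }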


So in addition to notions of AST and intermediate forms, each phase of
a compiler usually has lots of lists of things to resolve
'eventually', and notations that summarize the effect of having
processed things so far.


Generally, anything that relates to syntax can _possibly_ be done
immediately upon detection of the pattern in the parser generated by
popular tools - at rule reduction time. Things which reflect more
complex requirements, like semantic analysis of code flow or data
flow, _probably_ require you to hold on to a number of units of work
long enough to properly make the analysis.


Whenever you start holding onto things for a long time, you tend towards
intermediate forms other than ASTs. And in-depth analysis can require a
'postorder traversal' of the whole program stored in some form (if I
understand your phrase). But this is not necessarily the case. Each
requirement has a domain that starts at some point in the source code and
ends somewhere. If your requirements are modest enough, you can certainly
get away with doing the analysis when you find those boundaries (at loop
boundaries, or at function boundaries, or at paragraph boundaries) and
releasing the resources that record the intermediate image at that time.




Best Wishes,


Robert Rayhawk
RKRayhawk@aol.com




