Re: Question about GOTO syntax checking

rkrayhawk@aol.com (RKRayhawk)
14 Jul 1999 11:48:43 -0400

          From comp.compilers


From: rkrayhawk@aol.com (RKRayhawk)
Newsgroups: comp.compilers
Date: 14 Jul 1999 11:48:43 -0400
Organization: AOL http://www.aol.com
References: 99-07-053
Keywords: syntax, errors

"Magnus Jansson" <magnus.jansson@mbox319.swipnet.se>
Date: 14 Jul 1999 02:08:38 -0400


posted a hypothetical piece of code
<<
program....
begin
....
goto l1;
if a>0 then begin
....
L1:
....
end;
...
>>
And then flatly stated
<<
This is incorrect, ...
>>


And the gist of the post is the question of how to detect it.


Well, ... the overall question needs to be addressed, but ... let me
delay a bit. The posted GOTO is not _necessarily_ wrong.


GOTOs are offered as a programming instrument to be used with
caution. It is up to the coder to take responsibility for the
consequences. My notion here is that GOTOs are demanded by
programmers. We could eliminate them, trying to get the world to
accept the notion that they are bad, or 'old stuff' or 'too
dangerous', insisting that folks use purely structural code
everywhere.


This would not be accepted. So you open the flood gate and
contemplate how to prevent the flood.


In a simple case, and I know that the code snippet was illustrative
and intended as brief, but anyway, in the simple case a GOTO into a
subjunct that had only the IF/BEGIN/END might not be harmful at all,
as long as the total code really represented what the programmer
wanted, ... so skip the if and dive into the middle, ... really, so
what?
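
For what it is worth, C itself takes the permissive route in this simple
case: a goto may target a label inside the body of an if. A minimal sketch
follows; the function name is hypothetical and the code only illustrates
the "dive into the middle" behaviour, it is not a recommendation.

```c
/* Jumps past the IF test straight into the middle of its body.
 * Returns a small trace value showing which statements ran. */
static int jump_into_if(int a)
{
    int steps = 0;

    goto inside;          /* the test of `a` below is never evaluated */
    if (a > 0) {
        steps += 10;      /* skipped by the jump */
inside:
        steps += 1;       /* runs whatever the value of `a` is */
    }
    return steps;
}
```

Calling jump_into_if with any argument, positive or not, returns 1: the
first statement of the body is skipped and the rest runs unconditionally.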


In the next order of complications, say you had an IF
  THEN BEGIN
  END
ELSE
  BEGIN
  END.


Now the leap into the THEN section has no completely clear
meaning. What happens at the first END, do we then go on to hit the
ELSE somehow, and what does the ELSE _mean_ when there has been no
proper entry thru the IF?


A language has to define it. I have seen compilers on PCs that allow
this and simply permit execution to drift right through the END/ELSE
barrier. Disconcerting at first, but I am not sure that there is any
universal rule that that is actually wrong. Such compilers also have a
tendency to allow random ELSE clauses that had no matching IF, a true
error it seems ... but COBOL has this kind of problem with EXIT, a do
nothing, that many programmers have found a way to drift through
... EXITs are not matched to THRU clause or anything. In COBOL it is
not an error to drift through an EXIT, it is defined as a do nothing.
You are supposed to drift through, if that is what you coded.
Although most folks feel that you should not code that. 'case' labels in
a C switch statement that lack a 'break' involve similar issues. These
things are actually language features.
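
The C case can be shown in a few lines. This is only an illustrative
sketch (the function name is made up): each case body that lacks a break
lets control drift into the next one.

```c
/* Counts how many case bodies execute when entering the switch at `start`. */
static int bodies_run(int start)
{
    int count = 0;

    switch (start) {
    case 1:
        count++;
        /* fall through */
    case 2:
        count++;
        /* fall through */
    case 3:
        count++;
        break;            /* only this body stops the drift */
    default:
        break;
    }
    return count;
}
```

Entering at case 1 runs three bodies, at case 2 two, at case 3 one, and any
other value runs none.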


With the question at hand, in the absence of a standard or a clear
definition of a language, an implementer could put a branch around the
ELSE to the final concluding END (which certainly could be a difficult
thing to scope out at code generation time for a seriously nested
IF/ELSE structure). This would make branches into the interior of
IF/THEN clauses more workable.
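
C is one language that does define this: ISO C says that if the first
substatement of an if is reached via a label, the else substatement is not
executed, which amounts to the branch around the ELSE described above. A
small sketch, with a hypothetical function name:

```c
/* Jumps into the THEN part of an IF/ELSE and reports which parts ran:
 * bit 0 is set by the THEN body, bit 1 by the ELSE body. */
static int jump_into_then(int cond)
{
    int trace = 0;

    goto in_then;         /* `cond` is never tested */
    if (cond) {
in_then:
        trace |= 1;       /* THEN body */
    } else {
        trace |= 2;       /* ELSE body: skipped after entry via the label */
    }
    return trace;
}
```

The result is 1 for any value of cond: the THEN body runs and control
leaves the whole IF/ELSE without touching the ELSE body.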


But if you insist that the branch into the THEN clause is not good,
then the several ways out are, 1) let the programmer do it, and have
them take responsibility for it.


Or, 2) (obtusely) outlaw the labels within the THEN and ELSE clauses.


Or 3) create a kind of scoping, that makes the labels interior to the
THEN and ELSE clauses invisible to the outer reference attempts (from
above or below). Here is an analogy by means of data items:


  int i; /* original i is visible */
  i=2;
  {int i; /* we have a new i */
  i=777;
  }
  i=i*i; /* i is equal to 4 now */


A potential need to do this kind of thing is readily apparent with
data items in modern languages that attempt to match the flexibility
of C. But do we see the relevance to labels so easily? Have a look
at the following ...


  goto l1;
l1:
  goto l1;
{
  goto l1;
l1:
  goto l1;
}
  goto l1;




The meaning of these has to be defined, not assumed. Take the exact
same topology, but make the deeper block subjunctive (that is, conditional):


  goto l1;
l1:
  goto l1;
if (you_follow_me_so_far_) then BEGIN
{
  goto l1;
l1:
  goto l1;
} END
  goto l1;


You have to define these things. Are they legal? What is the semantics?


So anyhow, if you are inclined to define them out rather than in,
then you either flag them as errors or warn about them. And so, your
question: how to detect them.


Or 4) (continuing the sequence from above) scope labels the same way you
scope data names:
    - recognize BEGIN and END block delimiters,
    - establish a unique identifier for the block,
    - establish a level for the block,
    - bring the label into scope during parsing at the point it occurs, and
      drop its scope at the end of the block or when it is superseded by a
      similar name (whichever comes first),
    - the occurrence of the label instantiates it as a label if it does not
      already exist due to a forward reference,
    - a forward reference to a label can instantiate the label if it is not
      already instantiated (forward references can also fail to ever find a
      label in their scope; that leaves an instantiated label that has no
      occurrence),
    - a reference to a label that is currently instantiated and has already
      occurred cannot be a forward reference.


So you get all that done and then just use the block numbering scheme or the
block level scheme to warn or error out references you find offensive.


BEGIN; /* block 1 level 1 */
  GOTO p1; /* okay if you permit forward refs */
  p1:
  GOTO p1; /* everyone loves it */
  GOTO p2; /* another level 1 reference ! */
  GOTO p3; /* uh-oh, level 2 reference */
    BEGIN; /* block 2 level 2 */
        p3:
    END;
    BEGIN; /* block 3 level 2 */
        p4:
        GOTO p1; /* seems evil, you decide. */
                 /* level 1 is visible in level 2 */
        p2: /* this is a distraction */
    END;
p2: /* the real p2 */
  GOTO p3; /* uh-oh again, level 2 reference */
END;


Because most languages allow forward references, you may not be able
to check out the level immediately upon encountering the
reference. For example, the reference to p2 above could produce any
number of results, depending upon your definitions. If a jump to the
interior p2 is illegal, then the user could not possibly mean that one,
so you could not warn upon encountering the forward reference, which
should quietly and happily be resolved upon parsing the subsequent p2
out at level 1. Under that same policy the 'forward reference to p3' is
not a forward reference to the interior p3; it is, in fact, an illegal
reference because it is not resolvable.
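
The bookkeeping can be sketched in C, assuming the parser has already
reduced the program to a flat stream of BEGIN/END/label/GOTO events. The
event encoding, the names (check_gotos, encloses, parent_of) and the policy
(a GOTO may only target a label in its own block or an enclosing one) are
all assumptions made for illustration. Resolution is deferred until the
whole stream has been read, so forward references need no special case.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BLOCKS 32   /* arbitrary limits; this sketch does no overflow checks */
#define MAX_SYMS   64

enum kind { EV_BEGIN, EV_END, EV_LABEL, EV_GOTO };
struct event { enum kind kind; const char *name; };  /* name is NULL for BEGIN/END */
struct sym   { const char *name; int block; };       /* a label or GOTO and its block id */

/* parent_of[b] is the id of the block enclosing block b; id 0 means "outside all blocks". */
static int parent_of[MAX_BLOCKS];

/* True if block `outer` is `inner` itself or one of its enclosing blocks. */
static int encloses(int outer, int inner)
{
    for (int b = inner; ; b = parent_of[b]) {
        if (b == outer) return 1;
        if (b == 0) return 0;
    }
}

/* Records every label and GOTO with the block it occurs in, then resolves
 * all GOTOs once the whole stream has been seen. result[i] receives the
 * block id of the label the i-th GOTO resolves to, or -1 if no visible
 * label exists. If several visible labels share a name, the last one
 * recorded wins; a real checker would pick the innermost or reject
 * duplicates. Returns the number of GOTOs. */
static int check_gotos(const struct event *ev, size_t n, int *result)
{
    struct sym labels[MAX_SYMS], gotos[MAX_SYMS];
    int stack[MAX_BLOCKS];
    int nlabels = 0, ngotos = 0, depth = 0, next_id = 1;

    for (size_t i = 0; i < n; i++) {
        int cur = depth ? stack[depth - 1] : 0;   /* innermost open block */
        switch (ev[i].kind) {
        case EV_BEGIN:
            parent_of[next_id] = cur;
            stack[depth++] = next_id++;
            break;
        case EV_END:
            depth--;
            break;
        case EV_LABEL:
            labels[nlabels++] = (struct sym){ ev[i].name, cur };
            break;
        case EV_GOTO:
            gotos[ngotos++] = (struct sym){ ev[i].name, cur };
            break;
        }
    }
    for (int g = 0; g < ngotos; g++) {
        result[g] = -1;
        for (int l = 0; l < nlabels; l++)
            if (strcmp(gotos[g].name, labels[l].name) == 0 &&
                encloses(labels[l].block, gotos[g].block))
                result[g] = labels[l].block;
    }
    return ngotos;
}
```

Fed the nine-label-and-GOTO example above as events (BEGIN, GOTO p1, label
p1, and so on), this policy resolves both level 1 references to p1, the
reference to p2, and the reference to p1 from inside block 3 to block 1,
while both references to p3 come back as -1, that is, unresolvable.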


So if you can map this kind of plan on normal code blocks, the
application of the ideas to the subjunctive clauses is just a special
case; for which you can apply similar rules or different rules. But
the notion of block IDs and block level numbers should help.


Generally you can adopt the policy that a label reference cannot
'jump' into an interior level block (or just can't 'jump' into an
interior level _subjunctive_ block).


In a language like C, curly brackets raise and lower block edges, but
so do certain keywords like 'if'. Generally, parentheses do not
establish a block edge, ... but ... the parens around the for(,,)
control material are certainly intriguing.


Yet if you undertake to outlaw references into subjunctive blocks, you
may have your work cut out for you:




/* we are at level 1 */
.....
if(...) THEN BEGIN /* we are at level 2 */
  { /* this is subjunctive */
  label1:
  x=y;
    /* however */
    { /* this is not subjunctive */ /* level 3 */
    z=0;
    label2: /* ahhhhhhhh ! */
    GOTO label1; /* first ref to label1 */
    } /* end interior non-subjunctive block */
  } END /* end IF */


GOTO label2;
GOTO label1; /* second ref to label1 */


Is the reference to label2 legal because its containing block is
itself not subjunctive? Is the first reference to label1 illegal
because its containing block is itself subjunctive (just as the
originally posted enquiry implies that the second reference to label1
is illegal)?


I think the real question is reference to points within the interior
of lower level blocks. You can probably manage that with an attribute
in the symbol table, and you probably already need to do that for
datanames. Although you can have different rules for dataname
references, and label references, the infrastructure is the same.
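
One possible policy, sketched in C: treat a jump as suspect if it enters
any subjunctive block that does not already contain the GOTO, no matter how
the label's own block is marked. The struct, its 'conditional' attribute
and the function names are hypothetical; a real symbol table would carry
this attribute alongside whatever it already keeps for data names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block attributes kept by the symbol table. */
struct blk {
    const struct blk *parent;   /* enclosing block, NULL for the outermost */
    bool conditional;           /* true for the body of an IF/THEN or ELSE */
};

/* True if `outer` is `inner` itself or one of its enclosing blocks. */
static bool blk_encloses(const struct blk *outer, const struct blk *inner)
{
    for (const struct blk *b = inner; b != NULL; b = b->parent)
        if (b == outer)
            return true;
    return false;
}

/* True if a GOTO located in block `from`, targeting a label in block
 * `target`, has to enter at least one conditional block that does not
 * already contain the GOTO. */
static bool enters_conditional(const struct blk *from, const struct blk *target)
{
    for (const struct blk *b = target;
         b != NULL && !blk_encloses(b, from);
         b = b->parent)
        if (b->conditional)
            return true;
    return false;
}
```

With three blocks modelled on the example (level 1, the IF body at level 2
marked conditional, and the plain block at level 3 inside it), this policy
flags GOTO label2 and the second GOTO label1 from level 1, and accepts the
first GOTO label1, which is issued from inside the IF body and so never
crosses its edge.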


Hope that helps,


Robert Rayhawk
RKRayhawk@aol.com

