Re: Pascal compiler and sets

Dave Gillespie <synaptx!thymus!daveg@uunet.UU.NET>
Thu, 10 Feb 1994 22:15:45 GMT

          From comp.compilers

Related articles
Pascal compiler and sets dallison@bfsec.bt.co.uk (1994-02-08)
Re: Pascal compiler and sets davisdm@widget.msfc.nasa.gov (1994-02-08)
Re: Pascal compiler and sets mauney@adm.csc.ncsu.edu (1994-02-08)
Re: Pascal compiler and sets samiam@netcom.com (1994-02-09)
Pascal compiler and sets ssimmons@convex.com (1994-02-09)
Re: Pascal compiler and sets stenuit@axp05.acset.be (1994-02-10)
Re: Pascal compiler and sets wjw@eb.ele.tue.nl (1994-02-10)
Re: Pascal compiler and sets synaptx!thymus!daveg@uunet.UU.NET (Dave Gillespie) (1994-02-10)
| List of all articles for this month |

Newsgroups: comp.compilers
From: Dave Gillespie <synaptx!thymus!daveg@uunet.UU.NET>
Keywords: Pascal
Organization: Compilers Central
References: 94-02-051 94-02-059
Date: Thu, 10 Feb 1994 22:15:45 GMT

Dave Allison asks about implementing sets in Pascal.


When I wrote a Pascal compiler a few years ago the problem of sets seemed
daunting and mysterious, but I found a good way to think about it: Pretend
you are implementing a "string" type, but with bits instead of characters
as the components. Many Pascal dialects have a dynamically-sized string
type:


var s1 : string[100];
s1 := 'howdy, world';
if s1 = 'foo' then ...


A string[100] is typically an array of 101 bytes where the first is a
length byte.


All the problems of set allocation come up with strings, but somehow
strings seem to be easier to visualize and think about.


I know at least one commercial compiler which implements sets as vectors
of words where the first word is a length; this is also how I implement
sets larger than 32 elements in p2c. All the various set operations, like
"setcpy", "setequal", "setunion", and so on, are written in terms of
these. The length word effectively gives the position of the last
non-zero word in the bit vector; it helps with the case where the set is
large but mostly empty. (For example, a "set of char" needs space for 256
bits, but the latter 128 bits will hardly ever be used in practice.)


If you want range checking, have your "setcpy" function take the maximum
allowable size of the destination as an extra argument.


Set operators like "s1 + s2" need to allocate temporaries, but for each
operator it's easy for the compiler to determine statically a bound on the
size of the temporary. (The larger of the sizes for "+", the smaller of
the sizes for "*", etc.) In p2c, I had my code generator create extra
temporaries and then optimize them away later; this works well for both
sets and strings:


str1 := str2 + str3
          => strcpy(str1, sprintf(TEMP, "%s%s", str2, str3))
          => sprintf(str1, "%s%s", str2, str3)


set1 := set2 + set3
          => setcpy(set1, setunion(TEMP, set2, set3))
          => setunion(set1, set2, set3)


where again you might want to pass the maximum size of the destination as
an extra argument to "setunion".


Set literals are not too much trickier to deal with than string literals.
The fact that "[]" is typeless is a bit of a nuisance, but then, no more
so than the "nil" pointer literal.


-- Dave
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.