Re: Test Generation Via Randomized Macro Expansion (Hermann Oliveira Rodrigues)
15 May 2000 23:51:46 -0400

          From comp.compilers

Related articles
Test Generation Via Randomized Macro Expansion (Ronald F. Guilmette) (2000-05-12)
Re: Test Generation Via Randomized Macro Expansion (2000-05-15)
| List of all articles for this month |

From: (Hermann Oliveira Rodrigues)
Newsgroups: comp.compilers
Date: 15 May 2000 23:51:46 -0400
Organization: POP-MG
References: 00-05-055
Keywords: macros, comment

Ronald F. Guilmette ( wrote:
: A few years ago, I implemented a specialized system (TGGS) whose purpose
: was to generate randomized but syntatically valid test cases for compilers
: of various languages.

: It occured to me the other day that this sort of generation of (randomized)
: text obeying various context-free syntax rules could be construed as just
: an exercise in randomized macro expansion.

: Given this realization, I set out to create an enhanced version of the
: GNU m4 macro processor which would have in-built capabilities for
: performing randomized macro expansion. My goal was to create a
: version of GNU m4 which would make it easy to use m4 to generate
: randomized texts obeying various specified context-free syntax rules.
: I believe that I succeded. The results of my little effort are
: described below.

Some time ago, an instructor gave us an interesting challenge:
Implement a toy compiler for a toy language (called PLM2) only using
the m4 macro processor. The toy language had common features like,
if-then-else-endif, while-do-whileend and attribution. The compiler
should be able to identify constants and variables and allocate the
required memory for it in the assembly output file. Variable did not
need to be declared and the compiler should identify the first time
that a variable appear in the program.

The compiler generates a text file consisting of an assembly
description for the target machine, so an assembler could process it
pretty easy.

A toy example program follows:

// Program in PLM2 to multiply positive integers. //

read x;
read y;

// Test for input validation.
test1 is x <= 0;
test2 is y <= 0;
if test1 or test2 then
// The default output.
print 0;
// Input is valid.

// Do the multiplication.
par1 is x;
par2 is y;
call multiply;
print result;

// Multiplication procedure definition.
procedure multiply as
result is 0;
while par2 > 0 do
result is result + par1;
par2 is par2 - 1;


The compiler output looks like this (the labels are in
portuguese, sorry):

SUB __var7
OUT __var0


x: DC 0
y: DC 0
result: DC 0
par1: DC 0
par2: DC 0


All the given requirements were filled except the nesting of
control structures due parsing restrictions. The statements are
recognized by m4 using regular expressions and replaced by a macro
that `knows' how to define that statement.

for example,

if x <= 2 then
print 3;
print 2;


_if_(x < 2, print 3; exit;, print 2)

But we know that regular expressions can't be used to parse nested

for example

if x <= 2 then
print 3;
if y > 4 then
print 1;
print 2;


if (x < 2 then print 3; else if y < 4, print 1;, print 2 endif)

This is not what we expected.

To keep the pattern matching under control, we add a `.' to
the structure termination and put that restriction in the pattern. So
the `expression', `then' and `else' contents can not include the `.'.
Unfortunately this killed any nested control structure detection hope.

I had a final idea to implement, in the m4, a new scanner that
could process the file, token by token, and put a suffix in the control
structures identifying their nesting level and which keywords could match
it. The keyword matching should be a `random' tag generated each time that
a new structure beginning was found.

The previous definitions should look as follow after the initial

if_0_234111233 x <= 2 then_0_234111233
print 3;
if_1_45664212 y > 4 then_1__45664212
print 1;
print 2;

This should give the m4 processor enough information about the
structures to avoid wrong matchings.

Unfortunately the m4 did not have a random number generator in
that time and our idea could not be implemented.

To finish the whole history, we lost the challenge for one
item. But the instructor understand that we found a technical
limitation and gave us the merit and the points anyway.

Now, using the random generator implemented to the m4, the
next crazy students that accept the instructor challenges can complete
the tasks better than we could.

At least, I hope. :)

        Hermann Oliveira Rodrigues -
                Computer Science Student at UFMG - Brazil
[In 1972 as an undergrad, I wrote a compiler from a subset of APL to Basic
in Trac, a macrogenerator similar to M4. It worked pretty well, tried to
be smart about not generating matrix temporary values. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.