Re: Wrestling with phase 1 of a C compiler

luser droog <luser.droog@gmail.com>
Thu, 15 Sep 2022 20:11:15 -0700 (PDT)

          From comp.compilers


From: luser droog <luser.droog@gmail.com>
Newsgroups: comp.compilers
Date: Thu, 15 Sep 2022 20:11:15 -0700 (PDT)
Organization: Compilers Central
References: 22-09-001 22-09-004 22-09-008
Keywords: parse, design
Posted-Date: 20 Sep 2022 11:17:38 EDT
In-Reply-To: 22-09-008

On Thursday, September 15, 2022 at 11:17:50 AM UTC-5, luser droog wrote:
> On Monday, September 12, 2022 at 2:46:47 PM UTC-5, christoph...@compiler-resources.com wrote:


> >And, your efforts to put the
> > state behind pointers, while necessary only get you part of the way there.
> That's also true. For the present case, I'll also need to dynamically allocate
> integer objects and keep them in the environment for the calling
> function which is one of these closures, a suspension function that converts
> the first element of a stream and returns a list with a suspension in the cdr.
>
> That's the only way I'll get row and column counters to exist on a per-file
> basis.


So here's the rest of it. This ought to do the whole of phase 1 of the C compilation
process while also tagging each byte with its row and column numbers.
It holds the state in the closure's local environment,
although it has to create a new environment for each iteration because I don't
have a function for updating definitions (it's supposed to be "functional" after all,
as far as is practical).
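
To make the environment-per-iteration idea concrete, here is a minimal standalone sketch using plain structs instead of the library's objects. The names pos_env, cell, make_env, and force are made up for this illustration and are not part of the code in this post. The sketch never frees anything, which is the job a GC would take over.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not the types used in the post. */
typedef struct {
    int row, col;          /* counters as of the previous character */
    const char *input;     /* remaining unread input */
} pos_env;

typedef struct cell {
    int ch, row, col;      /* head: one character plus its position */
    pos_env *rest;         /* suspension: state needed to produce the tail */
} cell;

/* Build an environment; a per-file stream starts with row 0, col 0. */
pos_env *make_env(int row, int col, const char *input) {
    pos_env *e = malloc(sizeof *e);
    assert(e != NULL);
    e->row = row;
    e->col = col;
    e->input = input;
    return e;
}

/* Force a suspension: produce one cell plus a NEW environment for the
   tail. The old environment is not modified. Returns NULL at end of input. */
cell *force(const pos_env *e) {
    if (*e->input == '\0') return NULL;
    cell *c = malloc(sizeof *c);
    assert(c != NULL);
    c->ch = (unsigned char)*e->input;
    if (c->ch == '\n') { c->row = e->row + 1; c->col = 0; }
    else               { c->row = e->row;     c->col = e->col + 1; }
    c->rest = make_env(c->row, c->col, e->input + 1);
    return c;
}
```

Forcing make_env(0, 0, "a\nb") yields 'a' at row 0, column 1; forcing each cell's rest in turn yields '\n' at row 1, column 0, and then 'b' at row 1, column 1. Two files get independent counters simply because each starts from its own environment.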


(The header file for the code below defines the names POS_ROW, POS_COL, and POS_INPUT in an enum.)


static fSuspension force_chars_with_positions;
static list position( object item, int *row, int *col );
static parser position_grammar( void );
static fOperator new_line;




list
chars_with_positions( list input ){
    return Suspension( env( NIL_, 3,
                            Symbol(POS_ROW), Int( 0 ),
                            Symbol(POS_COL), Int( 0 ),
                            Symbol(POS_INPUT), input ),
                       force_chars_with_positions );
}


list
force_chars_with_positions( list ev ){
    list input = assoc_symbol( POS_INPUT, ev );
    integer row = assoc_symbol( POS_ROW, ev );
    integer col = assoc_symbol( POS_COL, ev );


    static parser position_parser;
    if( ! position_parser ) position_parser = position_grammar();
    object result = parse( position_parser, input );
    if( not_ok( result ) ) return rest( rest( result ) );


    object payload = rest( result );
    list pos = position( first( payload ), &row->Int.i, &col->Int.i );
    return cons( pos,
                 Suspension( env( NIL_, 3,
                                  Symbol(POS_ROW), row,
                                  Symbol(POS_COL), col,
                                  Symbol(POS_INPUT), rest( payload ) ),
                             force_chars_with_positions ) );
}


static list
position( object item, int *row, int *col ){
    if( valid( eq_int( '\n', item ) ) )
        return cons( item, cons( Int( ++ *row ), Int( *col = 0 ) ) );
    else
        return cons( item, cons( Int( *row ), Int( ++ *col ) ) );
}


static parser
position_grammar( void ){
    return either( bind( ANY( str("\r\n"),
                              chr('\r'),
                              chr('\n') ),
                         Operator( NIL_, new_line ) ),
                   item() );
}


static object
new_line( list env, object input ){
    return Int('\n');
}


I think it's pretty nice and readable, while hiding some of the magic.
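
For comparison, here is roughly what the newline handling in position_grammar amounts to when written without the parser combinators. This is only a sketch: it assumes the alternatives are tried in the order written, so "\r\n" wins over a lone '\r', and the names next_item and normalize_newlines are hypothetical, not functions from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read one logical character from s.
   Stores it in *out and returns the number of bytes consumed
   (0 at end of string). "\r\n", a lone '\r', and '\n' all yield '\n'. */
size_t next_item(const char *s, int *out) {
    assert(s != NULL && out != NULL);
    if (s[0] == '\0') return 0;
    if (s[0] == '\r') {
        *out = '\n';
        return s[1] == '\n' ? 2 : 1;   /* prefer the longer match */
    }
    *out = (unsigned char)s[0];        /* '\n' and every other byte */
    return 1;
}

/* Copy src to dst with all line endings normalized to '\n'.
   dst must have room for strlen(src) + 1 bytes.
   Returns the number of characters written, not counting the NUL. */
size_t normalize_newlines(const char *src, char *dst) {
    size_t n = 0, used;
    int ch;
    while ((used = next_item(src, &ch)) != 0) {
        dst[n++] = (char)ch;
        src += used;
    }
    dst[n] = '\0';
    return n;
}
```

With this, the input "a\r\nb\rc\n" comes out as "a\nb\nc\n", so the position counter only ever has to recognize '\n'.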


One big unsolved issue is how to use a GC in C without access to the
stack. My only solution is to call the GC only from the top level, or
else to carefully cultivate the root set and call it from near the
top level, where the root set is easy to manage. But as this is off topic
here, I'd invite any thoughts about user-space GC over in comp.lang.c.
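
As one possible illustration of "cultivating the root set" by hand, here is a minimal mark-and-sweep sketch in which the program registers the addresses of the local variables that hold live objects. Everything in it (obj, push_root, pop_roots, cons, collect, the fixed limit of 64 roots) is an assumption made for the example and has nothing to do with the library used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object: a cons cell with a mark bit, linked into a heap list. */
typedef struct obj {
    struct obj *car, *cdr;   /* children; NULL means no child */
    struct obj *next;        /* chain of all allocations, used by the sweep */
    int marked;
} obj;

static obj *heap = NULL;     /* every object allocated and not yet freed */
static obj **roots[64];      /* addresses of variables that hold live objects */
static size_t nroots = 0;

/* Register and unregister roots explicitly, in stack order. */
void push_root(obj **slot) { assert(nroots < 64); roots[nroots++] = slot; }
void pop_roots(size_t n)   { assert(n <= nroots); nroots -= n; }

obj *cons(obj *car, obj *cdr) {
    obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->car = car;
    o->cdr = cdr;
    o->marked = 0;
    o->next = heap;
    heap = o;
    return o;
}

/* Mark everything reachable from o: recurse on car, iterate on cdr. */
static void mark(obj *o) {
    while (o != NULL && !o->marked) {
        o->marked = 1;
        mark(o->car);
        o = o->cdr;
    }
}

/* Mark from the registered roots, then free whatever was not reached.
   Returns the number of objects freed. */
size_t collect(void) {
    size_t freed = 0;
    for (size_t i = 0; i < nroots; i++) mark(*roots[i]);
    obj **link = &heap;
    while (*link != NULL) {
        obj *o = *link;
        if (o->marked) { o->marked = 0; link = &o->next; }
        else           { *link = o->next; free(o); freed++; }
    }
    return freed;
}
```

The cost of this approach is visible in the API: any object that is not reachable from a registered root when collect runs is freed, so calling it deep inside nested functions requires every intermediate local to be registered. Calling it only near the top level keeps that bookkeeping small.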

