ML-style pattern matching in C-like languages

Russ Cox <rsc@swtch.com>
15 Dec 2005 02:21:01 -0500

          From comp.compilers

Related articles
ML-style pattern matching in C-like languages rsc@swtch.com (Russ Cox) (2005-12-15)
Re: ML-style pattern matching in C-like languages just-for-news-frido@q-software-solutions.de (Friedrich Dominicus) (2005-12-15)
Re: ML-style pattern matching in C-like languages gtg983q@mail.gatech.edu (Ben Chambers) (2005-12-15)
Re: ML-style pattern matching in C-like languages torbenm@app-2.diku.dk (2005-12-15)
Re: ML-style pattern matching in C-like languages RLake@oxfam.org.uk (2005-12-15)
Re: ML-style pattern matching in C-like languages rvclayton@acm.org (2005-12-15)
Re: ML-style pattern matching in C-like languages haberg@math.su.se (2005-12-19)
[4 later articles]
| List of all articles for this month |

From: Russ Cox <rsc@swtch.com>
Newsgroups: comp.compilers
Date: 15 Dec 2005 02:21:01 -0500
Organization: Compilers Central
Keywords: question, design
Posted-Date: 15 Dec 2005 02:21:01 EST

Have there been any proposals or prototypes built
for introducing some form of ML-style pattern matching
in C-like languages?


I am working on a C compiler written in a malleable
dialect of C and have no qualms about adding new
syntax to the language to make writing the compiler
easier. I am toying with the idea of adding some form
of pattern-matching syntax but am not sure exactly
what it should look like. If these are previously-charted
waters, I figured I'd look around for maps.


One issue is that C has no explicit support for
discriminated unions, but I am willing to assume that there
are hints of some form to tell the compiler how to interpret
the patterns w.r.t. the structure elements.


Another issue is that my structures have much more
information in them than I care to match on. For example,
the generic abstract syntax node might look something like:


        struct Node
        {
                int op; /* OADD, OSUB, OMUL, ONEG, etc. */
                Node *left; /* left child in a.s. tree */
                Node *right; /* right child in a.s. tree */
                Line line; /* source line information */
                Type *type; /* C type assigned to expression */
        };


If I want to pattern match on this, sometimes I might care
about the type and sometimes not. I never care about the
source line number. Sometimes I don't even care about the
right child. In the real struct Node there are many more fields,
any of which I only rarely care about.


One might envision a syntax like:


        if(n ~ Node(OADD, Node(OMUL, ?x, ?y), ?z))


or even


        if(n ~ OADD(OMUL(?x, ?y), ?z))


to match (and bind subexpressions in) an expression of
the form x*y+z. But maybe sometimes I care about the types:


        if(n ~ OADD(OMUL(?x, ?y, .type ~ TLONG), ?z))


where the the multiplication has result type "long". And so on.


How do ML programmers who write real compilers deal
with this issue (auxiliary data that never or rarely is
interesting in the match)?


Have there been pattern-matching syntaxes in other languages
that deal with this problem?


Any and all pointers appreciated.


Thanks.
Russ Cox


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.