Re: Independent Study - Assistance?

Chris F Clark <>
Mon, 16 Feb 2009 14:10:45 -0500

          From comp.compilers

Related articles
Independent Study - Assistance? (Alexander Morou) (2009-02-15)
Re: Independent Study - Assistance? (Chris F Clark) (2009-02-16)
Re: Independent Study - Assistance? (Alexander Morou) (2009-02-21)
Re: Independent Study - Assistance? (Chris F Clark) (2009-02-27)
Re: Independent Study - Assistance? (Alexander Morou) (2009-02-28)
| List of all articles for this month |

From: Chris F Clark <>
Newsgroups: comp.compilers
Date: Mon, 16 Feb 2009 14:10:45 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 09-02-081
Keywords: practice, courses
Posted-Date: 17 Feb 2009 15:59:51 EST

Alexander Morou <> writes:

> One is how to automate state machine generation for regular
> expressions, I just haven't been able to get this part down (close,
> but not quite, certainly not the least state version, I forgot to
> consider repeat cycles -_-). I'm thinking a large part is due to my
> educational background (or lack thereof).

You don't necessarily need to get a full education (e.g. BS in CS) to
understand what you need. You can probably fill in what you need with
a few classes, even the ones taught by distance learning. I would
recommend two courses. First, a "discrete math (for CS)" course.
that should teach you the set theory, group theory, probablity theory
that you need for most CS papers. Second, after that you can take an
actual "compiler" course. Given your interest, you might want to find
one that teaches the algorithms used in "compiler compilers" rather
than one which focuses on building a complete toy compiler. You can
acquire both skills, just not in one semester.

Alternately, if you really want to do it on you own. You can pickup
just the textbooks for such courses. I don't know where there is a
list of discrete math textbooks, but there must be one, check some of
the sci.math or comp.theory FAQs. However, I do know that John has
reviews of several conmpiler textbooks in the comp.compilers FAQ. I'm
sure he'll obligingly add a nice pointer right here.

For what you seem to want to know. I would either recommend the Aho,
Hopcroft, and Ullman (Dragon) book or Holub's book, depending on
whether you want a more theoretical or more practical slant. I
supsect the former (and thus the former book) given that you didn't
want to read an ope source version and learn from that. Note, for
most people who want the build a toy compiler book, Appel's (Tiger)
book is better, but it is not as good for your intended use.

In the end, you will find that there are only a few algorithms that
you need to know:

1) How to convert an RE to an NFA. Thompson's method.

2) How to convert an NFA into a DFA. Subset contruction.

3) How to convert a grammar into a recursive descent parser. First
      and Follow Sets.

4) How to convert a grammar into a LALR parser. Dotted items.

There are more algorithms that you can extend that set with. But, if
you know and understand those 4, you have your core (for building a
lexer and/or parser generator).

As to reading papers, some are accessible and some are not. However,
quite often, they will introduce some notation in the early paper, but
only use a small fraction of the notation in the paper. Most of the
rest is simply to put the paper into a sound mathematical framework.
Half the time when reading such papers, I skip section 2 (where the
notation is usually defined) and the bodies of most of the proofs
unless the result is particularly surprising, and still understand the

Most good papers have a main important idea they want to get across
and make that clear. The rest of the paper is just to fill in enough
details so that you see the idea in context. You can't necessarily
expect to get all of the paper on the first read, but over time you
will see how the same jargon, style, and "common knowledge" gets
incorporated into them.

However, there are some authors, like the one who wrote the paper you
cited, who seem to take great joy in making their paper inaccessible
and mathematically dense, and then are careless with their notation as
if it showed their sophistication, making their rigor suspect,
especially when they produce results that appear to contradict
well-known theorems. My suggestion is that if you read a paper and it
seems to be too incomprehensible, pick another paper. Either you will
eventually pick up enough back-ground that you can read the paper you
skipped, or learn what you need from another source.

Hope this helps,

Chris Clark Internet:
Compiler Resources, Inc. or:
23 Bailey Rd Web Site:
Berlin, MA 01503 voice: (508) 435-5016
USA fax: (978) 838-0263 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.