RE: Who first invented dotted-item notation?

Christopher F Clark <christopher.f.clark@compiler-resources.com>
Mon, 23 May 2022 13:23:41 +0300

          From comp.compilers

Related articles
Who first invented dotted-item notation? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-05-22)
Re: Who first invented dotted-item notation? 480-992-1380@kylheku.com (Kaz Kylheku) (2022-05-22)
RE: Who first invented dotted-item notation? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-05-23)
RE: Who first invented dotted-item notation? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-05-24)
| List of all articles for this month |

From: Christopher F Clark <christopher.f.clark@compiler-resources.com>
Newsgroups: comp.compilers
Date: Mon, 23 May 2022 13:23:41 +0300
Organization: Compilers Central
References: 22-05-046 22-05-047
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="22129"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, history, comment
Posted-Date: 23 May 2022 10:39:10 EDT

Let me explain a little further.


Dotted-items are useful in translating (and visualizing) state
machines as translations of regular expressions or grammar rules.


In Glushkov's construction one takes a regular expression /ab*(cb)+a/
and makes copies of it with a dot before each "symbol" (character you
can recognize) and one at the end. Thus, 6 dotted-items:
/.ab*(cb)+a/
/a.b*(cb)+a/
/ab*(.cb)+a/
/ab*(c.b)+a/
/ab*(cb)+.a/
/ab*(cb)+a./


Each state in a Glushkov NFA is a subset of the powerset of these
items: (see below)


state 0:
/.ab*(cb)+a/


state 1:
/a.b*(cb)+a/
/ab*(.cb)+a/


state 2:
/ab*(c.b)+a/


state 3:
/ab*(.cb)+a/
/ab*(cb)+.a/


state 4:
/ab*(cb)+a./


The algorithm explains how to calculate what the states and their
transitions are.


Yielding something like this:


state 0:
/.ab*(cb)+a/ -> state 1


state 1:
/a.b*(cb)+a/ -> state 1
/ab*(.cb)+a/ -> state 2


state 2:
/ab*(c.b)+a/ -> state 3


state 3:
/ab*(.cb)+a/ -> state 2
/ab*(cb)+.a/ -> state 4


state 4:
/ab*(cb)+a./ -> accept


Below: Actually, in a Gluskov NFA the 6 dotted-items might be the
states, and those subsets might be how you change it to a DFA (the
subset construction algorithm). In my mind, those things have gotten
merged, but that might be an error on my part.


Anyway, when building the LR(0) machine, you build similar-dotted
items (and subsets) to define the states of the LR(0) DFA. And, the
Earley construction adds some context information to the doffed-items.


It's how I think about state machines. It's how they are documented
in Yacc++ with the debug option. Yacc (and Bison) does something
similar using _ instead of dot if I recall correctly.


But, someone must have introduced the idea first. My question is who?
  Did Backus use dotted-items? Or did Knuth or Glushkov invent them?
Or maybe Naur. Or someone else, maybe one unfamiliar to me? Maybe it
goes all the way back to Kleene. Is the question lost to time?


That's my question....


Again, thanks,
Chris


--
******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------
[BNF was descriptive, don't think Backus ever did anything with formal parsers.
I looked at Knuth's 1965 LR parsing paper which has a lot of notation but no
dots. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.