|Grammar for roman numerals email@example.com (2007-03-27)|
|Re: Grammar for roman numerals firstname.lastname@example.org (Martin Ward) (2007-03-29)|
|Re: Grammar for roman numerals email@example.com (Ivan Boldyrev) (2007-03-29)|
|Re: Grammar for roman numerals firstname.lastname@example.org (Dmitry A. Kazakov) (2007-03-30)|
|Re: Grammar for roman numerals email@example.com (Martin Ward) (2007-03-30)|
|Re: Grammar for roman numerals firstname.lastname@example.org (Dmitry A. Kazakov) (2007-04-01)|
|Re: Grammar for roman numerals DrDiettrich1@aol.com (Hans-Peter Diettrich) (2007-04-01)|
|Re: Grammar for roman numerals email@example.com (whiskey) (2007-04-06)|
From: Martin Ward <firstname.lastname@example.org>
Date: 29 Mar 2007 00:59:10 -0400
Posted-Date: 29 Mar 2007 00:59:10 EDT
On Tuesday 27 Mar 2007 14:27, email@example.com wrote:
> Here is my grammar (I allow an arbitrary number of Ms)
> numeral -> thousands
> thousands -> thous_part hundreds | thous_part | hundreds
> thous_part -> thous_part M | M
> hundreds -> hun_part tens | hun_part | tens
> hun_part -> hun_rep | CD | D | D hun_rep | CM
> hun_rep -> C | CC | CCC
> tens -> tens_part ones | tens_part | ones
> tens_part -> tens_rep | XL | L | L tens_rep | XC
> tens_rep -> X | XX | XXX
> ones -> ones_rep | IV | V | V ones_rep | IX
> ones_rep -> I | II | III
This doesn't accept IIII for 4 (as found on many clocks with Roman
numeral faces, for example), nor does it accept the "shorthand"
forms: IC for 99, IIC for 98, MVM for 1995 and so on.
The rule is that any smaller number placed before a larger
number is subtracted from the larger number.
I know of no examples where the "smaller number" consists of
anything other than a single numeral or a doubled numeral
(II, XX or CC). However, constructions such as IIIII for "five",
IIX for "eight" or VV for "ten" have been discovered in manuscripts.
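The subtraction rule above can be sketched in Python. This is only one
possible reading of the rule, not a definitive implementation: the name
lenient_value is hypothetical, and the sketch assumes that a whole run of
identical symbols (such as the II in IIC) is subtracted when the symbol
that follows the run is larger.

```python
# Values of the seven standard Roman numeral symbols.
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def lenient_value(numeral: str) -> int:
    """Evaluate a Roman numeral permissively (hypothetical helper).

    Assumption: the text is split into runs of identical symbols, and a
    run is subtracted when the symbol after it is larger, else added.
    """
    if not numeral:
        raise ValueError("empty numeral")
    # Collect [symbol value, run length] pairs, e.g. "IIC" -> [[1, 2], [100, 1]].
    runs = []
    for ch in numeral.upper():
        if ch not in VALUES:
            raise ValueError(f"not a Roman numeral symbol: {ch!r}")
        v = VALUES[ch]
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    total = 0
    for i, (v, n) in enumerate(runs):
        # A run counts negatively only if a larger symbol follows it directly.
        next_is_larger = i + 1 < len(runs) and runs[i + 1][0] > v
        total += -v * n if next_is_larger else v * n
    return total
```

Under these assumptions the forms mentioned above all evaluate as described:
IIII gives 4, IC gives 99, IIC gives 98, MVM gives 1995 and IIX gives 8.
Comparing each symbol only with its immediate neighbour would not be enough,
since the first I of IIX is followed by another I rather than by X.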
A bar placed over a number multiplies it by one thousand,
and a double bar multiplies it by one million.
This could be implemented in your system by using parentheses
to denote the bar: thus (I) would represent 1,000.
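A rough Python sketch of the parenthesis notation follows. The function
names plain_value and barred_value are hypothetical, and two things are
assumed rather than taken from any definition: nested parentheses such as
((I)) stand for the double bar, and a barred group is simply added to
whatever surrounds it.

```python
from itertools import groupby

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def plain_value(text: str) -> int:
    """Value of a bar-free numeral (hypothetical helper).

    Assumption: a run of identical symbols is subtracted when the symbol
    after the run is larger, and added otherwise.
    """
    runs = [(VALUES[sym], len(list(grp))) for sym, grp in groupby(text)]
    total = 0
    for i, (v, n) in enumerate(runs):
        subtract = i + 1 < len(runs) and runs[i + 1][0] > v
        total += -v * n if subtract else v * n
    return total


def barred_value(text: str) -> int:
    """Evaluate a numeral in which (...) stands for a bar (times 1,000)."""
    total = 0
    pos = 0
    while pos < len(text):
        if text[pos] == "(":
            # Scan forward to the parenthesis that closes this group.
            depth = 0
            for end in range(pos, len(text)):
                if text[end] == "(":
                    depth += 1
                elif text[end] == ")":
                    depth -= 1
                    if depth == 0:
                        break
            else:
                raise ValueError("unbalanced parentheses")
            # Recursing handles nesting: ((I)) is 1000 * 1000 * 1.
            total += 1000 * barred_value(text[pos + 1:end])
            pos = end + 1
        else:
            # Take the plain stretch up to the next parenthesis.
            end = pos
            while end < len(text) and text[end] not in "()":
                end += 1
            if end == pos:
                raise ValueError("unexpected ')'")
            total += plain_value(text[pos:end])
            pos = end
    return total
```

With this sketch barred_value("(I)") is 1000, barred_value("((I))") is
1000000 and barred_value("(IV)CC") is 4200. A symbol outside the seven
standard ones raises KeyError here; a real parser would want a better
diagnostic.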
(In the Middle Ages, 500, usually D, was sometimes written as
I followed by an apostrophus, resembling a backwards C, while 1,000
was written as CI followed by an apostrophus.)
The more general question raised by this discussion (and more relevant
to comp.compilers) is how "forgiving" a parser should be when the
language being parsed has no formal definition, or when there are
several conflicting formal definitions.
Do you accept anything that can possibly be interpreted,
or do you place "arbitrary" restrictions in order to simplify
the grammar, at the expense of rejecting existing files?
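To make the trade-off concrete, here is a small Python sketch of the
restrictive end of the spectrum. The regular expression is intended to
accept the same strings as the quoted grammar (any number of Ms, then
standard hundreds, tens and ones), but that equivalence is an assumption
of this sketch; the names STRICT and accepts_strict are hypothetical.

```python
import re

# Thousands, hundreds, tens and ones in the standard subtractive notation.
STRICT = re.compile(
    r"M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)


def accepts_strict(text: str) -> bool:
    """True if text is a non-empty numeral in the restricted notation."""
    # Every part of the pattern may be empty, so reject "" explicitly.
    return bool(text) and STRICT.fullmatch(text) is not None


# Forms mentioned in this thread that a strict recogniser turns away.
for form in ["MCMXCIX", "IIII", "IC", "IIC", "MVM", "IIX", "VV"]:
    print(form, accepts_strict(form))
```

Of the forms in the loop, only MCMXCIX is accepted. The grammar is simple
and unambiguous, and the price is that the clock-face IIII and the
shorthand forms are all rejected.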
firstname.lastname@example.org http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4
G.K.Chesterton web site: http://www.cse.dmu.ac.uk/~mward/gkc/