|"Near Miss" error handling? firstname.lastname@example.org (2001-03-27)|
|Sv: "Near Miss" error handling? email@example.com (srs srs) (2001-03-31)|
|Re: "Near Miss" error handling? firstname.lastname@example.org (Joachim Durchholz) (2001-04-04)|
|Re: "Near Miss" error handling? email@example.com (Barry Watson) (2001-04-04)|
|Sv: "Near Miss" error handling? firstname.lastname@example.org (srs srs) (2001-04-10)|
|Re: "Near Miss" error handling? email@example.com (2001-04-14)|
|Re: Sv: "Near Miss" error handling? firstname.lastname@example.org (Ben Pfaff) (2001-04-15)|
|From:||"srs srs" <email@example.com>|
|Date:||31 Mar 2001 02:38:19 -0500|
|Posted-Date:||31 Mar 2001 02:38:19 EST|
You could probably use the Soundex algorithm or a variant. I've seen a few,
but the one Knuth uses is as follows (see The Art of Computer Programming,
vol. 3, p. 394):
1. Retain the first letter of the name, and drop all occurrences of
a,e,h,i,o,u,w,y in other positions.
2. Assign the following numbers to the remaining letters after the first:
b,f,p,v -> 1
c,g,j,k,q,s,x,z -> 2
d,t -> 3
l -> 4
m,n -> 5
r -> 6
3. If two or more letters with the same code were adjacent in the original
name (before step 1), or adjacent except for intervening h's and w's, omit
all but the first.
4. Convert to the form "letter,digit,digit,digit" by adding trailing zeros
(if there are fewer than three digits), or by dropping rightmost digits (if
there are more than three).
For example, the names Euler, Gauss, Hilbert, Knuth, Lloyd, Lukasiewicz and
Wachs have the respective codes E460, G200, H416, K530, L300, L222, W200.
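
The four steps above can be sketched in Python roughly as follows. This is
only a minimal sketch of the variant described here, not a reference
implementation; the names CODES and soundex are my own, and it assumes plain
ASCII letters (anything else in the input is ignored).

```python
# Step 2 table: letter -> digit. Vowels, h, w and y have no entry.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(name):
    """Return the letter-plus-three-digits code for name ("" if no letters)."""
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    # Step 3 looks at the original name, so the first letter's own code
    # counts when deciding whether the next letter is a repeat.
    prev = CODES.get(first)
    for ch in letters[1:]:
        if ch in "hw":
            continue            # h and w do not separate equal codes
        code = CODES.get(ch)
        if code is None:        # a, e, i, o, u, y: dropped, but they
            prev = None         # do separate equal codes
            continue
        if code != prev:
            digits.append(code)
        prev = code
    # Step 4: pad with zeros or cut down to exactly three digits.
    return first.upper() + "".join(digits)[:3].ljust(3, "0")


for n in ("Euler", "Gauss", "Hilbert", "Knuth", "Lloyd", "Lukasiewicz", "Wachs"):
    print(n, soundex(n))
```

The loop at the end prints the codes for the example names listed above.
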
This algorithm (which was patented, by the way, in 1922!) could be adjusted
to make it work with identifiers (which typically contain letters, digits
and underscores). You could, as a start, simply remove digits and
underscores before applying Soundex.
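
As a rough illustration of that starting point, the sketch below strips
everything but letters from an identifier and groups the known identifiers by
their code, so that an unknown identifier can be looked up by its own code.
The names letters_only, build_index and suggest are made up for this example,
and the coding function is passed in as the parameter code on the assumption
that something implementing the four steps above is available.

```python
from collections import defaultdict


def letters_only(identifier):
    """Drop digits, underscores and anything else that is not a letter."""
    return "".join(ch for ch in identifier if ch.isalpha())


def build_index(known, code):
    """Group known identifiers by the code of their letters-only form."""
    index = defaultdict(list)
    for ident in known:
        index[code(letters_only(ident))].append(ident)
    return index


def suggest(unknown, index, code):
    """Return the known identifiers sharing a code with unknown (maybe none)."""
    return index.get(code(letters_only(unknown)), [])
```

With a Soundex function at hand, the compiler would call build_index once per
scope and then suggest(name, index, soundex) whenever it meets an undeclared
name. Identifiers that differ only in digits or underscores always land in the
same group, which may or may not be what you want.
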
- Stein Roger
"Gwyn Judd" <firstname.lastname@example.org> wrote in a message
> I'm writing (modifying actually) a compiler for my final undergraduate
> project and I've come across a feature I've never seen in a production
> compiler. basically when the compiler comes across an identifier it
> hasn't seen before, it will go through the list of known identifiers
> and try to determine which is the closest so it can then make a
> hopefully helpful suggestion on how to correct the error.