Click to See Complete Forum and Search --> : Spelling correction algorithm


BarefootGuy
April 17th, 2009, 12:41 PM
I need an algorithm to provide spelling suggestions for a mistyped word by finding the closest matched words from the dictionary. It seems that there is a widely usd algorithm called "agrep" for approximate string matching

http://www.cs.sunysb.edu/~algorith/implement/agrep/implement.shtml

but I'm not able to get the source code of it. Can someone please help?

BarefootGuy
April 17th, 2009, 12:49 PM
And the algorithm should not rely on sounds/phonetic.

BarefootGuy
April 17th, 2009, 02:22 PM
Can anyone say how the Lucene n-grams algorithm works? All I could find is that it creates n-grams for each dictionary words and indexes them, but no information could be found on how the search is actually performed.

tridentblack
April 24th, 2009, 09:50 PM
Its kind of cryptic, isn't it? I'm reading this stuff and not finding easy explanations either.

But it seems like you could do something. Like what if built a Markov model for letters of words in your dictionary, And used it to search for close words based on the primitive operators for calculating Levenshtein distance?
http://en.wikipedia.org/wiki/Fuzzy_string_searching
So your algorithm would first look at the most likely

insertion (to result in most probable string)
deletion (to result in most probable string)
substitution (to result in most probable string)

And see if these are in the dictionary, and when it finds 10 or so, stops searching. The idea of A* (a-star) search pops to mind with this too.

Some thoughts anyway.... (shrugs)