Similar Text Search


mkadayifci
January 5th, 2007, 03:22 AM
I am developing a geocoding application using an Oracle database and C# 2.0.
But I have a big problem: I can't use the "like" operator for database searches, because the address strings are not always regular. I know about the soundex function in Oracle and SQL Server, but it is designed for English, and my project will work only with Turkish.
Is there an algorithm that finds similar words, e.g. "Istanbul" and "Istanbol"? Perhaps one based only on character order and length.
I ask because the Turkish character set has special characters such as Unicode 231 (ç) and Unicode 287 (ğ), more than 10 in all.

Thanks
Mehmet KADAYIFÇI

Rigsby
March 5th, 2008, 10:42 AM
I don't have anything to hand for the Turkish character set! I'm working on a German system, so I replace the German letters ö, ä, ü, ß with their international forms oe, ae, ue, ss before passing the text to my function.
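
The same pre-processing idea could be tried for Turkish. The following is only a rough Python sketch, not code from the German system described above: the mapping table and the name fold_turkish are assumptions made for illustration, and folding each Turkish letter to its closest plain Latin letter is just one possible convention.

```python
# Hypothetical folding table (an assumption, not an official standard):
# each Turkish-specific letter is mapped to its closest plain Latin letter.
TURKISH_FOLD = str.maketrans({
    "ç": "c", "Ç": "C",
    "ğ": "g", "Ğ": "G",
    "ı": "i", "İ": "I",
    "ö": "o", "Ö": "O",
    "ş": "s", "Ş": "S",
    "ü": "u", "Ü": "U",
})


def fold_turkish(text):
    """Fold Turkish letters to plain Latin letters, then lower-case.

    The table is applied before lower() so that the dotted capital İ
    is replaced first and never goes through the default lower-casing.
    """
    return text.translate(TURKISH_FOLD).lower()


print(fold_turkish("İstanbul"))
print(fold_turkish("Kadıköy"))
```

Folding both the stored addresses and the search text in the same way means spelling variants that differ only in these letters compare as equal.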

Also, take a look at the "Daitch-Mokotoff" soundex system:

www.jewishgen.org/InfoFiles/soundex.html#DM
www.avotaynu.com/soundex.htm

It seems better at dealing with non-English words.
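
For the "character order and length" idea in the original question, an edit distance might also be worth a try. To be clear, this is not Daitch-Mokotoff or any kind of soundex; it is a plain Levenshtein distance, sketched here in Python as an illustration, and the function name edit_distance is made up for the example. It counts the single-character insertions, deletions and substitutions needed to turn one string into the other, and it works on any Unicode characters, Turkish ones included.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b.

    Uses the classic dynamic-programming scheme, keeping only the
    previous row of the table in memory.
    """
    # Distance from the empty prefix of a to every prefix of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Distance from a[:i] to the empty prefix of b is i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("Istanbul", "Istanbol"))  # 1
```

A small distance relative to the word length suggests a likely match. What threshold to accept is a judgment call that would need tuning against real address data.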