Introduction
Spelling and grammar can be a headache, especially for a bilingual person such as me. I am quite good at spelling, but not so good at grammar. Today, you will learn how to create your own spelling checker in Visual Basic.
Our Project
The project is nothing spectacular. There are simply two buttons on the form, as shown in Figure 1:
Figure 1: Our Design
Add a new class and name it ‘Spell’. Import the following namespaces at the top of the class file:
Imports System.IO
Imports System.Text.RegularExpressions
The System.IO namespace provides the types for reading and writing files. You will read a file containing the dictionary words a bit later in this example. The System.Text.RegularExpressions namespace provides the Regex class, which is used here to recognize words.
Add the following fields to the class:
Private strLetters As String = "abcdefghijklmnopqrstuvwxyz"
Private regWord As New Regex("[a-z]+", RegexOptions.Compiled)
Private dicDictionary As New Dictionary(Of [String], Integer)()
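Note that the regWord pattern is not anchored, so IsMatch returns True for any string that contains at least one lowercase letter. The following standalone sketch illustrates this. The module name, the helper functions, and the stricter regWholeWord pattern are hypothetical additions for comparison and are not part of the Spell class.

```vb
Imports System
Imports System.Text.RegularExpressions

Module RegexDemo
    ' Same pattern as regWord: one or more lowercase letters, unanchored.
    Private ReadOnly regWord As New Regex("[a-z]+", RegexOptions.Compiled)

    ' Hypothetical stricter variant: the whole string must consist of letters.
    Private ReadOnly regWholeWord As New Regex("^[a-z]+$", RegexOptions.Compiled)

    Function ContainsLetters(text As String) As Boolean
        Return regWord.IsMatch(text)
    End Function

    Function IsOnlyLetters(text As String) As Boolean
        Return regWholeWord.IsMatch(text)
    End Function

    Sub Main()
        Console.WriteLine(ContainsLetters("hello"))   ' True
        Console.WriteLine(ContainsLetters("abc123"))  ' True: it contains letters
        Console.WriteLine(ContainsLetters("123"))     ' False
        Console.WriteLine(IsOnlyLetters("abc123"))    ' False
    End Sub
End Module
```

If the dictionary file might contain entries with digits or punctuation, an anchored pattern along these lines is one way to keep them out.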
strLetters contains the lowercase letters of the alphabet, which are used to generate candidate words. regWord is a regular expression that matches runs of lowercase letters; it is used to check that a dictionary entry contains letters before it is stored. dicDictionary maps each known correct word to the number of times it occurs in the dictionary file. Add the constructor:
Public Sub New()
    ' Each line of Dictionary.txt is treated as one word.
    Dim strFile As String = File.ReadAllText("Dictionary.txt")
    Dim lstWords As List(Of String) = _
        strFile.Split(New String() {vbLf}, _
        StringSplitOptions.RemoveEmptyEntries).ToList()

    For Each strWord As String In lstWords
        ' Trim also removes the carriage return left by Windows line endings.
        Dim strTrim As String = strWord.Trim().ToLower()
        If regWord.IsMatch(strTrim) Then
            ' Count repeated words so that frequent words rank higher.
            If dicDictionary.ContainsKey(strTrim) Then
                dicDictionary(strTrim) += 1
            Else
                dicDictionary.Add(strTrim, 1)
            End If
        End If
    Next
End Sub
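The counting step can be tried in isolation without a file. The sketch below is a minimal illustration that assumes the dictionary holds one word per line. The module name, the CountWords function, and the sample text are hypothetical stand-ins for the constructor and for the contents of Dictionary.txt.

```vb
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module FrequencyDemo
    ' Counts how often each word occurs, one word per line.
    Function CountWords(text As String) As Dictionary(Of String, Integer)
        Dim counts As New Dictionary(Of String, Integer)()
        Dim lines As String() = text.Split(New String() {vbLf},
                                           StringSplitOptions.RemoveEmptyEntries)
        For Each line As String In lines
            ' Trim removes the carriage return left by Windows line endings.
            Dim word As String = line.Trim().ToLower()
            If word.Length = 0 Then Continue For
            If counts.ContainsKey(word) Then
                counts(word) += 1
            Else
                counts.Add(word, 1)
            End If
        Next
        Return counts
    End Function

    Sub Main()
        ' Hypothetical sample standing in for the contents of Dictionary.txt.
        Dim sample As String = "spell" & vbCrLf & "Spell" & vbCrLf & "nice" & vbCrLf
        Dim counts = CountWords(sample)
        Console.WriteLine(counts("spell")) ' 2
        Console.WriteLine(counts("nice"))  ' 1
    End Sub
End Module
```

Because the words are lowercased before counting, "spell" and "Spell" end up under the same key.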
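When the Spell class is instantiated, the constructor reads Dictionary.txt, splits it into lines, trims and lowercases each line, and stores every word in dicDictionary together with a count of how often it occurs. Because the file name is a relative path, the file is looked up in the application's working directory. Add the Prepare function: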
Private Function Prepare(strWord As String) As List(Of String)
    Dim lstParts = New List(Of Tuple(Of String, String))()
    Dim lstInverts = New List(Of String)()
    Dim lstDeletions = New List(Of String)()
    Dim lstReplaces = New List(Of String)()
    Dim lstInsertions = New List(Of String)()

    ' Splits: every (head, tail) pair, including the split after the
    ' last letter so that a letter can also be inserted at the end.
    For p As Integer = 0 To strWord.Length
        Dim tParts = New Tuple(Of String, String)( _
            strWord.Substring(0, p), strWord.Substring(p))
        lstParts.Add(tParts)
    Next

    ' Transposes: swap the first two letters of the tail.
    For i As Integer = 0 To lstParts.Count - 1
        Dim strOne As String = lstParts(i).Item1
        Dim strTwo As String = lstParts(i).Item2
        If strTwo.Length > 1 Then
            lstInverts.Add(strOne & strTwo(1) & strTwo(0) & _
                strTwo.Substring(2))
        End If
    Next

    ' Deletes: drop the first letter of the tail.
    For d As Integer = 0 To lstParts.Count - 1
        Dim strOne As String = lstParts(d).Item1
        Dim strTwo As String = lstParts(d).Item2
        If Not String.IsNullOrEmpty(strTwo) Then
            lstDeletions.Add(strOne & strTwo.Substring(1))
        End If
    Next

    ' Replaces: substitute the first letter of the tail with each letter.
    For r As Integer = 0 To lstParts.Count - 1
        Dim strOne As String = lstParts(r).Item1
        Dim strTwo As String = lstParts(r).Item2
        If Not String.IsNullOrEmpty(strTwo) Then
            For Each letter As Char In strLetters
                lstReplaces.Add(strOne & letter & strTwo.Substring(1))
            Next
        End If
    Next

    ' Inserts: add each letter between the head and the tail.
    For i As Integer = 0 To lstParts.Count - 1
        Dim strOne As String = lstParts(i).Item1
        Dim strTwo As String = lstParts(i).Item2
        For Each letter As Char In strLetters
            lstInsertions.Add(strOne & letter & strTwo)
        Next
    Next

    ' Union removes duplicate candidates.
    Return lstDeletions.Union(lstInverts) _
        .Union(lstReplaces).Union(lstInsertions).ToList()
End Function
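To see what kind of candidates a routine like Prepare produces, here is a compact standalone sketch of the same idea: generating every string that is one edit away from a word. It is an illustration under assumptions rather than the class's own code. The module and function names are hypothetical, a HashSet is used instead of four lists, and the loop also splits after the last letter so that a letter can be appended.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module EditsDemo
    Private Const Letters As String = "abcdefghijklmnopqrstuvwxyz"

    ' Returns every string that is one edit away from the given word.
    Function Edits(word As String) As List(Of String)
        Dim results As New HashSet(Of String)()
        For i As Integer = 0 To word.Length
            Dim head As String = word.Substring(0, i)
            Dim tail As String = word.Substring(i)
            If tail.Length > 1 Then
                ' Transpose: swap the first two letters of the tail.
                results.Add(head & tail(1) & tail(0) & tail.Substring(2))
            End If
            If tail.Length > 0 Then
                ' Delete: drop the first letter of the tail.
                results.Add(head & tail.Substring(1))
                ' Replace: substitute the first letter of the tail.
                For Each c As Char In Letters
                    results.Add(head & c & tail.Substring(1))
                Next
            End If
            ' Insert: add a letter between head and tail.
            For Each c As Char In Letters
                results.Add(head & c & tail)
            Next
        Next
        Return results.ToList()
    End Function

    Sub Main()
        Dim candidates = Edits("teh")
        Console.WriteLine(candidates.Contains("the"))  ' True (transpose)
        Console.WriteLine(candidates.Contains("te"))   ' True (delete)
        Console.WriteLine(candidates.Contains("tea"))  ' True (replace)
        Console.WriteLine(candidates.Contains("tech")) ' True (insert)
    End Sub
End Module
```

Most of the generated strings are not real words. They only become suggestions once they are looked up in the dictionary.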
The Prepare function splits the given word at every position and uses those pieces to build every variation that is one edit away from it: two adjacent letters swapped, a letter deleted, a letter replaced, or a letter inserted. It returns a list of these candidate strings. Most of them are not real words; the next function filters them against the dictionary. Add the final function:
Public Function Fix(strWord As String) As String
    If String.IsNullOrEmpty(strWord) Then
        Return strWord
    End If

    strWord = strWord.ToLower()

    ' A word that is already in the dictionary needs no correction.
    If dicDictionary.ContainsKey(strWord) Then
        Return strWord
    End If

    ' First pass: known words that are one edit away.
    Dim lstWords As List(Of [String]) = Prepare(strWord)
    Dim dicPotential As New Dictionary(Of String, Integer)()

    For Each strVariation As String In lstWords
        If dicDictionary.ContainsKey(strVariation) AndAlso _
            Not dicPotential.ContainsKey(strVariation) Then
            dicPotential.Add(strVariation, dicDictionary(strVariation))
        End If
    Next

    ' Prefer the candidate with the highest count.
    If dicPotential.Count > 0 Then
        Return dicPotential.OrderByDescending( _
            Function(x) x.Value).First().Key
    End If

    ' Second pass: known words that are two edits away.
    For Each sWord As String In lstWords
        For Each strVariation As String In Prepare(sWord)
            If dicDictionary.ContainsKey(strVariation) AndAlso _
                Not dicPotential.ContainsKey(strVariation) Then
                dicPotential.Add(strVariation, dicDictionary(strVariation))
            End If
        Next
    Next

    ' Fall back to the (lowercased) input if nothing was found.
    Return If((dicPotential.Count > 0), _
        dicPotential.OrderByDescending(Function(x) x.Value) _
        .First().Key, strWord)
End Function
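The Fix function lowercases the received word and returns it unchanged if it is already in the dictionary. Otherwise, it looks for dictionary words among the variations produced by Prepare and returns the one with the highest count. If none is found, it repeats the search on the variations of those variations (words that are two edits away) and falls back to the lowercased input if that also fails.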
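The ranking step on its own can be sketched as follows. This is a simplified illustration: the module name, the BestCandidate function, and the sample counts are all hypothetical, and the candidate list is written out by hand instead of being generated.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module RankDemo
    ' Returns the known candidate with the highest count, or the fallback
    ' when none of the candidates is in the dictionary.
    Function BestCandidate(candidates As IEnumerable(Of String),
                           counts As Dictionary(Of String, Integer),
                           fallback As String) As String
        Dim known = candidates.Where(Function(c) counts.ContainsKey(c)) _
            .Distinct().ToList()
        If known.Count = 0 Then
            Return fallback
        End If
        Return known.OrderByDescending(Function(c) counts(c)).First()
    End Function

    Sub Main()
        ' Hypothetical word counts standing in for dicDictionary.
        Dim counts As New Dictionary(Of String, Integer) From {
            {"the", 50}, {"ten", 5}, {"tea", 3}}
        Dim candidates As String() = {"the", "tee", "ten", "tea", "teg"}

        Console.WriteLine(BestCandidate(candidates, counts, "teh"))           ' the
        Console.WriteLine(BestCandidate(New String() {"zzz"}, counts, "zzq")) ' zzq
    End Sub
End Module
```

Ranking by count means that, among equally close candidates, the word that appears most often in the dictionary file wins.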
Add the following code behind the button on Form1 labeled ‘Words’:
Private Sub Button1_Click(sender As Object, e As EventArgs) _
    Handles Button1.Click
    Dim sEngine As New Spell()
    Dim strWord As String = ""

    strWord = "spellling"
    Console.WriteLine("{0} => {1}", strWord, _
        sEngine.Fix(strWord))

    strWord = "Hannes"
    Console.WriteLine("{0} => {1}", strWord, _
        sEngine.Fix(strWord))

    strWord = "supposidly"
    Console.WriteLine("{0} => {1}", strWord, _
        sEngine.Fix(strWord))

    Console.Read()
End Sub
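The preceding code creates a Spell object and prints each test word next to the correction that Fix suggests for it. Because this is a Windows Forms project, the Console.WriteLine output typically appears in Visual Studio's Output window while debugging.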
Add the following code behind the remaining button, labeled ‘Sentences’:
Private Sub Button2_Click(sender As Object, e As EventArgs) _
    Handles Button2.Click
    Dim sEngine As New Spell()
    Dim strSentence As String = "I cann niot spel niceely"
    Dim strCorrection As String = ""

    ' Correct the sentence one word at a time.
    For Each wWord As String In strSentence.Split(" "c)
        strCorrection += " " + sEngine.Fix(wWord)
    Next

    Console.WriteLine("Did you mean:" & strCorrection)
    Console.Read()
End Sub
This code works like the previous handler, except that a For Each loop splits the sentence on spaces, corrects each word in turn, and appends the result to the suggested sentence.
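The same loop can also be written with Split, Select, and String.Join, which avoids building the string by hand and skips repeated spaces. The sketch below is one possible variation, not the article's handler. FixWord is a hypothetical stand-in for the Fix function with two hard-coded corrections, so that the example runs without Dictionary.txt.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SentenceDemo
    ' Hypothetical stand-in for Spell.Fix with two hard-coded corrections.
    Function FixWord(word As String) As String
        Dim fixes As New Dictionary(Of String, String) From {
            {"cann", "can"}, {"spel", "spell"}}
        Dim key As String = word.ToLower()
        Return If(fixes.ContainsKey(key), fixes(key), key)
    End Function

    ' Corrects a sentence word by word and joins the results with spaces.
    Function FixSentence(sentence As String) As String
        Dim words As String() = sentence.Split(New Char() {" "c},
                                               StringSplitOptions.RemoveEmptyEntries)
        Return String.Join(" ", words.Select(Function(w) FixWord(w)).ToArray())
    End Function

    Sub Main()
        Console.WriteLine("Did you mean: " & FixSentence("I cann  spel"))
        ' Prints: Did you mean: i can spell
    End Sub
End Module
```

The "i" comes out in lowercase because the stand-in, like Fix, lowercases every word before looking it up.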
Conclusion
As you can see, creating a spell checker is quite simple. All you need is a bit of logic.