This article was contributed by Chad
Loder.
These macros offer several improvements over the bubblesort
line sorting macro submitted by Harald Mueller.
The improvements include:
- The sorting is done off-screen in an array, which boosts
the speed tremendously. Mueller’s sorting implementation
"bubbles" the lines directly in the file, so
the bottleneck of his macro is in screen redrawing, not
in his choice of algorithm. - The algorithm used is QuickSort, which has O(n log n)
complexity. Mueller uses BubbleSort, which has O(n^2)
complexity. - This implementation allows you to ignore case when
sorting. - This implementation allows you to ignore leading and
trailing whitespace when sorting.
Mueller’s implementation also contains a bug that occurs when
you have chosen to disable editing of read-only files and you try
to sort a read-only file. Since he does not build an internal
representation of the lines (he sorts right in the document), the
lines in the read-only file are never put in the correct order,
and the sorting loop never exits.
That having been said, Mueller’s implementation has some
advantages over mine:
- His implementation should handle arbitrarily large
selections because he does not store all the lines in
memory. My implementation will fail for large selections,
because it tries to store the entire selection in memory.
I have verified that my implementation seems to fail with
selections of around 500 lines. I don’t know whether this
is because of an array size limitation or whether it is
dependent on the length (in bytes) of the selection. - With Mueller’s approach, one can see the bubble sorting
happening on the screen, which is rather entertaining
and, for intro-Comp. Sci majors, edifying.
Without further ado, here is the code. There are three
routines:
- CollectLines, which takes a selection and returns a
VBScript array containing all the lines of the selection. - QuickSort, which takes an array and some bounds and sorts
the array recursively. - SortLines, which is the framework that does some
verification on the document and then calls CollectLines
and QuickSort, and then rewrites the sorted output.
Function CollectLines(Selection) '-- make sure the top of the selection is really the top StartLine = Selection.TopLine EndLine = Selection.BottomLine If EndLine < StartLine Then Temp="StartLine" StartLine="EndLine" EndLine="Temp" End If Dim lines() '-- don't try to collect an empty selection If StartLine > EndLine Then Redim lines(0) CollectLines = lines Exit Function End If '-- collect all the lines of the selection into an array '-- this could be prohibitive on large selections ( > 2M ? ) Redim lines(EndLine - StartLine) For i = StartLine To EndLine > Selection.GoToLine i Selection.SelectLine lines(endLine - i) = Selection.Text Next CollectLines = lines End Function '-- An internal routine To sort an array '-- Specify ignoreWhiteSpace = True To ignore leading and trailing whitespace '-- Specify ignoreCase = True To compare strings ignoring case Sub QuickSort(vec,loBound,hiBound,ignoreWhiteSpace,ignoreCase) Dim pivot,loSwap,hiSwap,temp '-- This procedure is adapted from the algorithm given in: '-- Data Abstractions & Structures using C++ by '-- Mark Headington and David Riley, pg. 586 '-- two items To sort If hiBound - loBound = 1 Then If vec(loBound) > vec(hiBound) Then temp=vec(loBound) vec(loBound) = vec(hiBound) vec(hiBound) = temp End If End If '-- three or more items To sort pivot = vec(int((loBound + hiBound) / 2)) vec(int((loBound + hiBound) / 2)) = vec(loBound) vec(loBound) = pivot loSwap = loBound + 1 hiSwap = hiBound do '-- find the correct loSwap vecLoSwap = vec(loSwap) If (ignoreCase = 1 And ignoreWhitespace = 1) Then While loSwap < hiSwap and ucase(trim(vec(loSwap))) <= ucase(trim(pivot)) loSwap = loSwap + 1 wend Elseif (ignoreCase = 1) Then While loSwap < hiSwap and ucase(vec(loSwap)) <= ucase(pivot) loSwap = loSwap + 1 wend Elseif (ignoreWhiteSpace = 1) Then While loSwap < hiSwap and trim(vec(loSwap)) <= trim(pivot) loSwap = loSwap + 1 wend Else While loSwap < hiSwap and vec(loSwap) <= pivot loSwap = loSwap + 1 wend End If '-- find the correct hiSwap If (ignoreCase = 1 And ignoreWhitespace = 1) Then While ucase(trim(vec(hiSwap))) > ucase(trim(pivot)) hiSwap = hiSwap - 1 wend Elseif (ignoreCase = 1) Then While ucase(vec(hiSwap)) > ucase(pivot) hiSwap = hiSwap - 1 wend Elseif (ignoreWhiteSpace = 1) Then While trim(vec(hiSwap)) > trim(pivot) hiSwap = hiSwap - 1 wend Else While vec(hiSwap) > pivot hiSwap = hiSwap - 1 wend End If '-- swap values if out of order If loSwap < hiSwap Then temp = vec(loSwap) vec(loSwap) = vec(hiSwap) vec(hiSwap) = temp End If loop While loSwap < hiSwap vec(loBound) = vec(hiSwap) vec(hiSwap) = pivot '-- Recursively sort the partitions '-- if there are 2 or more items in first partitions If loBound < (hiSwap - 1) Then Call QuickSort(vec,loBound,hiSwap-1,ignoreWhiteSpace, ignoreCase) End If '-- 2 or more items in second section If hiSwap + 1 < hibound Then Call QuickSort(vec,hiSwap+1,hiBound, ignoreWhiteSpace, ignoreCase) End If End Sub Sub SortLines 'DESCRIPTION: Sorts the selected lines If ActiveDocument.Type <> "Text" Then MsgBox "This macro can only be run when a text editor window is active." Exit Sub End If '-- make sure the top of the selection is really the top StartLine = ActiveDocument.Selection.TopLine EndLine = ActiveDocument.Selection.BottomLine If EndLine < StartLine Then Temp = StartLine StartLine = EndLine EndLine = Temp End If '-- collect the lines of the selection into an array lines = CollectLines(ActiveDocument.Selection) If isnull(lines) Then Exit Sub End If If (ubound(lines) <= 0) Then '-- don't try to sort an empty selection Exit Sub End If '-- sort the array Call QuickSort (lines, lbound(lines), ubound(lines), 0, 0) '-- select the entire original selection, then delete it ActiveDocument.Selection.GoToLine StartLine ActiveDocument.Selection.LineDown dsExtend, (EndLine - StartLine) + 1 ActiveDocument.Selection.Delete '-- write the sorted lines out to the file For i = 0 To EndLine - StartLine ActiveDocument.Selection = lines(i) Next End Sub