Improved line sort
Posted
by Chad Loder
on May 13th, 1999
These macros offer several improvements over the bubblesort line sorting macro submitted by Harald Mueller.
The improvements include:
- The sorting is done off-screen in an array, which boosts the speed tremendously. Mueller's sorting implementation "bubbles" the lines directly in the file, so the bottleneck of his macro is in screen redrawing, not in his choice of algorithm.
- The algorithm used is QuickSort, which has O(n log n) complexity. Mueller uses BubbleSort, which has O(n^2) complexity.
- This implementation allows you to ignore case when sorting.
- This implementation allows you to ignore leading and trailing whitespace when sorting.
Mueller's implementation also contains a bug that occurs when you have chosen to disable editing of read-only files and you try to sort a read-only file. Since he does not build an internal representation of the lines (he sorts right in the document), the lines in the read-only file are never put in the correct order, and the sorting loop never exits.
That having been said, Mueller's implementation has some advantages over mine:
- His implementation should handle arbitrarily large selections because he does not store all the lines in memory. My implementation will fail for large selections, because it tries to store the entire selection in memory. I have verified that my implementation seems to fail with selections of around 500 lines. I don't know whether this is because of an array size limitation or whether it is dependent on the length (in bytes) of the selection.
- With Mueller's approach, one can see the bubble sorting happening on the screen, which is rather entertaining and, for intro-Comp. Sci majors, edifying.
Without further ado, here is the code. There are three routines:
- CollectLines, which takes a selection and returns a VBScript array containing all the lines of the selection.
- QuickSort, which takes an array and some bounds and sorts the array recursively.
- SortLines, which is the framework that does some verification on the document and then calls CollectLines and QuickSort, and then rewrites the sorted output.
Function CollectLines(Selection)
'-- make sure the top of the selection is really the top
StartLine = Selection.TopLine
EndLine = Selection.BottomLine
If EndLine < StartLine Then
Temp="StartLine"
StartLine="EndLine"
EndLine="Temp"
End If
Dim lines() '-- don't try to collect an empty selection
If StartLine > EndLine Then
Redim lines(0)
CollectLines = lines
Exit Function
End If
'-- collect all the lines of the selection into an array
'-- this could be prohibitive on large selections ( > 2M ? )
Redim lines(EndLine - StartLine)
For i = StartLine To EndLine > Selection.GoToLine i
Selection.SelectLine
lines(endLine - i) = Selection.Text
Next
CollectLines = lines
End Function
'-- An internal routine To sort an array
'-- Specify ignoreWhiteSpace = True To ignore leading and trailing whitespace
'-- Specify ignoreCase = True To compare strings ignoring case
Sub QuickSort(vec,loBound,hiBound,ignoreWhiteSpace,ignoreCase)
Dim pivot,loSwap,hiSwap,temp
'-- This procedure is adapted from the algorithm given in:
'-- Data Abstractions & Structures using C++ by
'-- Mark Headington and David Riley, pg. 586
'-- two items To sort
If hiBound - loBound = 1 Then
If vec(loBound) > vec(hiBound) Then
temp=vec(loBound)
vec(loBound) = vec(hiBound)
vec(hiBound) = temp
End If
End If
'-- three or more items To sort
pivot = vec(int((loBound + hiBound) / 2))
vec(int((loBound + hiBound) / 2)) = vec(loBound)
vec(loBound) = pivot
loSwap = loBound + 1
hiSwap = hiBound
do
'-- find the correct loSwap
vecLoSwap = vec(loSwap)
If (ignoreCase = 1 And ignoreWhitespace = 1) Then
While loSwap < hiSwap and ucase(trim(vec(loSwap))) <= ucase(trim(pivot))
loSwap = loSwap + 1
wend
Elseif (ignoreCase = 1) Then
While loSwap < hiSwap and ucase(vec(loSwap)) <= ucase(pivot)
loSwap = loSwap + 1
wend
Elseif (ignoreWhiteSpace = 1) Then
While loSwap < hiSwap and trim(vec(loSwap)) <= trim(pivot)
loSwap = loSwap + 1
wend
Else
While loSwap < hiSwap and vec(loSwap) <= pivot
loSwap = loSwap + 1
wend
End If
'-- find the correct hiSwap
If (ignoreCase = 1 And ignoreWhitespace = 1) Then
While ucase(trim(vec(hiSwap))) > ucase(trim(pivot))
hiSwap = hiSwap - 1
wend
Elseif (ignoreCase = 1) Then
While ucase(vec(hiSwap)) > ucase(pivot)
hiSwap = hiSwap - 1
wend
Elseif (ignoreWhiteSpace = 1) Then
While trim(vec(hiSwap)) > trim(pivot)
hiSwap = hiSwap - 1
wend
Else
While vec(hiSwap) > pivot
hiSwap = hiSwap - 1
wend
End If
'-- swap values if out of order
If loSwap < hiSwap Then
temp = vec(loSwap)
vec(loSwap) = vec(hiSwap)
vec(hiSwap) = temp
End If
loop While loSwap < hiSwap
vec(loBound) = vec(hiSwap)
vec(hiSwap) = pivot
'-- Recursively sort the partitions
'-- if there are 2 or more items in first partitions
If loBound < (hiSwap - 1) Then
Call QuickSort(vec,loBound,hiSwap-1,ignoreWhiteSpace, ignoreCase)
End If
'-- 2 or more items in second section
If hiSwap + 1 < hibound Then
Call QuickSort(vec,hiSwap+1,hiBound, ignoreWhiteSpace, ignoreCase)
End If
End Sub
Sub SortLines
'DESCRIPTION: Sorts the selected lines
If ActiveDocument.Type <> "Text" Then
MsgBox "This macro can only be run when a text editor window is active."
Exit Sub
End If
'-- make sure the top of the selection is really the top
StartLine = ActiveDocument.Selection.TopLine
EndLine = ActiveDocument.Selection.BottomLine
If EndLine < StartLine Then
Temp = StartLine
StartLine = EndLine
EndLine = Temp
End If
'-- collect the lines of the selection into an array
lines = CollectLines(ActiveDocument.Selection)
If isnull(lines) Then
Exit Sub
End If
If (ubound(lines) <= 0) Then
'-- don't try to sort an empty selection
Exit Sub
End If
'-- sort the array
Call QuickSort (lines, lbound(lines), ubound(lines), 0, 0)
'-- select the entire original selection, then delete it
ActiveDocument.Selection.GoToLine StartLine
ActiveDocument.Selection.LineDown dsExtend, (EndLine - StartLine) + 1
ActiveDocument.Selection.Delete
'-- write the sorted lines out to the file
For i = 0 To EndLine - StartLine
ActiveDocument.Selection = lines(i)
Next
End Sub

Comments
There are no comments yet. Be the first to comment!