Improved line sort

This article was contributed by Chad Loder.

These macros offer several improvements over the bubblesort line sorting macro submitted by Harald Mueller.

The improvements include:

  1. The sorting is done off-screen in an array, which boosts the speed tremendously. Mueller's sorting implementation "bubbles" the lines directly in the file, so the bottleneck of his macro is in screen redrawing, not in his choice of algorithm.
  2. The algorithm used is QuickSort, which has O(n log n) complexity. Mueller uses BubbleSort, which has O(n^2) complexity.
  3. This implementation allows you to ignore case when sorting.
  4. This implementation allows you to ignore leading and trailing whitespace when sorting.

Mueller's implementation also contains a bug that occurs when you have chosen to disable editing of read-only files and you try to sort a read-only file. Since he does not build an internal representation of the lines (he sorts right in the document), the lines in the read-only file are never put in the correct order, and the sorting loop never exits.

That having been said, Mueller's implementation has some advantages over mine:

  1. His implementation should handle arbitrarily large selections because he does not store all the lines in memory. My implementation will fail for large selections, because it tries to store the entire selection in memory. I have verified that my implementation seems to fail with selections of around 500 lines. I don't know whether this is because of an array size limitation or whether it is dependent on the length (in bytes) of the selection.
  2. With Mueller's approach, one can see the bubble sorting happening on the screen, which is rather entertaining and, for intro-Comp. Sci majors, edifying.

Without further ado, here is the code. There are three routines:

  • CollectLines, which takes a selection and returns a VBScript array containing all the lines of the selection.
  • QuickSort, which takes an array and some bounds and sorts the array recursively.
  • SortLines, which is the framework that does some verification on the document and then calls CollectLines and QuickSort, and then rewrites the sorted output.

Function CollectLines(Selection)
 '-- make sure the top of the selection is really the top
 StartLine = Selection.TopLine
 EndLine = Selection.BottomLine
 If EndLine < StartLine Then
    Temp="StartLine" 
    StartLine="EndLine" 
    EndLine="Temp" 
 End If 
 Dim lines() '-- don't try to collect an empty selection 
 If StartLine > EndLine Then
   Redim lines(0)
   CollectLines = lines
   Exit Function
 End If

 '-- collect all the lines of the selection into an array
 '-- this could be prohibitive on large selections ( > 2M ? )
 Redim lines(EndLine - StartLine)
 For i = StartLine To EndLine > Selection.GoToLine i
 Selection.SelectLine
 lines(endLine - i) = Selection.Text
 Next

 CollectLines = lines
End Function




'-- An internal routine To sort an array
'-- Specify ignoreWhiteSpace = True To ignore leading and trailing whitespace
'-- Specify ignoreCase = True To compare strings ignoring case

Sub QuickSort(vec,loBound,hiBound,ignoreWhiteSpace,ignoreCase)
 Dim pivot,loSwap,hiSwap,temp

 '-- This procedure is adapted from the algorithm given in:
 '--    Data Abstractions & Structures using C++ by
 '--    Mark Headington and David Riley, pg. 586

 '-- two items To sort
 If hiBound - loBound = 1 Then
    If vec(loBound) > vec(hiBound) Then
       temp=vec(loBound)
       vec(loBound) = vec(hiBound)
       vec(hiBound) = temp
    End If
 End If

 '-- three or more items To sort
 pivot = vec(int((loBound + hiBound) / 2))
 vec(int((loBound + hiBound) / 2)) = vec(loBound)
 vec(loBound) = pivot
 loSwap = loBound + 1
 hiSwap = hiBound

 do
 '-- find the correct loSwap
 vecLoSwap = vec(loSwap)
 If (ignoreCase = 1 And ignoreWhitespace = 1) Then
    While loSwap < hiSwap and ucase(trim(vec(loSwap))) <= ucase(trim(pivot))
        loSwap = loSwap + 1
    wend
 Elseif (ignoreCase = 1) Then
    While loSwap < hiSwap and ucase(vec(loSwap)) <= ucase(pivot)
        loSwap = loSwap + 1
    wend
 Elseif (ignoreWhiteSpace = 1) Then
    While loSwap < hiSwap and trim(vec(loSwap)) <= trim(pivot)
         loSwap = loSwap + 1
    wend
 Else
    While loSwap < hiSwap and vec(loSwap) <= pivot
         loSwap = loSwap + 1
    wend
 End If

 '-- find the correct hiSwap
 If (ignoreCase = 1 And ignoreWhitespace = 1) Then
 While ucase(trim(vec(hiSwap))) > ucase(trim(pivot))
     hiSwap = hiSwap - 1
 wend
 Elseif (ignoreCase = 1) Then
 While ucase(vec(hiSwap)) > ucase(pivot)
    hiSwap = hiSwap - 1
 wend
 Elseif (ignoreWhiteSpace = 1) Then
 While trim(vec(hiSwap)) > trim(pivot)
    hiSwap = hiSwap - 1
 wend
 Else
 While vec(hiSwap) > pivot
    hiSwap = hiSwap - 1
 wend
 End If

 '-- swap values if out of order
 If loSwap < hiSwap Then
 temp = vec(loSwap)
 vec(loSwap) = vec(hiSwap)
 vec(hiSwap) = temp
 End If
 loop While loSwap < hiSwap

 vec(loBound) = vec(hiSwap)
 vec(hiSwap) = pivot

 '-- Recursively sort the partitions

 '-- if there are 2 or more items in first partitions
 If loBound < (hiSwap - 1) Then
 Call QuickSort(vec,loBound,hiSwap-1,ignoreWhiteSpace, ignoreCase)
 End If

 '-- 2 or more items in second section
 If hiSwap + 1 < hibound Then
 Call QuickSort(vec,hiSwap+1,hiBound, ignoreWhiteSpace, ignoreCase)
 End If

End Sub



Sub SortLines
'DESCRIPTION: Sorts the selected lines
 If ActiveDocument.Type <> "Text" Then
    MsgBox "This macro can only be run when a text editor window is active."
 Exit Sub
 End If

 '-- make sure the top of the selection is really the top
 StartLine = ActiveDocument.Selection.TopLine
 EndLine = ActiveDocument.Selection.BottomLine
 If EndLine < StartLine Then
 Temp = StartLine
 StartLine = EndLine
 EndLine = Temp
 End If

 '-- collect the lines of the selection into an array
 lines = CollectLines(ActiveDocument.Selection)


 If isnull(lines) Then
 Exit Sub
 End If

 If (ubound(lines) <= 0) Then
    '-- don't try to sort an empty selection
    Exit Sub
 End If

 '-- sort the array
 Call QuickSort (lines, lbound(lines), ubound(lines), 0, 0)

 '-- select the entire original selection, then delete it
 ActiveDocument.Selection.GoToLine StartLine

 ActiveDocument.Selection.LineDown dsExtend, (EndLine - StartLine) + 1
 ActiveDocument.Selection.Delete

 '-- write the sorted lines out to the file
 For i = 0 To EndLine - StartLine
 ActiveDocument.Selection = lines(i)
 Next
End Sub



Comments

  • Output faster -> handy for undo

    Posted by Legacy on 01/29/2003 12:00am

    Originally posted by: Stephan

    If you want to undo your sorting, you have to undo every line manually.

    If you change the last bit of code into this, the output it written in 1 statement, allowing to do a undo in one ctrl-Z

    Change the last bit into:

    ActiveDocument.Selection.LineDown dsExtend, (EndLine -StartLine) + 1

    '-- write the sorted lines out to the file
    For i = 0 To EndLine - StartLine
    Outline = Outline + lines(i)
    Next
    ActiveDocument.Selection = Outline

    Reply
  • Won't work for large number of lines

    Posted by Legacy on 07/26/2001 12:00am

    Originally posted by: Dave

    I tried to sort my string table which is 635 lines and this macro failed. If I separated the job into two jobs of ~300 lines each, it worked fine. But now I have two sorted lists of ~300 lines when I need a whole list of 635 sorted.

    Reply
  • Works fine

    Posted by Legacy on 12/01/2000 12:00am

    Originally posted by: Rainer

    With the correction it works fine. Thank you very much. I have always been surprised that MS has forgotten such a useful feature.
    
    

    Rainer

    Reply
  • Correction

    Posted by Legacy on 05/19/1999 12:00am

    Originally posted by: Chad Loder

    I don't know whether the syntax error was my fault
    
    or that of the HTML formatting, but the line should
    read:

    For i = StartLine To EndLine
    Selection.GoToLine i
    Selection.SelectLine
    lines(endLine - i) = Selection.Text
    Next

    That is, replace the > (greater than symbol) with a
    newline.

    Thanks for finding it.

    chad

    Reply
  • Syntax error

    Posted by Legacy on 05/19/1999 12:00am

    Originally posted by: Art Metz

    FYI - This line has a syntax error:
    For i = StartLine To EndLine > Selection.GoToLine i

    Reply
Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • The explosion in mobile devices and applications has generated a great deal of interest in APIs. Today's businesses are under increased pressure to make it easy to build apps, supply tools to help developers work more quickly, and deploy operational analytics so they can track users, developers, application performance, and more. Apigee Edge provides comprehensive API delivery tools and both operational and business-level analytics in an integrated platform. It is available as on-premise software or through …

  • Java developers know that testing code changes can be a huge pain, and waiting for an application to redeploy after a code fix can take an eternity. Wouldn't it be great if you could see your code changes immediately, fine-tune, debug, explore and deploy code without waiting for ages? In this white paper, find out how that's possible with a Java plugin that drastically changes the way you develop, test and run Java applications. Discover the advantages of this plugin, and the changes you can expect to see …

Most Popular Programming Stories

More for Developers

Latest Developer Headlines

RSS Feeds