SHARE

HybridList – A fast N lg (N) sort algorithm for lists

Management of large dictionaries is always a big problem. That’s why I decided to create an efficient and simple algorithm to manage very long sorted lists. The HybridList algorithm resembles skip-lists but it differs considerably in basic principles and memory usage. I hate the idea of allocating memory blocks for an array of pointers and […]

Written By

CodeGuru Staff

Mar 28, 1999

2 minute read

CodeGuru content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More

Management of large dictionaries is always a big problem. That’s why I decided to
create an efficient and simple algorithm to manage very long sorted lists.
The HybridList algorithm resembles skip-lists but it differs considerably in basic
principles and memory usage. I hate the idea of allocating memory blocks for
an array of pointers and predicting the number of nodes in a nonexistent list.

Main features
Concept
Simple example
Speed
Source

Main features

Search, insert, delete operations require O(lg n) comparisons in
all cases
As every insertion sort it is stable – it retains the original ordering
of identical keys
It is in-place algorithm – no additional memory or stack is required
Data can be linked in simple as well as double-linked
list
Only 4 bytes for ‘Next’ pointer are required in user data structure

Concept

There are two types of nodes: user bottom-nodes (blue circles) and top-nodes (green
rectangles) that create a kind of hierarchy. When user adds/removes bottom-node
to/from the list, ‘parent’ top-node on the first level is checked to keep 10-30 bottom
nodes. If node has more than 30 nodes, then new node is created and both nodes have 15
subnodes. If node has less than 10 subnodes, its right neighbor is removed. Therefore
hierarchy is always quite balanced.

QList_scheme.gif (6718 bytes)

Simple example

  #include "HybridList.h"

  class SampleNode : public HListNode <SampleNode>
  {
    // defines Prev,Next members
    public:
      int Value;  // random value to sort
      SampleNode() { Value = rand(); }
      bool operator ==(SampleNode& comp) {return Value ==comp.Value;}
      bool operator > (SampleNode& comp) {return Value > comp.Value;}
      bool operator < (SampleNode& comp) {return Value < comp.Value;}
  };

  bool TEST()
  {
    HybridList<SampleNode> List;
    for (int i=0; i < 1000; i++)
        List.InsertSorted (new SampleNode);
    SampleNode to_find;
    SampleNode * first_equal = List.FindEqual(to_find);
    while (true)
        {
          SampleNode * head = List.RemoveHead();
          if (!head)
             break; // end of list achieved
          printf ("%ld->",head->Value);
          delete head;
        }
    return true;
  }

Speed

There is no sense to compare O(lg n) algorithm with O(n) algorithm. But even O(lg
n) algorithms, like binary trees, red-black trees, skip-lists have different speed due to
balancing techniques, memory usage, comparison function, etc. My preliminary tests of
red-black tree, that seems to be the fastest algorithm for dictionaries, shows that
HybridList is more than twice as fast. But it seems to me to be not so
important with regard to memory usage issue: HybridList node require only additional 4
byte for ‘Next’ pointer and red-black node require 12-16 bytes.

More details you can find on http://members.tripod.com/~DanKozub.
Timing diagram and comparison with Quicksort and test exe-file are also available there.

Source

The algorithm is implemented in C++ . In principle it does not require any additional
libraries or system support. But I think some insignificant changes will be needed to make
it run on other platforms than Windows 95/NT.

Download source HybridList.zip (7 KB)