SortKeyToMultiValueContainer C++ Container Template

In this article....

  • Introduction
  • Compatibility
  • What it does
  • What it does NOT
  • Background
  • Motivation
  • How to define the value, the key and, the
  • container class and the cursor
  • How to insert the values
  • How to scan the entire set of values
  • How to search for a key and scan the set of result values
  • Source Code
  • Test
  • History
  • Author

Figure 1: The sample project runs a set of tests to check source code correctness

Introduction

This article proposes my personal approach to the problem of storing and retrieving data in memory through the lmtl::SortKeyToMultiValueContainer class template.

Don't expect to find a replacement of STL, but a class template written to solve a special problem and that can be useful in building generic data structures.

I wrote some tests to check the software and it seems to work fine, but I dont' take any responsibility: Use it at your own risk.

Compatibility

I compiled and tested the source code file html.h under both Microsoft Visual C++ 6.0 and Borland C++ Builder 5.0.

The source did not require any change.

What It Does

Here is a list of the functionalities of lmtl::SortKeyToMultiValueContainer in version 1.0.

Description
Stores data in memory ("by value" copy)
Maintains data sorted by a user definable key, its operator<() and its operator==()
Maintains a match between keys and multiple values
Keeps the data structure efficient using a new internal balancing algorithm
Reads ascendently the data stored
Inserts data whenever needed
Finds all the data matching a given key
Reads ascendently the found data from start to end
In case of equality between keys, keeps the insertion order in returning the results

What It Does NOT Do

The following is obviously only a partial list. There would be too many the things that this class does not do to include them all.

Description Comment
It does not delete data Deletion is not implemented yet, only the creation of the data structure, growing with insertion, ascending read, and find with ascending read.
It does not read sequentially in reverse order (in descending order) Bidirectional iteration is not yet implemented, only forward (ascending).

Background

In C++, there are many data structures available, such as arrays, vectors, lists, binary trees, hash tables, maps, multimaps, and so on. Every data structure has its own advantages, disadvantages, and limitations.

Now, let's talk about "multimap." A multimap is a container that keeps the following in memory:

  1. a set of keys ki
  2. a set of values vi,j
  3. a set of links between the keys and the values

Figure 2: keys, values, and links in a multimap.

In a multimap, you can insert data pairs (k,v), scan the data sorted by key, delete the data, and retrieve a value v given its key k.

SortKeyToMultiValueContainer is a template class very similar to a multimap.

With SortKeyToMultiValueContainer, you can easily implement a lot of data structures because you can manage the "functional dependencies" that are links (or matches) between data. The concept of "functional dependency" is based the relational algebra and thus all the relational databases.

With SortKeyToMultiValueContainer, you can read the values sequentially. If we assume that:

  1. k1 < k2
  2. k2 < k3
  3. the pair (k1,v1,2) was inserted after the pair (k1,v1,1)
  4. the pair (k3,v3,2) was inserted after the pair (k3,v3,1)
  5. the pair (k3,v3,3) was inserted after the pair (k3,v3,2)

then we can read the values sequentially:

  1. v1,1
  2. v1,2
  3. v2,1
  4. v3,1
  5. v3,2
  6. v3,3

To read sequentially, we use a cursor: SortKeyToMultiValueContainer::Cursor. Cursors are also called iterators. With a cursor, we can read the entire set of values sequentially.

Figure 3: scanning the entire set of values

We can also make a search by a given key, retrieve the result, and scan it.

Figure 4: scanning the result of a search

SortKeyToMultiValueContainer C++ Container Template

Motivation

Have you ever tried to manage, in C++, a collection of data in memory? Did you try STL or some other container template library? Have you ever had problems with the library, tried to debug it, and discovered that is so sophisticated, generic, compact, and well written that it turns out to be incomprehensible when you try to single-step in it? And the doubt that you are not using it the right way, or that a patch exists somewhere, did you have it?

I was developing a simple "software engineering tool" to be used to manage a software model. I used STL multimaps and maps. Everything seemed to work fine until I discovered that I missed some data. I lost a couple of days trying in vain to debug the software, so I began to have doubts: Perhaps the library version in this compiler is not updated; perhaps I'm doing something wrong; perhaps there is something that I haven't understood; perhaps there is a cast operator that turns my variable into something that I didn't expect to exist but is perfectly correct syntactically.

So, I thought that the only way to make it work was to write the container class template by myself, to put my own mistakes in it, and to correct them. I don't mean that STL has bugs. I only know that by using this class, my software, that is based on it, now works. It doesn't matter why it didn't work before.

I wrote this class using a binary tree and a new algorithm to keep it balanced. I also wrote a set of functions to test the correctness of the software and its speed. I think that now it works fine and is acceptably fast.

Suggestions are welcome.

How to Define the Value, Key, Container Class, and Cursor

//
// lmtl_how_to_define.h
//


//
// how to define the value, the key, the container, and the cursor
//
class myValue {
  public:
    myValue(){m_value=NULL;};    // default constructor is mandatory
    myValue(const char * value){m_value=value;}; // convenient,
                                                 // not mandatory

    const char * m_value;    // your value data here

};

class myKey {
      public:
      myKey(){m_key=-1;};    // default constructor is mandatory
      myKey(int value){m_key=value;}; // convenient,
                                      // not mandatory
      bool operator==(const myKey &toCompare) const  // mandatory
          {return m_key==toCompare.m_key;}; 
      bool operator<(const myKey &toCompare) const   // mandatory
           {return m_key<toCompare.m_key;}; 

     int m_key;    // your key data here
};

#include "lmtl.h"
typedef lmtl::SortKeyToMultiValueContainer<myKey,myValue>
        myContainer;
typedef lmtl::SortKeyToMultiValueContainer<myKey,
        myValue>::Cursor myCursor;


void doIt();

How to Insert the Values

//
// lmtl_how_to_work.cpp
//

#include<ostream.h>    // needed only to print to console
#include "lmtl_how_to_define.h"

void doIt()
{
   cout<<endl<<endl<<"Let's run the example"<<endl;

   //
   // how to insert the values
   //

   myContainer container;           // create the container

   container.Insert(1,"V(1,1)");    // insert the data
   container.Insert(1,"V(1,2)");    // with convenient
                                    // constructors
   container.Insert(2,"V(2,1)");    // you don't have to make many
   container.Insert(3,"V(3,1)");    // class instances and
   container.Insert(3,"V(3,2)");    // fill them with data
   container.Insert(3,"V(3,3)");

SortKeyToMultiValueContainer C++ Container Template

How to Scan the Entire Set of Values

   //
   // how to scan the entire set of values
   //

   myCursor cursor(container);

   cout<<"entire set of values"<<endl;

   while(cursor.Valid()) {                   // test we are not at
                                             // the end
      cout<<cursor.Value().m_value<<endl;    // extract the value
      cursor++;                              // move to next value
   }

How to Search for a Key and Scan the Set of Result Values

   //
   // How to search for a key and scan the set of result values
   //

   myCursor result=container.Find(3);

   cout<<"set of result values"<<endl;

   while(result.Valid()) {                   // test we are not at
                                             // the end
      cout<<result.Value().m_value<<endl;    // extract the value
      result++;                              // move to next value
   }

How to Count the Number of Values

   //
   // how to count the number of values
   //

   // count the whole size
   cout<<"there are "<<container.Size()<<" values in the container"
       << endl;

   // count the number of results for the key 3
   cout<<"there are "<<container.Find(3).SizeToScan()<<" values for
          the key 3" << endl;


}

SortKeyToMultiValueContainer C++ Container Template

Source Code

The following is the complete source code of the SortKeyToMultiValueContainer template class.

//
//
// LMTL.H v. 1.0
//
// implementation of SortKeyToMultiValueContainer template
//
// use it at your own risk
//
//
// L U C A   M I S S O R A   2 0 0 3
//
// L M I S S O R A @ H O T M A I L . C O M
//
// L U C A @ M I S S O R A . I T
//
//
// Documentation at http://www.missora.it/
//    sort_key_to_multi_value_container_template_index.html
//
#ifndef LMTL_H
#define LMTL_H
namespace lmtl {

#ifndef NULL
   #define NULL 0
#endif

   class exception_memory {
   };

   class exception_correct {
   };

   namespace test {
      class pair {public: int k;int v;};
      enum ETest {EFillAndClear,EScan,EFind};
      void correct();    // tests to check correctness of the code
//      void speed();    // tests to check speed of the code

      void treenode(enum ETest which,int num_elem,bool random,
                    int repetitions);
      void SortKeyToMultiValueContainer(enum ETest which,
                                        int num_elem,
                                        int repetitions);

      void fill_example_data(int *where,bool random_or_sequential,
                             bool distinct_values,int num_elem,
                             int max_value);
      int compare_int( const void *arg1, const void *arg2 );
      int compare_pair( const void *arg1, const void *arg2 );
   }

template <class val>
class listnode {
   public:
      listnode(){
         m_next=NULL;
      }
      listnode(const val &v){
         m_next=NULL;
         m_val=v;
      }
      ~listnode(){
            free();
      }
      const val & value() const {
         return m_val;
      }

      void operator<<(val v) {
         listnode *n=new listnode<val>;
         if(n==NULL)
            throw exception_memory();
         n->m_val=v;
         operator<<(n);
      }
      listnode *next() const {
         return m_next;
      }
      private:
         val m_val;
      public:
         long size(){
         long ret=1;
         if(m_next==NULL)
            return ret;
         listnode *currentl=this;
         for(;;) {
            if(currentl->m_next==NULL)
               break;
            currentl=currentl->m_next;
            ret++;
         }
         return ret;
      }
   private:
      listnode *m_next;
      void free(){
         if(m_next==NULL)
            return;
         listnode *currentl=m_next;
         listnode *nextl;
         while(currentl!=NULL) {
            nextl=currentl->m_next;
            currentl->m_next=NULL;
            delete currentl;
            currentl=nextl;
         }
      }
      void operator<<(const listnode * n) {
         if(m_next!=NULL)
            *m_next<<n;
         else
            m_next=(listnode *)n;
      }
};


template <class val>
class treenode {
   public:
      treenode() {
         m_list_first=NULL;
         m_list_last=NULL;
         m_parent=NULL;
         m_less=NULL;
         m_more=NULL;
         m_more_count=0;
         m_less_count=0;
      }
      ~treenode(){
            clear();
      }
      const val & value() const {
         return m_list_last->value();
      }
      void check() const {
#ifdef LMTL_DEBUG
         if(m_list_first!=NULL) {
            if(m_list_last==NULL)
               throw exception_correct();
            if(m_list_first->next()!=NULL &&
               m_list_last==m_list_first)
               throw exception_correct();
         }

         if(this==NULL)
            throw exception_correct();
         if((m_less==NULL) != (m_less_count==0))
            throw exception_correct();
         if((m_more==NULL) != (m_more_count==0))
            throw exception_correct();
         if((m_list_first==NULL) != (m_list_last==NULL))
            throw exception_correct();
         if(m_less!=NULL) {
            if(m_less_count!=m_less->m_more_count+m_less->
               m_less_count+m_less->m_list_first->size())
               throw exception_correct();
            if(m_less->m_parent!=this)
               throw exception_correct();
            if(m_list_first->value() < m_less->
               m_list_first->value() || m_less->
               m_list_first->value()==m_list_first->value())
               throw exception_correct();
         }
         if(m_more!=NULL) {
            if(m_more_count!=m_more->m_more_count+m_more->
               m_less_count+m_more->m_list_first->size())
               throw exception_correct();
            if(m_more->m_parent!=this)
               throw exception_correct();
            if(m_more->m_list_first->value()<
               m_list_first->value() ||
               m_more->m_list_first->value()==m_list_first->value())
               throw exception_correct();
         }
#endif    // LMTL_DEBUG
      }
      void recur_check() const {
#ifdef LMTL_DEBUG
         check();
         if(m_less!=NULL)
            m_less->recur_check();
         if(m_more!=NULL)
            m_more->recur_check();
#endif    // LMTL_DEBUG
      }

      void operator<<(const val &v) {
         bool optimized=false;
         if(m_list_last==NULL) {
            m_list_last=new listnode<val>(v);
            if(m_list_last==NULL)
               throw exception_memory();
            if(m_list_first==NULL)
               m_list_first=m_list_last;
            return;
         }
         treenode *current=this;
         for(;;) {
            if(v<current->m_list_first->value()) {
               if(current->m_less!=NULL) {
                  if(optimized || ! current->m_less->
                     optimize(current)) {// (*)
                     optimized=false;
                     current->m_less_count++;
                     current=current->m_less;
                  }
                  else
                     optimized=true;
                  // else repeats loop
               }
               else {
                  current->m_less_count++;
                  current->m_less=new treenode;
                  if(current->m_less==NULL)
                     throw exception_memory();
                  current->m_less->m_parent=current;
                  *(current->m_less)<<v;
                  return;
               }
            }
            else if(v==current->m_list_first->value()) {
               *(current->m_list_last)<<v;
               current->m_list_last=current->m_list_last->next();
               return;
            }
            else {
               if(current->m_more!=NULL) {
                  
                  static long dbgcounter=0;
                  dbgcounter++;


                  if(optimized || ! current->m_more->
                     optimize(current)) { // (*)
                     optimized=false;
                     current->m_more_count++;
                     current=current->m_more;
                  }
                  else
                     optimized=true;
                  // else repeats loop
               }
               else {
                  current->m_more_count++;
                  current->m_more=new treenode;
                  if(current->m_more==NULL)
                     throw exception_memory();
                  current->m_more->m_parent=current;
                  *(current->m_more)<<v;
                  return;
               }
            }
         }
      }

      void operator<<(treenode *n) {
         check();
         n->check();
         treenode *current=this;
         for(;;){
            if(n->m_list_first->value()<current->
                  m_list_first->value()) {
               current->m_less_count+=n->m_more_count+n->
                        m_less_count+n->m_list_first->size();
               if(current->m_less!=NULL) {
                  current=current->m_less;
               }
               else {
                  current->m_less=n;
                  n->m_parent=current;
                  return;
               }
            }
            else if(n->m_list_first->value()==current->
                       m_list_first->value()) {
               throw exception_correct();
            }
            else {
               current->m_more_count+=n->m_more_count+n->
                        m_less_count+n->m_list_first->size();
               if(current->m_more!=NULL)
                  current=current->m_more;
               else {
                  current->m_more=n;
                  n->m_parent=current;
                  return;
               }
            }
         }
      }

      treenode * find(val v) {
         bool optimized=false;
         if(this==NULL)
            throw exception_correct();
         treenode *current=this;
         for(;;) {
            current->check();
            if(v<current->m_list_first->value()) {
               if(current->m_less==NULL)
                  return NULL;
               if(optimized || ! current->m_less->
                  optimize(current)) {
                  optimized=false;
                  current=current->m_less;  //should always be
                                            //called without optimize
               }
               else
                  optimized=true;
               // else repeats loop
            }
            else if(v==current->m_list_first->value()) {
               return current;
            }
            else {
               if(current->m_more==NULL)
                  return NULL;
               if(optimized || ! current->m_more->
                  optimize(current)) {
                  optimized=false;
                  current=current->m_more;  //should always be
                                            //called without optimize
               }
               else
                  optimized=true;
               // else repeats loop
            }
         }
      }

      long depth() const {
         long l=0;
         long m=0;
         if(m_less!=NULL)
            l=m_less->depth();
         if(m_more!=NULL)
            m=m_more->depth();

         return m>l ? m+1 : l+1;
      }

      long size() const {
         long l=0;
         long m=0;
         if(m_less!=NULL)
            l=m_less->size();
         if(m_more!=NULL)
            m=m_more->size();

         return m_list_first->size()+l+m;
      }
   public:
      listnode<val> *m_list_first;
      listnode<val> *m_list_last;
      treenode *m_parent;
      treenode *m_less;
      treenode *m_more;
      long m_less_count;
      long m_more_count;

   private:
      void insert_nodes(treenode *current) {
         treenode *less=current->m_less;
         treenode *more=current->m_more;
         current->m_less=NULL;
         current->m_more=NULL;
         current->m_less_count=0;
         current->m_more_count=0;
         operator<<(current);
         if(less!=NULL)
            insert_nodes(less);
         if(more!=NULL)
            insert_nodes(more);
      }
      void unlink_parent_until(treenode *far_parent) {
         if(m_parent!=NULL) {
            long deleted_nodes=m_more_count+m_less_count+
                               m_list_first->size();
            treenode *current=m_parent;
            if(m_parent->m_more==this) {
               m_parent->m_more=NULL;
               m_parent->m_more_count=0;
            }
            else if(m_parent->m_less==this){
               m_parent->m_less=NULL;
               m_parent->m_less_count=0;
            }
            else
               throw exception_correct();
            m_parent=NULL;
            while(current!=far_parent) {
               if(current->m_parent->m_more==current) {
                  current->m_parent->m_more_count-=deleted_nodes;
               }
               else if(current->m_parent->m_less==current){
                  current->m_parent->m_less_count-=deleted_nodes;
               }
               else
                  throw exception_correct();
               current=current->m_parent;
            }
         }
      }
   public:
      void clear(){
         treenode *current=this;
         treenode *next;
         for(;;){
            if(current==NULL)
                  throw exception_correct();
            if(current->m_list_first!=NULL) {
               delete current->m_list_first;
               current->m_list_first=NULL;
               current->m_list_last=NULL;
            }
            if(current->m_less!=NULL) {
               next=current->m_less;
               current->m_less=NULL;
               current=next;
            }
            else if(current->m_more!=NULL) {
               next=current->m_more;
               current->m_more=NULL;
               current=next;
            }
            else {
               if(current==this) {
                  current->m_less_count=0;
                  current->m_more_count=0;
                  break;
               }
               if(current->m_parent==NULL)
                  return;
               next=current->m_parent;
               current->m_parent=NULL;
               current->m_less_count=0;
               current->m_more_count=0;
               delete current;
               if(next==NULL)
                  throw exception_correct();
               current=next;
            }
         }
      }
   private:
      bool optimize(treenode *parent) {

         long more_count=m_more_count;
         long less_count=m_less_count;
         treenode *current=this;

         //static unsigned char s_scaler=0;
         //if(s_scaler++ & 0x01!=0)
         //   return false;

         while(((more_count>>1)>less_count) ||
               ((less_count>>1)>more_count)) {

            if(current->m_more_count>current->m_less_count) {
               if(current->m_more==NULL)
                  break;
               less_count+=current->m_more->m_less_count +
                           current->m_list_first->size();
               more_count+=(-current->m_more->m_less_count) +
                           (-current->m_more->m_list_first->size());
               current=current->m_more;
            }
            else{
               if(current->m_less==NULL)
                  break;
               more_count+=current->m_less->m_more_count+
                           current->m_list_first->size();
               less_count+=(-current->m_less->m_more_count) +
                           (-current->m_less->m_list_first->size());
               current=current->m_less;
            }
         }


         if(current!=this) {
            if(current==m_less)
               return false;
            if(current==m_more)
               return false;

            current->unlink_parent_until(parent);

            treenode *less=m_less;
            treenode *more=m_more;
            if(less!=NULL )
               less->unlink_parent_until(parent);

            if(more!=NULL)
               more->unlink_parent_until(parent);

            this->unlink_parent_until(parent);

            parent->insert_nodes(current);

            if(less!=NULL ) {
               parent->insert_nodes(less);
            }
            if(more!=NULL) {
               parent->insert_nodes(more);
            }
            parent->insert_nodes(this);
            parent->recur_check();
            check();
            current->check();

            return true;
         }
         return false;
      }
   public:

   class Cursor
   {
   public:
      Cursor(const treenode *t=NULL,const listnode<val> *l=NULL) {
         m_treenode=(treenode *)t;
         m_listnode=(listnode<val> *)l;
      }

      Cursor(const treenode & t) {
         m_treenode=(treenode *)&t;
         m_listnode=t.m_list_first;
         go_first();
      }
      void go_first(){
         while(m_treenode->m_less!=NULL)
            m_treenode=m_treenode->m_less;
         m_listnode=m_treenode->m_list_first;
      }
      void next_node() {
         if(m_treenode->m_more!=NULL){
            m_treenode=m_treenode->m_more;
            m_listnode=m_treenode->m_list_first;
            go_first();
            return;
         }
         while(m_treenode->m_parent!=NULL)
           if(m_treenode->m_parent->m_less==m_treenode) {
            m_treenode=m_treenode->m_parent;
            m_listnode=m_treenode->m_list_first;
            return;
           }
           else
            m_treenode=m_treenode->m_parent;
         m_treenode=NULL;
         m_listnode=NULL;
      }
      void operator++(int) {
         if(m_listnode->next()!=NULL) {
            m_listnode=m_listnode->next();
            return;
         }
      next_node();
      }
      bool valid() const {
         return m_treenode!=NULL && m_listnode!=NULL;
      }
      val Value() const {
         m_treenode->check();
         if(!valid())
            throw exception_correct();
         return m_listnode->value();
      }
      treenode<val> * m_treenode;
      listnode<val> * m_listnode;
   };


};


template <class key,class val>
class SortKeyToMultiValueContainer {
   private:
      class pair {
        public:
           pair() {};
           pair(const key & k, const val &v){
              m_key=k;
              m_val=v;
           }
           bool operator<(const pair &p) const {
              return m_key<p.m_key;
           }
           bool operator==(const pair &p) const {
              return m_key==p.m_key;
           }
           key m_key;
           val m_val;
      };
   public:
      void Insert(const key & k, const val &v) {
         pair p(k,v);
         m_node<<(p);
      }

      void Clear() {
         m_node.clear();
      }

   public:    // not so fine to be public here...
      treenode<pair> m_node;

   public:

   class Cursor
   {
   public:
      Cursor() {}

      Cursor(const SortKeyToMultiValueContainer & p) :
            m_cursor(p.m_node) {m_filter_key=false;}
      Cursor(const key & k,treenode<pair> *node) :
            m_cursor(node,node!=NULL ? node->m_list_first : NULL)
            {m_key=k;m_filter_key=true;}
      Cursor(const Cursor &c){m_cursor=c.m_cursor;m_key=c.m_key;
                              m_filter_key=c.m_filter_key;};

      void operator++(int) {m_cursor++;}
      bool Valid() const {return m_cursor.valid() &&
           (!m_filter_key || m_key==m_cursor.Value().m_key);}
      val Value() const {return m_cursor.Value().m_val;}
      long SizeToScan() const
         {
            long retval=0;
            treenode<pair>::Cursor c(m_cursor);
            while(c.valid() && (!m_filter_key ||
                 m_key==c.Value().m_key)){
               retval+=c.m_listnode->size();
               c.next_node();
            }
            return retval;
         }

      treenode<pair>::Cursor m_cursor;
      key m_key;
      bool m_filter_key;
   };

   public:
      Cursor Find(const key & k) {
         val v;
         pair p(k,v);
         treenode<pair> *n=m_node.find(p);
         Cursor c(k,n);
         return c;
      }
                Cursor operator[](const key & k) {
                        return Find(k);
                }
      long Size() {
         return m_node.size();
      }


};



}

#endif

Test

The sample project contains a set of tests I wrote either to check the correctness of the source or to improve its speed.

Version History

Current release is 1.0 - 18 January 2004

About the Author




Downloads

Comments

  • There are no comments yet. Be the first to comment!

Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • Live Event Date: September 19, 2014 @ 2:00 p.m. ET / 11:00 a.m. PT In response to the rising number of data breaches and the regulatory and legal impact that can occur as a result of these incidents, leading analysts at Forrester Research have developed five important design principles that will help security professionals reduce their attack surface and mitigate vulnerabilities. Check out this upcoming eSeminar and join Chris Sherman of Forrester Research to learn how to deal with the influx of new device …

  • The first phase of API management was about realizing the business value of APIs. This next wave of API management enables the hyper-connected enterprise to drive and scale their businesses as API models become more complex and sophisticated. Today, real world product launches begin with an API program and strategy in mind. This API-first approach to development will only continue to increase, driven by an increasingly interconnected web of devices, organizations, and people. To support this rapid growth, …

Most Popular Programming Stories

More for Developers

Latest Developer Headlines

RSS Feeds