Environment: Developed in (but not restricted to) VC6
This article give a brief, non-academic (=no dwelling into time complexity
issues), explanation of what binary trees are about. It also provide source and demo
of a generic tree class. Functionality to balance the tree is also
described/provided.
About binary trees
Say you have a list or array holding N numbers of elements, and want to
search for a specific element. You then have to go through (aka traverse) each element one by one until you find it (or find out that it is not in the list).
The worst case (if the element is last in the list, or isn’t there at all)
you’d have to make N comparisons, which can be quite a treat if the list/array is
VeryVeryBig(tm).
By having the elements stored in a tree rather a list you get some key
benefits:
- Faster search (don’t have to compare with all elements)
- Elements are automatically sorted as they are added
How does it work then
In a linked list you have a line of elements, each elements pointing to the
next:
Fig. 1. A linked list
In a binary tree, each element instead point to two other elements, one
“down to the left” (left child) and one “down to the right” (right child).
The left child’s key value is less than its parent’s, and the right child’s
key value is greater than or equal than its parent’s:
Fig. 2. A binary tree
Thanks to this, can you fairly easy decide where to search (and where to not
search) for a particular element.
If the current node doesn’t match the search criteria, query the left or the
right child, depending of wheter the search criteria is < or >= than current
node’s key value.
Sorting
Since all elements are stored according to their key values, you can easily
retrieve them in an ordered (or reversed) manner. Oh, and when working with trees,
you really see the benefits of using recursion…
Example of printing the elements in order:
PrintElement(pTopMostElement)
.
.
void PrintElement(TreeElement* pElement)
{
if (pElement)
{
PrintElement(pElement->LeftChild)
pElement->PrintMe()
PrintElement(pElement->RightChild)
}
}
Example of printing the elements in reversed order:
PrintElementReversed(pTopMostElement)
.
.
void PrintElementReversed(TreeElement* pElement)
{
if (pElement)
{
PrintElementReversed(pElement->RightChild)
pElement->PrintMe()
PrintElementReversed(pElement->LeftChild)
}
}
Bringing balance to the force
Now…there is a slight catch of course :-). The order in which you add
elements to the tree affect how the tree looks.
Lets say we have a bunch or integer elements 1..9
If they are added to the tree in this order: 3,6,4,8,1,9,2,7,5, you’d get a
tree looking somewhat like this:
Fig. 3. A tree of integers
BUT if they instead are added in order (1,2,3,4,5,6,7,8,9) you get a tree
looking like this, which pretty much is a linked list that has no benefits of
actually supporting a tree structure.
Fig. 4. Another “tree” of integers.
This is where the topic of balance come in, the more evenly distributed the
elements are, the better the tree is balanced. Fig 3 above shows a somewhat balanced
tree, fig. 4 shows an utterly unbalanced tree.
So, how do you keep the tree balanced? Well you could…
(1) Add elements in a un-ordered fashion.
Humm, not really an option since you just about never can control what’s
input and when.
(2) Randomize the tree.
You give the element to insert a random key value, insert it and give it
back its original key value. Then you let it “rotate” up until it is at a valid
position. The mechanism is called “Randomized tree”. If you want to know
more about this, attend a basic computer science class ;-).
(3) Rebuild the entire tree
More about this below.
There are probably more variants, but I’ll illustrate one way of Rebuild
the entire tree, making it balanced. This is also how it is done in the source code provided.
Rebuild the entire tree
So…you have a tree you want to balance…here’s one way to do it:
(1) Copy the elements to an array so that the array holds them in
(ascending) order.
(2) Clear the tree.
(3) Add elements from the array, in a highly “un-ordered” manner.
(1) is easily done. As mentioned, a tree is sorted “by default”, its just a
matter of traversing properly. If we are using pointers to the elements (we are) we
don’t copy the elements, but just copy the pointers.
(2) is also quite easy, simply remove the top most element.
(3) now this is a tad bit tricky….what do we mean by “un-ordered”. Well,
anyway, we want the “middle-most” element to be on the top, and then the middle-most of the left portion to be left child and the middle-most of the right portion to be right child etc, kinda like this:
Fig. 5. From an ordered array to a tree.
Hmmmm…feels like something about recursion is afoot:
// Assuming array ranges from [0..arraySize-1]
GetFromOrderedArray(0,arraySize-1)
.
.
void GetFromOrderedArray(int lowBound,int highBound)
{
if (hi < low) return;
middlePos = lowBound+(highBound-lowBound)/2
// middlePos is now at the element in the middle
// between lowBound and highBound, so we just add
// it to the treeAddElement ( theOrderedArray[middlePos] )
// Pick the middle one “to the left”
AddFromOrderedArray(lowBound,middlePos-1)
// Pick the middle one “to the right”
AddFromOrderedArray(middlePos+1,highBound)
}
Deleting elements
As you probably realize can you not just delete an element and get away with
it, the child elements would then be lost in cyberspace. First of all, if you’re
gonna delete the Element E, you will have to do something to E’s parent, like
NULLing its child reference to E. There are essentially two ways of getting the parent
of an element:
(1) Traverse until you find the element whose LeftChild or RightChild
== E, or
(2) Letting all elements have a reference to their parent set when
the node is
inserted into the tree (this is how its done in the provided code).
So, how to remove the element, E? One quite simple solution is to:
(1) Disconnect E from its parent.
(2) Add E’s left and right child to the tree in the same manner any
other elements are added.
(3) Zap E
This will work if the adding doesn’t tamper with the inserted element’s
children (no reason it should really). Remember we haven’t done anything with E’s
children, so they will still hold their sub-tree structures (if any).
Example:
void RemoveElement(TreeElement* theOneToRemove)
{
TreeElement* pParent = theOneToRemove->GetParent();
// Ok, so it has a parent, then we’ll simply just disconnect it.
if (pParent)
{
if (pParent->GetLeftChild() == theOneToRemove)
{
pParent->SetLeftChild(NULL);
}
else
{
ASSERT(pParent->GetRightChild() == theOneToRemove);
pParent->SetRightChild(NULL);
}
}
else
{
// No parent? Then we’re removing the root element.
theTopMostElement = NULL;
}
// Disconnected, now we reconnect its children (if any)
// just by adding them as we add any other node.
if (theOneToRemove->GetLeftChild())
AddElement(theOneToRemove->GetLeftChild());
if (theOneToRemove->GetRightChild())
AddElement(theOneToRemove->GetRightChild());
//Zap the element (if that’s what you want to do)
delete theOneToRemove;}
Hmmm…that’s about it. For more details, check the demo project’s source
code.
Downloads
The demo project is a console application holding the source (commented) to
a generic tree structure and some stuff illustrating its use. It also shows how to
have a quite flexible tree traversing mechanism using callback functions.
Key files:
BinTree.h /.cpp : Definition of CBinTreeNode and CBinTree classes wich are
the core bin tree classes.
StringIntTree.h/.cpp : Definition of classes inheriting above mentioned
classes. Holds a tree of elements with a string (key value) and an integer.
BinTreeDemo.cpp : Builds a small tree and performs various stuff on it.