How C++ Implements Late Binding

Function Call Binding

Binding a function call means connecting the point of invocation to the function body. This can happen in two ways: statically or dynamically. When the binding is done before the program executes, the call is said to be early bound or statically bound. This type of binding is done during compilation by the compiler and linker, which know the function definition and every place where the function is called. The C language typically uses this type of binding. There are situations, however, where static binding does not serve the purpose and late binding is needed. Late binding means that the call is connected to a function definition at runtime, on the basis of the object the function is called on. This feature lies outside the scope of procedural languages such as C and is a hallmark of object-oriented programming languages. Late binding is also known as dynamic binding or runtime binding. A language that implements this feature needs some mechanism to determine the type of the object and select the matching member function. In compiled languages like C++, the compiler cannot know which function to call until the actual type of the object is identified, so there must be a way to identify that type at runtime. The technique varies from language to language.

The Problem with Static Binding: An Example

The problem with static binding is evident when running the following code. The function makeSound accepts a reference to the Animal class. But when we pass a Cow object to makeSound from main without any cast, the output is not what we want.

#include <iostream>
#include <string>
using namespace std;

class Animal
{
public:
   void sound() {cout<<"What animal sound? "<<endl;}
};

class Cow: public Animal
{
public:
   void sound() {cout<<"moo. "<<endl;}
};

class Dog: public Animal
{
public:
   void sound() {cout<<"bark. "<<endl;}
};

void makeSound(Animal &a){
   a.sound();
}

int main()
{
   Cow cow;
   makeSound(cow);
   return 0;
}

The program prints "What animal sound? ", so Animal::sound is called rather than Cow::sound, which is not what we expect. Note that the object passed to makeSound is a Cow object and not just an Animal object. Because Cow is derived from Animal and defines its own version of sound, we would like the call to reach Cow::sound. But sound is not virtual, so the compiler binds a.sound() at compile time using the declared type of the parameter, Animal&. In this type of situation, we need late binding.

Using a Virtual Function

In C++, late binding is achieved by placing the virtual keyword before the declaration of the function in the base class. This informs the compiler that the function is designated for late binding. A function declared virtual in the base class remains virtual in all of its derived classes. If a derived class wants to redefine it, it may do so by overriding it. In a derived class, any function that matches the signature of the virtual base class function overrides it and is called through the virtual mechanism. The derived class may repeat the virtual keyword when overriding, but this is not necessary because the function is marked for late binding at its origin and every derivation inherits that property.
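The point that a function stays virtual down the inheritance chain can be sketched as follows. This is a minimal illustration, not part of the original listings: the Calf class and the string-returning form of sound are assumptions made for the example, and the override specifier requires C++11 or later.

```cpp
#include <string>

// Illustrative classes modeled on the article's Animal and Cow.
// Calf and the std::string return type are assumptions for this sketch.
class Animal {
public:
    virtual ~Animal() = default;
    // Declared virtual once, here at the origin.
    virtual std::string sound() const { return "What animal sound?"; }
};

class Cow : public Animal {
public:
    // No virtual keyword, yet this function is still virtual.
    // override asks the compiler to check that it really overrides something.
    std::string sound() const override { return "moo."; }
};

class Calf : public Cow {
public:
    // Two levels down, the function is still bound late.
    std::string sound() const override { return "small moo."; }
};

// The parameter has the base type; the function body is chosen at runtime.
std::string makeSound(const Animal &a) { return a.sound(); }
```

Passing a Calf to makeSound should return "small moo.", even though neither Cow nor Calf repeats the virtual keyword.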

Now, let’s modify the preceding code. Add the virtual keyword in the appropriate place, and observe the output.

#include <iostream>
#include <string>
using namespace std;

class Animal
{
public:
   virtual void sound() {cout<<"What animal sound? "<<endl;}
};

class Cow: public Animal
{
public:
   void sound() {cout<<"moo. "<<endl;}
};

class Dog: public Animal
{
public:
   void sound() {cout<<"bark. "<<endl;}
};

void makeSound(Animal &a){
   a.sound();
}

int main()
{
   Cow cow;
   Dog dog;
   makeSound(cow);
   makeSound(dog);
   return 0;
}

Notice that the classes are exactly the same except that the virtual keyword is added to Animal::sound (main now also creates a Dog object). But the behavior is significantly different: the program prints "moo. " and then "bark. ". The appropriate sound function is selected by the type of the object passed to makeSound.

Late Binding in C++

To build the mechanism of late binding into the program, the compiler does a lot of work behind the scenes. The programmer requests late binding by using the virtual keyword, which tells the compiler to refrain from binding the function call to the function definition until runtime. This means that if we invoke sound() on a Cow object through a pointer or reference to the base class Animal, Cow::sound will be called and not Animal::sound.
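A small sketch can contrast the two kinds of binding on the same base class pointer. The kind function and the describe helper are hypothetical names added for illustration; only sound comes from the article's classes, and it returns a string here so the result is easy to inspect.

```cpp
#include <string>

class Animal {
public:
    virtual ~Animal() = default;
    // Virtual: bound at runtime from the object's actual type.
    virtual std::string sound() const { return "What animal sound?"; }
    // Non-virtual: bound at compile time from the pointer's declared type.
    std::string kind() const { return "Animal"; }
};

class Cow : public Animal {
public:
    std::string sound() const override { return "moo."; }
    // This hides Animal::kind but does not override it.
    std::string kind() const { return "Cow"; }
};

// Both calls go through a base class pointer.
std::string describe(const Animal *a) {
    return a->kind() + ": " + a->sound();
}
```

Calling describe with the address of a Cow object returns "Animal: moo.": the non-virtual call is resolved from the pointer type, while the virtual call reaches Cow::sound.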

Internally, this is typically accomplished by creating a virtual table (VTABLE) for each class that contains a virtual function. The compiler puts the addresses of the class's virtual functions in the VTABLE of that class. Every object of such a class holds a hidden pointer called the virtual pointer (VPTR), which the constructor initializes to the starting address of the VTABLE for the object's class. Now, when we make a call to the virtual function through the base class pointer in the code, the compiler does not generate an assembly language CALL to a fixed address, but generates different code to perform the function call. This code fetches the VPTR from the object, looks up the function address in the VTABLE it points to, and calls the function at that address. Because the function definition that matches the call is determined only at runtime, the binding happens late. This is the reason it is called late binding.

Figure 1: An example of late binding

The compiler does all the VTABLE setup, the VPTR initialization, and the insertion of the code that performs the lookup implicitly, without giving any hint to the programmer. The programmer can just be happy that it all works and need not bother.
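To make the VTABLE and VPTR idea concrete, the mechanism can be emulated by hand with a table of function pointers. This is only a rough sketch of the idea under simplifying assumptions, not the code a compiler actually emits; all names here (AnimalVTable, AnimalObj, makeAnimal, makeCow, and so on) are hypothetical.

```cpp
#include <string>

// Hand-written stand-in for a VTABLE: one slot per virtual function.
struct AnimalVTable {
    std::string (*sound)(const void *self);
};

// Hand-written stand-in for an object: the vptr member plays the role
// of the hidden VPTR that the compiler would add.
struct AnimalObj {
    const AnimalVTable *vptr;
};

// The function bodies that the table slots refer to.
std::string animalSound(const void *) { return "What animal sound?"; }
std::string cowSound(const void *) { return "moo."; }

// One table per "class", shared by every object of that class.
const AnimalVTable animalTable = { animalSound };
const AnimalVTable cowTable = { cowSound };

// The "constructors" set vptr to the table of the class being created.
AnimalObj makeAnimal() { return AnimalObj{ &animalTable }; }
AnimalObj makeCow() { return AnimalObj{ &cowTable }; }

// Roughly what a virtual call does: fetch the vptr from the object,
// read the slot, and call indirectly through the stored address.
std::string makeSound(const AnimalObj &a) {
    return a.vptr->sound(&a);
}
```

Here makeSound contains no reference to cowSound or animalSound; which one runs depends entirely on the table the object points to, which is the essence of late binding.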

Storing Type Information

C++ gives the programmer no explicit way to store type information in a class. But implementing the virtual mechanism needs type information. Where does it come from? The compiler places it in the object in a hidden form, the VPTR. Otherwise, the type could not be determined at runtime. This is evident if we compare the size of a class that has a virtual function with the size of a class that has none. The sizes are different.

#include <iostream>
#include <string>
using namespace std;

class Animal
{
   int a;
public:
   virtual void sound() {cout<<"What animal sound? "<<endl;}
};

class Animal2
{
   int a;
public:
   void sound() {cout<<"What animal sound? "<<endl;}
};


int main()
{
   cout<<sizeof (Animal)<<endl;
   cout<<sizeof (Animal2)<<endl;
   return 0;
}

When the class does not have a virtual function, its size is just the size of its data, which in this case is the single int member. When the class contains a virtual function, the size is the data member size plus the VPTR size, plus any padding the compiler adds for alignment. On a typical 64-bit platform, for example, the program prints 16 and 4. Classes that contain a virtual function are thus a bit heavier than classes that do not.
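The size comparison can be extended with a short sketch. The class names Plain, OneVirtual, and TwoVirtual are hypothetical. The std::is_polymorphic checks are guaranteed by the language, while the size relationships are implementation details that hold on common compilers but are not promised by the C++ standard.

```cpp
#include <type_traits>

// Hypothetical classes for illustration only.
struct Plain {
    int a = 0;
    void sound() {}
};

struct OneVirtual {
    int a = 0;
    virtual void sound() {}
};

struct TwoVirtual {
    int a = 0;
    virtual void sound() {}
    virtual void move() {}
};

// Guaranteed by the language: a class is polymorphic exactly when it
// declares or inherits at least one virtual function.
static_assert(!std::is_polymorphic<Plain>::value, "no virtual functions");
static_assert(std::is_polymorphic<OneVirtual>::value, "has a virtual function");

// Implementation-dependent (assumed true on common compilers):
// the hidden VPTR makes the polymorphic class larger.
bool virtualAddsSize() { return sizeof(OneVirtual) > sizeof(Plain); }

// Implementation-dependent: a second virtual function adds a VTABLE slot,
// not a second pointer in each object, so the object size stays the same.
bool secondVirtualIsFree() { return sizeof(TwoVirtual) == sizeof(OneVirtual); }
```

On mainstream compilers both functions are expected to return true, which suggests that the per-object cost is a single pointer regardless of how many virtual functions the class declares.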

Conclusion

This article explained the concept of late binding in as simple terms as possible. Late binding is binding the function call to the function definition at execution time rather than compile time. In C++, the language mechanism for this is the virtual keyword. The compiler implicitly creates a virtual table and a virtual pointer to realize dynamic binding. Late binding is one of the significant characteristics of object-oriented programming languages.

Manoj Debnath
A teacher(Professor), actively involved in publishing, research and programming for almost two decades. Authored several articles for reputed sites like CodeGuru, Developer, DevX, Database Journal etc. Some of his research interest lies in the area of programming languages, database, compiler, web/enterprise development etc.
