An Introduction to Object Serialization in C++

Overview

Serialization is a mechanism for converting an object into a sequence of bytes so that it can be stored, in memory or in a file. Once created, the byte stream can also be sent across a communication link to a remote receiver. The reverse process is called deserialization: the data in the byte stream is used to reconstruct the object in its original form. Apart from object persistence, this mechanism is particularly useful for transmitting object information in serialized form, say, to a server which, on receiving it, can deserialize the bytes and recreate the original object.

Storing Objects in C++

In C++, the traditional stream-based style of input and output operates on fundamental types rather than on objects. We can, however, use operator overloading to perform input and output on objects of user-defined types. For example, the ‘>>’ operator can be overloaded as a stream extraction operator for an istream and, in the same way, the ‘<<’ operator can be overloaded as a stream insertion operator for an ostream. Note that, in all these cases, we read and write the data members and not the object in its entirety. Although this may look like an I/O operation on the object that can be used for persistence as well, only the data members of the object are actually persisted. The class member functions are simply ignored: there is only one copy of each member function, shared among all the objects of the class, so the functions are not part of any individual object’s stored state.
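As a rough illustration of the overloading described above, the sketch below defines ‘<<’ and ‘>>’ for a small record type and sends one object through an in-memory stream. The Employee struct and the round_trip helper are hypothetical names chosen for this example, and the sketch assumes that names contain no whitespace.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical record type used only for this illustration.
struct Employee {
    int id = 0;
    std::string name;      // assumed to contain no whitespace
    float salary = 0.0f;
};

// Stream insertion: writes the data members separated by spaces.
std::ostream& operator<<(std::ostream& os, const Employee& e) {
    return os << e.id << ' ' << e.name << ' ' << e.salary;
}

// Stream extraction: reads the members back in the same order.
std::istream& operator>>(std::istream& is, Employee& e) {
    return is >> e.id >> e.name >> e.salary;
}

// Writes an Employee to an in-memory stream and reads it back.
Employee round_trip(const Employee& in) {
    std::stringstream buffer;
    buffer << in;
    Employee out;
    buffer >> out;
    return out;
}
```

Only the three data members travel through the stream. Nothing in the output says that the values belong to an Employee, which is the limitation the next section discusses.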

Storing Object Data vs Serialization

In simple object data persistence, we lose vital information about the object and its type. We typically store an object’s attributes and no other detail, such as its type. As a result, a program that reads the data back from disk has no information about the type and either cannot reconstruct the object or must recreate it in a roundabout way from the information it gets. What if we store objects of different types in the same file? Without any stored type information, is it possible to tell which attribute data corresponds to which object? This is why storing or persisting an object’s data is not the same as object serialization.
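One simple way to see what stored type information buys us is to write a short tag before each record. The sketch below is only an illustration under assumed names: the Employee and Department types, the "EMP" and "DEP" tags, and the write_record and read_all helpers are all invented for this example and are not how any particular library encodes types.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
struct Employee   { int id; std::string name; };
struct Department { int code; };

// Each record starts with a one-word tag naming the type that follows.
void write_record(std::ostream& os, const Employee& e) {
    os << "EMP " << e.id << ' ' << e.name << '\n';
}
void write_record(std::ostream& os, const Department& d) {
    os << "DEP " << d.code << '\n';
}

// Reads every record and returns a description of what was reconstructed.
std::vector<std::string> read_all(std::istream& is) {
    std::vector<std::string> seen;
    std::string tag;
    while (is >> tag) {
        if (tag == "EMP") {
            Employee e{};
            is >> e.id >> e.name;
            seen.push_back("Employee " + e.name);
        } else if (tag == "DEP") {
            Department d{};
            is >> d.code;
            seen.push_back("Department " + std::to_string(d.code));
        } else {
            break;  // unknown tag: stop rather than guess at the layout
        }
    }
    return seen;
}
```

Because the reader sees the tag first, it can decide which type to rebuild before it interprets the attribute data, even when records of different types are mixed in one stream.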

Object Serialization

Although serialization also persists objects, it is not the same as what we accomplish by overloading the extraction and insertion operators. The name ‘serialized object’ means that the object is represented as a sequence of bytes. The information in the byte sequence includes the object’s attribute data and the types of those attributes, as well as the type of the object itself. Once this serialized information is written to a file, it can be read back and deserialized to reconstruct the object in memory at any time.
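To make the idea of a byte sequence that carries both type and attribute data more concrete, here is a minimal hand-rolled sketch. The layout (a length byte followed by the characters, with the type name written first) is an assumption made for this example, as are the names Bytes, put_string, get_string, and serialize_employee. Real libraries use their own, richer formats.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends one length byte followed by the characters of s.
// Assumption of this sketch: s is shorter than 256 bytes.
void put_string(Bytes& out, const std::string& s) {
    if (s.size() > 255) throw std::length_error("string too long for sketch");
    out.push_back(static_cast<std::uint8_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

// Reads a string written by put_string and advances pos past it.
std::string get_string(const Bytes& in, std::size_t& pos) {
    const std::size_t len = in.at(pos++);
    if (pos + len > in.size()) throw std::out_of_range("truncated input");
    std::string s(in.begin() + pos, in.begin() + pos + len);
    pos += len;
    return s;
}

// Layout: [type name][name field]. The leading type name lets a reader
// decide which class to rebuild before touching the attribute data.
Bytes serialize_employee(const std::string& name) {
    Bytes out;
    put_string(out, "Employee");
    put_string(out, name);
    return out;
}
```

A reader would call get_string once to learn the type and again to recover the attribute. The stored lengths tell it where each field ends.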

The C++ Standard Library has no built-in mechanism to serialize or deserialize an object. However, we can use a third-party library, and there are many. One such open source library is Boost. The Boost C++ Library supports serializing objects not only as text but also in binary and Extensible Markup Language (XML) formats. Object serialization is also supported by the Qt C++ library. Here, we’ll take up these two libraries with simple examples to show how to serialize objects in C++.

Object Serialization with the Boost C++ Library

Boost serialization works in two ways: the intrusive method and the non-intrusive method. The intrusive method is the simpler one: we build the serialization mechanism into the class itself.

#include <fstream>
#include <string>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/string.hpp>

using namespace std;

class Employee {
private:
   friend class boost::serialization::access;
   int id;
   string name;
   float salary;
   template<class Archive>
   void serialize(Archive &a, const unsigned version){
      a & id & name & salary;
   }
public:
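   // Initialize the numeric members so a default-constructed object is well defined
   Employee():id(0),salary(0.0f){}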
   Employee(int i, string n, float s):id(i),name(n),salary(s)
   {}
};

int main()
{
   const string filename = "emp.dat";
   Employee e1(11,"Harry",4500.00f);
   Employee e2(22,"Ravi",8800.00f);
   Employee e3(33,"Tim",6800.00f);
   Employee e4(44,"Rajiv",3400.00f);

   // Serialize and persist the object
   {
      std::ofstream outfile(filename);
      boost::archive::text_oarchive archive(outfile);
      archive << e1 << e2 << e3 << e4;
   }

   // Deserialize and restore the object
   Employee restored_e1;
   Employee restored_e2;
   Employee restored_e3;
   Employee restored_e4;

   {
      std::ifstream infile(filename);
      boost::archive::text_iarchive archive(infile);
      archive >> restored_e1 >> restored_e2
         >> restored_e3 >> restored_e4;
   }
   return 0;
}

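Note that each class to be serialized has a template function named serialize that lists the data members to be saved or restored. The same function handles both directions because the ‘&’ operator acts as ‘<<’ on an output archive and as ‘>>’ on an input archive. Objects must be restored in the same sequence in which they were stored. This is particularly important if objects of different types are stored in the same file.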

In the case of non-intrusive serialization, the class definition remains unaltered and serialization is imposed externally.

#include <fstream>
#include <string>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/string.hpp>

using namespace std;

class Employee {
public:
   int id;
   string name;
   float salary;
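   // Initialize the numeric members so a default-constructed object is well defined
   Employee():id(0),salary(0.0f){}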
   Employee(int i, string n, float s):id(i),name(n),salary(s)
   {}
};

namespace boost {
   namespace serialization {
      template<class Archive>
      void serialize(Archive &a, Employee &e,
               const unsigned version){
         a & e.id & e.name & e.salary;
      }
   }
}

Note that, in the case of non-intrusive serialization, the serialization function is not part of the serializable class. Both intrusive and non-intrusive serialization operate in the same way; only the approach is different.

Refer to the Boost C++ Library Serialization Documentation for more detail on this.

Object Serialization with QDataStream

The QDataStream class of the Qt C++ Library enables us to serialize objects as a sequence of binary data to any QIODevice, such as a socket, file, port, and so forth. It encodes information in a platform-independent format, so we do not have to worry about byte order, operating system, or the underlying hardware. This class implements the serialization of C++ primitive types and a number of Qt types.

Here is a quick example.

#include <QCoreApplication>
#include <QFile>
#include <QDebug>
#include <QDataStream>
#include <QMap>
#include <QString>

void save(){
   QMap<int, QString> emp;
   emp.insert(11,"Harry");
   emp.insert(22,"Ron");
   emp.insert(33,"Rajiv");
   emp.insert(44,"Kate");

   QFile file("emp.dat");
   if(!file.open(QIODevice::WriteOnly)){
      qDebug()<<"Error! Cannot open file.";
      return;
   }
   QDataStream outStream(&file);
   outStream.setVersion(QDataStream::Qt_5_13);

   outStream << emp;

   file.flush();
   file.close();
}

void restore(){
   QMap<int, QString> emp;
   QFile file("emp.dat");   // same file that save() wrote
   if(!file.open(QIODevice::ReadOnly)){
      qDebug()<<"Error! Cannot open file.";
      return;
   }
   QDataStream inStream(&file);
   inStream.setVersion(QDataStream::Qt_5_13);
   inStream >> emp;

   QMapIterator<int, QString> iter(emp);
   while (iter.hasNext()) {
      iter.next();
      qDebug() << iter.key() << ": " << iter.value();   // qDebug() ends the line itself
   }

   file.close();   // nothing to flush on a read-only file
}

int main(int argc, char *argv[])
{
   QCoreApplication a(argc, argv);
   save();
   restore();

   // No event loop is needed here; a.exec() would block until the application is told to quit
   return 0;
}
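The platform independence mentioned earlier comes largely from fixing the byte order of the encoded data (QDataStream defaults to big endian). The sketch below is not QDataStream's implementation. It is a hand-written illustration, with the hypothetical helpers to_big_endian and from_big_endian, of what it means to encode a 32-bit integer in a fixed byte order regardless of the host CPU.

```cpp
#include <array>
#include <cstdint>

// Encodes v most significant byte first (big endian), whatever the host CPU uses.
std::array<std::uint8_t, 4> to_big_endian(std::uint32_t v) {
    return {{
        static_cast<std::uint8_t>(v >> 24),
        static_cast<std::uint8_t>(v >> 16),
        static_cast<std::uint8_t>(v >> 8),
        static_cast<std::uint8_t>(v)
    }};
}

// Decodes four big-endian bytes back into a host integer.
std::uint32_t from_big_endian(const std::array<std::uint8_t, 4>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8)  |
            static_cast<std::uint32_t>(b[3]);
}
```

Because the shifts operate on the value and not on its memory layout, the same four bytes are produced on little-endian and big-endian machines alike.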

Conclusion

This article gave an introductory view of object serialization in C++ and how it differs from simple object storage. Implementing serialization poses more challenges than one might think. An implementation has to consider how to serialize objects generically and how to record the length of the data so that reconstructing an object does not cause problems during deserialization. These vexing questions are not encountered in simple data persistence. Therefore, the mechanism of serialization is not that simple. However, there is no dearth of third-party libraries that provide implementations to complement C++.

Manoj Debnath
A teacher (professor) actively involved in publishing, research, and programming for almost two decades. He has authored several articles for reputed sites such as CodeGuru, Developer, DevX, and Database Journal. His research interests include programming languages, databases, compilers, and web/enterprise development.