Using "Thin Templates" to Reduce Application Size

Using "Thin Templates" to Reduce Application Size

A template itself is a very useful C++ concept that allows you to use a single declaration for a number of classes or functions with the same logic but with different types used. The primary usages of templates are type-neutral containers such as "list," "map," "queue," and so on. Once coded, their logic can be applied to any object beginning from simple a "int" to the complex classes. However, templates have one big drawback: They produce large-size executables. Because each declaration of a template will produce a brand-new class or function, the original template code is multiplied numerous times across the application, producing a size boost. Of course, a simple 100-lines-size "list" container used for "int" and "char *" will increase the application size a mere 100-200 bytes, but a complex container used across an application for dozens of different objects can increase an application's size by many kilobytes.

If it's preferable to keep application size small, a handy technique called "Thin Templates" can be used to nearly eliminate the templates' size-boost effect. This technique is based on the fact that most optimized compilers will generate different code only for template functions that actually use different types. For example, if the following template class is defined:

template <class T> class CTest
{
   int Test1(void)
   {
      return 1;
   }

   T Test2(void)
   {
      return 1;
   }
}

And then used two times with different types:

CTest<char> oClass1;
CTest<int>  oClass2;

Only one code will be generated for function Test1(), because this function doesn't use template argument T and contains the same code in both versions. The compiler's ability to generate one code for the template with different types is the first and only requirement for successful use of "thin templates." Today's C++ compilers, such as "Microsoft C++ compiler" or "GCC," have such ability.

The main idea of "thin templates" is to move all code that doesn't use template arguments into separate functions so only one instance of code will be generated for them. Let's see how it works. For example, we will use a very simple list class:

template<class T> class MyList
{
public:

   explicit MyList()
   {
       m_poFirst=0;
       m_poLast=0;
   }

   ~MyList()
   {
      for(SItem *poCur=m_poFirst; poCur;)
      {
         SItem *poNext=poCur->m_poNext;
         delete poCur;
         poCur=poNext;
      }
   }

   MyList(const MyList &);         // must not be used
   MyList &operator=(MyList &);    // must not be used

   class iterator
   {
   public:
      iterator()
      {
         m_poCur=0;
      }
   private:
      friend class MyList<T>;
      void *m_poCur;
   };

   bool Add(const T &a_oSrc)
   {
      SItem *poNewItem=new SItem();
      if(!poNewItem)  return false;
      poNewItem->m_poPrev=m_poLast ? m_poLast : 0;
      poNewItem->m_poNext=0;
      poNewItem->m_oData=a_oSrc;
      if(m_poLast) m_poLast->m_poNext=poNewItem;
      m_poLast=poNewItem;
      if(!m_poFirst) m_poFirst=poNewItem;
      return true;
   }

   T *GetFirst(MyList<T>::iterator &a_oIt)
   {
      a_oIt.m_poCur=(void *)m_poFirst;
      return a_oIt.m_poCur ? &(m_poFirst->m_oData) : 0;
   }

   T *GetNext(MyList<T>::iterator &a_oIt)
   {
      if(!a_oIt.m_poCur) return 0;
      SItem *poNext=((SItem *)a_oIt.m_poCur)->m_poNext;
      if(poNext) a_oIt.m_poCur=poNext;
      return poNext ? &(poNext->m_oData) : 0;
   }

   bool Remove(MyList<T>::iterator &a_oIt)
   {
      SItem *poCur=(SItem *)a_oIt.m_poCur;
      if(!poCur) return false;
      if(poCur->m_poPrev) poCur->m_poPrev->m_poNext=poCur->m_poNext;
      else m_poFirst=poCur->m_poNext;
      if(poCur->m_poNext) poCur->m_poNext->m_poPrev=poCur->m_poPrev;
      else m_poLast=poCur->m_poPrev;
      delete poCur;
      a_oIt.m_poCur=0;
      return true;
   }

private:

   struct SItem
   {
      T m_oData;
      SItem *m_poNext,
            *m_poPrev;
   } *m_poFirst,
     *m_poLast;
};

As you can see, nearly all of the functions in this simple "list" container use template argument T. But, if you look closer, you will find that this argument is used in one or two lines of code in each function. So, most of the functions can be divided to separate template-dependent code from template-independent code. For example, let's divide the Add() function that adds an object into the list and calculates sizes in bytes of template-independent and template-dependent code in both cases:

Before using "thin templates:"

struct SItem
{
   T m_oData;
   SItem *m_poNext,
         *m_poPrev;
}  *m_poFirst,
   *m_poLast;

bool Add(const T &a_oSrc)
{                                                // prolog, 9 bytes
   SItem *poNewItem=new SItem();                 // 17 bytes
   if(!poNewItem)  return false;                 // 10 bytes
   poNewItem->m_poPrev=m_poLast ? m_poLast : 0;  // 33 bytes
   poNewItem->m_poNext=0;                        // 7 bytes
   poNewItem->m_oData=a_oSrc;                    // 10 bytes
   if(m_poLast) m_poLast->m_poNext=poNewItem;    // 21 bytes
   m_poLast=poNewItem;                           // 6 bytes
   if(!m_poFirst) m_poFirst=poNewItem;           // 18 bytes
   return true;                                  // 2 bytes
}                                                // epilog, 4 bytes

The total function size is 137 bytes and all code is template-dependent. If we use only one template, this function will take 137 bytes of code. Two templates with different templates arguments will require 2*137=274 bytes of code.

After using "thin templates:"

struct SThinItem
{
   SThinItem *m_poNext,
             *m_poPrev;
} *m_poFirst,
  *m_poLast;

struct SItem: public SThinItem
{
   T m_oData;
};

void ThinAdd(SThinItem *a_poDst)
{                                               // prolog, 8 bytes
   a_poDst->m_poPrev=m_poLast ? m_poLast : 0;   // 33 bytes
   a_poDst->m_poNext=0;                         // 6 bytes
   if(m_poLast) m_poLast->m_poNext=a_poDst;     // 20 bytes
   m_poLast=a_poDst;                            // 9 bytes
   if(!m_poFirst) m_poFirst=a_poDst;            // 18 bytes
}                                               // epilog, 4 bytes

bool Add(const T &a_oSrc)
{                                               // prolog, 9 bytes
   SItem *poNewItem=new SItem();                // 17 bytes
   if(!poNewItem)  return false;                // 10 bytes
   poNewItem->m_oData=a_oSrc;                   // 11 bytes
   ThinAdd(poNewItem);                          // 11 bytes
   return true;                                 // epilog, 4 bytes
}

The total function size is 160 bytes; that seems to be 23 bytes larger than our original function! But, the magic is that the template-dependent part, Add(), is only 62 bytes long, and main template-independent code moved to ThinAdd() is 98 bytes long. Additional bytes are due to the second function's prolog and epilog, and actual function call. If we add a second template with different template arguments, this will increase the total code size only on 62 bytes, and total amount of code will be 222 bytes long. And, each new template will increase the total code size by only 62 bytes, instead of 137.

So, the "thin templates" method is useful if the template class is used more than once with different template arguments. There are different implementations of "thin templates." The simplest implementation was demonstrated in the example above. Code is divided into template-dependent and template-independent sections by adding new functions for template-independent code. A better practise is to use a parent class for template-independent structures and code, and inhenerit a template class from it, adding template-dependent structures and code. For example, I will completely transform our simple template "list" so it will use the "thin template" technique:

class CListThin
{
public:

   explicit CListThin()
   {
      m_poFirst=0;
      m_poLast=0;
   }

   virtual ~CListThin()
   {
      for(SThinItem *poCur=m_poFirst; poCur;)
      {
         SThinItem *poNext=poCur->m_poNext;
         delete poCur;
         poCur=poNext;
      }
   }

   struct SThinItem
   {
      SThinItem *m_poNext,    // pointer to next list item
                *m_poPrev;    // pointer to previous list item
   };

   class iterator
   {
   public:
      iterator()
      {
         m_poCur=0;
      }
      void *m_poCur;
   };

   void ThinAdd(SThinItem *a_poDst)
   {
      a_poDst->m_poPrev=m_poLast ? m_poLast : 0;
      a_poDst->m_poNext=0;
      if(m_poLast) m_poLast->m_poNext=a_poDst;
      m_poLast=a_poDst;
      if(!m_poFirst) m_poFirst=a_poDst;
   }

   bool Remove(CListThin::iterator &a_oIt)
   {
      SThinItem *poCur=(SThinItem *)a_oIt.m_poCur;
      if(!poCur) return false;
      if(poCur->m_poPrev) poCur->m_poPrev->m_poNext=poCur->m_poNext;
      else m_poFirst=poCur->m_poNext;
      if(poCur->m_poNext) poCur->m_poNext->m_poPrev=poCur->m_poPrev;
      else m_poLast=poCur->m_poPrev;
      delete poCur;
      a_oIt.m_poCur=0;
      return true;
   }

protected:

   SThinItem *m_poFirst,
             *m_poLast;
};

template<class T> class list: public CListThin
{
public:

   list()
   {
   }

   list(const eye::list &);              // must not be used
   eye::list &operator=(eye::list &);    // must not be used

   bool Add(const T &a_oSrc)
   {
      SItem *poNewItem=new SItem();
      if(!poNewItem)  return false;
      poNewItem->m_oData=a_oSrc;
      ThinAdd(poNewItem);
      return true;
   }

   T *GetFirst(eye::list<T>::iterator &a_oIt)
   {
      a_oIt.m_poCur=(void *)m_poFirst;
      return a_oIt.m_poCur ? &(((SItem *)m_poFirst)->m_oData) : 0;
   }

   T *GetNext(eye::list<T>::iterator &a_oIt)
   {
      if(!a_oIt.m_poCur) return 0;
      SItem *poNext=(SItem *)(((SItem *)a_oIt.m_poCur)->m_poNext);
      if(poNext) a_oIt.m_poCur=poNext;
      return poNext ? &(poNext->m_oData) : 0;
   }

private:

  struct SItem: public CListThin::SThinItem
  {               // list item description
    T m_oData;    // data, saved in list
  };

};

Of course, in the thin simple example, using "thin templates" will give only a few hundreds bytes' advantage if the template is used 2-3 times with different template arguments. But, in percents, it will be more then 50% code economy. If a project frequently uses complex template containers, the "thin templates" technique can greatly reduce executable size.

Note: Some compilers can be configured to automatically inline class member functions if they are called from class itself. This can eliminate "thin templates" effects. In the "Microsoft C++ compiler," you can use the /Ob0 command-line switch to disable inline function expansion or use functions with variable number of arguments (for example - void ThinAdd(SThinItem *a_poDst, ...) ); such functions can't be inlined.


About the Author

Gregor Petrov

Unfortunately, nothing is known about this strange person ;(

Comments

  • Interesting

    Posted by Mutilated1 on 10/07/2004 02:52pm

    A very interesting article. Good job.

    Reply
Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • Specialization and efficiency are always in need. Whether it's replacing an aging roof, getting a haircut, or tuning up a car, most seek the assistance of trusted experts. The same is true in the business world, where an increasing number of companies are seeking the help of others to administer their IT systems and services. This special edition of Unleashing IT highlights a new breed of IT caretaker -- Cisco Powered service providers -- and the business advantages and operational efficiencies they …

  • Java developers know that testing code changes can be a huge pain, and waiting for an application to redeploy after a code fix can take an eternity. Wouldn't it be great if you could see your code changes immediately, fine-tune, debug, explore and deploy code without waiting for ages? In this white paper, find out how that's possible with a Java plugin that drastically changes the way you develop, test and run Java applications. Discover the advantages of this plugin, and the changes you can expect to see …

Most Popular Programming Stories

More for Developers

Latest Developer Headlines

RSS Feeds