Using "Thin Templates" to Reduce Application Size

Using "Thin Templates" to Reduce Application Size

A template itself is a very useful C++ concept that allows you to use a single declaration for a number of classes or functions with the same logic but with different types used. The primary usages of templates are type-neutral containers such as "list," "map," "queue," and so on. Once coded, their logic can be applied to any object beginning from simple a "int" to the complex classes. However, templates have one big drawback: They produce large-size executables. Because each declaration of a template will produce a brand-new class or function, the original template code is multiplied numerous times across the application, producing a size boost. Of course, a simple 100-lines-size "list" container used for "int" and "char *" will increase the application size a mere 100-200 bytes, but a complex container used across an application for dozens of different objects can increase an application's size by many kilobytes.

If it's preferable to keep application size small, a handy technique called "Thin Templates" can be used to nearly eliminate the templates' size-boost effect. This technique is based on the fact that most optimized compilers will generate different code only for template functions that actually use different types. For example, if the following template class is defined:

template <class T> class CTest
{
   int Test1(void)
   {
      return 1;
   }

   T Test2(void)
   {
      return 1;
   }
}

And then used two times with different types:

CTest<char> oClass1;
CTest<int>  oClass2;

Only one code will be generated for function Test1(), because this function doesn't use template argument T and contains the same code in both versions. The compiler's ability to generate one code for the template with different types is the first and only requirement for successful use of "thin templates." Today's C++ compilers, such as "Microsoft C++ compiler" or "GCC," have such ability.

The main idea of "thin templates" is to move all code that doesn't use template arguments into separate functions so only one instance of code will be generated for them. Let's see how it works. For example, we will use a very simple list class:

template<class T> class MyList
{
public:

   explicit MyList()
   {
       m_poFirst=0;
       m_poLast=0;
   }

   ~MyList()
   {
      for(SItem *poCur=m_poFirst; poCur;)
      {
         SItem *poNext=poCur->m_poNext;
         delete poCur;
         poCur=poNext;
      }
   }

   MyList(const MyList &);         // must not be used
   MyList &operator=(MyList &);    // must not be used

   class iterator
   {
   public:
      iterator()
      {
         m_poCur=0;
      }
   private:
      friend class MyList<T>;
      void *m_poCur;
   };

   bool Add(const T &a_oSrc)
   {
      SItem *poNewItem=new SItem();
      if(!poNewItem)  return false;
      poNewItem->m_poPrev=m_poLast ? m_poLast : 0;
      poNewItem->m_poNext=0;
      poNewItem->m_oData=a_oSrc;
      if(m_poLast) m_poLast->m_poNext=poNewItem;
      m_poLast=poNewItem;
      if(!m_poFirst) m_poFirst=poNewItem;
      return true;
   }

   T *GetFirst(MyList<T>::iterator &a_oIt)
   {
      a_oIt.m_poCur=(void *)m_poFirst;
      return a_oIt.m_poCur ? &(m_poFirst->m_oData) : 0;
   }

   T *GetNext(MyList<T>::iterator &a_oIt)
   {
      if(!a_oIt.m_poCur) return 0;
      SItem *poNext=((SItem *)a_oIt.m_poCur)->m_poNext;
      if(poNext) a_oIt.m_poCur=poNext;
      return poNext ? &(poNext->m_oData) : 0;
   }

   bool Remove(MyList<T>::iterator &a_oIt)
   {
      SItem *poCur=(SItem *)a_oIt.m_poCur;
      if(!poCur) return false;
      if(poCur->m_poPrev) poCur->m_poPrev->m_poNext=poCur->m_poNext;
      else m_poFirst=poCur->m_poNext;
      if(poCur->m_poNext) poCur->m_poNext->m_poPrev=poCur->m_poPrev;
      else m_poLast=poCur->m_poPrev;
      delete poCur;
      a_oIt.m_poCur=0;
      return true;
   }

private:

   struct SItem
   {
      T m_oData;
      SItem *m_poNext,
            *m_poPrev;
   } *m_poFirst,
     *m_poLast;
};

As you can see, nearly all of the functions in this simple "list" container use template argument T. But, if you look closer, you will find that this argument is used in one or two lines of code in each function. So, most of the functions can be divided to separate template-dependent code from template-independent code. For example, let's divide the Add() function that adds an object into the list and calculates sizes in bytes of template-independent and template-dependent code in both cases:

Before using "thin templates:"

struct SItem
{
   T m_oData;
   SItem *m_poNext,
         *m_poPrev;
}  *m_poFirst,
   *m_poLast;

bool Add(const T &a_oSrc)
{                                                // prolog, 9 bytes
   SItem *poNewItem=new SItem();                 // 17 bytes
   if(!poNewItem)  return false;                 // 10 bytes
   poNewItem->m_poPrev=m_poLast ? m_poLast : 0;  // 33 bytes
   poNewItem->m_poNext=0;                        // 7 bytes
   poNewItem->m_oData=a_oSrc;                    // 10 bytes
   if(m_poLast) m_poLast->m_poNext=poNewItem;    // 21 bytes
   m_poLast=poNewItem;                           // 6 bytes
   if(!m_poFirst) m_poFirst=poNewItem;           // 18 bytes
   return true;                                  // 2 bytes
}                                                // epilog, 4 bytes

The total function size is 137 bytes and all code is template-dependent. If we use only one template, this function will take 137 bytes of code. Two templates with different templates arguments will require 2*137=274 bytes of code.

After using "thin templates:"

struct SThinItem
{
   SThinItem *m_poNext,
             *m_poPrev;
} *m_poFirst,
  *m_poLast;

struct SItem: public SThinItem
{
   T m_oData;
};

void ThinAdd(SThinItem *a_poDst)
{                                               // prolog, 8 bytes
   a_poDst->m_poPrev=m_poLast ? m_poLast : 0;   // 33 bytes
   a_poDst->m_poNext=0;                         // 6 bytes
   if(m_poLast) m_poLast->m_poNext=a_poDst;     // 20 bytes
   m_poLast=a_poDst;                            // 9 bytes
   if(!m_poFirst) m_poFirst=a_poDst;            // 18 bytes
}                                               // epilog, 4 bytes

bool Add(const T &a_oSrc)
{                                               // prolog, 9 bytes
   SItem *poNewItem=new SItem();                // 17 bytes
   if(!poNewItem)  return false;                // 10 bytes
   poNewItem->m_oData=a_oSrc;                   // 11 bytes
   ThinAdd(poNewItem);                          // 11 bytes
   return true;                                 // epilog, 4 bytes
}

The total function size is 160 bytes; that seems to be 23 bytes larger than our original function! But, the magic is that the template-dependent part, Add(), is only 62 bytes long, and main template-independent code moved to ThinAdd() is 98 bytes long. Additional bytes are due to the second function's prolog and epilog, and actual function call. If we add a second template with different template arguments, this will increase the total code size only on 62 bytes, and total amount of code will be 222 bytes long. And, each new template will increase the total code size by only 62 bytes, instead of 137.

So, the "thin templates" method is useful if the template class is used more than once with different template arguments. There are different implementations of "thin templates." The simplest implementation was demonstrated in the example above. Code is divided into template-dependent and template-independent sections by adding new functions for template-independent code. A better practise is to use a parent class for template-independent structures and code, and inhenerit a template class from it, adding template-dependent structures and code. For example, I will completely transform our simple template "list" so it will use the "thin template" technique:

class CListThin
{
public:

   explicit CListThin()
   {
      m_poFirst=0;
      m_poLast=0;
   }

   virtual ~CListThin()
   {
      for(SThinItem *poCur=m_poFirst; poCur;)
      {
         SThinItem *poNext=poCur->m_poNext;
         delete poCur;
         poCur=poNext;
      }
   }

   struct SThinItem
   {
      SThinItem *m_poNext,    // pointer to next list item
                *m_poPrev;    // pointer to previous list item
   };

   class iterator
   {
   public:
      iterator()
      {
         m_poCur=0;
      }
      void *m_poCur;
   };

   void ThinAdd(SThinItem *a_poDst)
   {
      a_poDst->m_poPrev=m_poLast ? m_poLast : 0;
      a_poDst->m_poNext=0;
      if(m_poLast) m_poLast->m_poNext=a_poDst;
      m_poLast=a_poDst;
      if(!m_poFirst) m_poFirst=a_poDst;
   }

   bool Remove(CListThin::iterator &a_oIt)
   {
      SThinItem *poCur=(SThinItem *)a_oIt.m_poCur;
      if(!poCur) return false;
      if(poCur->m_poPrev) poCur->m_poPrev->m_poNext=poCur->m_poNext;
      else m_poFirst=poCur->m_poNext;
      if(poCur->m_poNext) poCur->m_poNext->m_poPrev=poCur->m_poPrev;
      else m_poLast=poCur->m_poPrev;
      delete poCur;
      a_oIt.m_poCur=0;
      return true;
   }

protected:

   SThinItem *m_poFirst,
             *m_poLast;
};

template<class T> class list: public CListThin
{
public:

   list()
   {
   }

   list(const eye::list &);              // must not be used
   eye::list &operator=(eye::list &);    // must not be used

   bool Add(const T &a_oSrc)
   {
      SItem *poNewItem=new SItem();
      if(!poNewItem)  return false;
      poNewItem->m_oData=a_oSrc;
      ThinAdd(poNewItem);
      return true;
   }

   T *GetFirst(eye::list<T>::iterator &a_oIt)
   {
      a_oIt.m_poCur=(void *)m_poFirst;
      return a_oIt.m_poCur ? &(((SItem *)m_poFirst)->m_oData) : 0;
   }

   T *GetNext(eye::list<T>::iterator &a_oIt)
   {
      if(!a_oIt.m_poCur) return 0;
      SItem *poNext=(SItem *)(((SItem *)a_oIt.m_poCur)->m_poNext);
      if(poNext) a_oIt.m_poCur=poNext;
      return poNext ? &(poNext->m_oData) : 0;
   }

private:

  struct SItem: public CListThin::SThinItem
  {               // list item description
    T m_oData;    // data, saved in list
  };

};

Of course, in the thin simple example, using "thin templates" will give only a few hundreds bytes' advantage if the template is used 2-3 times with different template arguments. But, in percents, it will be more then 50% code economy. If a project frequently uses complex template containers, the "thin templates" technique can greatly reduce executable size.

Note: Some compilers can be configured to automatically inline class member functions if they are called from class itself. This can eliminate "thin templates" effects. In the "Microsoft C++ compiler," you can use the /Ob0 command-line switch to disable inline function expansion or use functions with variable number of arguments (for example - void ThinAdd(SThinItem *a_poDst, ...) ); such functions can't be inlined.


About the Author

Gregor Petrov

Unfortunately, nothing is known about this strange person ;(

Comments

  • Interesting

    Posted by Mutilated1 on 10/07/2004 02:52pm

    A very interesting article. Good job.

    Reply
Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • On-demand Event Event Date: September 10, 2014 Modern mobile applications connect systems-of-engagement (mobile apps) with systems-of-record (traditional IT) to deliver new and innovative business value. But the lifecycle for development of mobile apps is also new and different. Emerging trends in mobile development call for faster delivery of incremental features, coupled with feedback from the users of the app "in the wild." This loop of continuous delivery and continuous feedback is how the best mobile …

  • Not all enterprise applications are created equal. Sophisticated applications need developer support but other more basic apps do not. With the right tools, everyone is a potential app developer with ideas and a perspective to share. Trends such as low-code development and model driven development are fundamentally changing how and who creates applications. Is your organization ready? Read this report and learn: The seven personas of enterprise app delivery How application ownership is spreading to the …

Most Popular Programming Stories

More for Developers

Latest Developer Headlines

RSS Feeds