C++ Programming: Stack Allocators for STL Containers

Introduction

This article describes how to create stack-based allocators for use with STL containers.

The technique requires the use of the <type_traits> header.

The STL is an essential part of any C++ coder's toolkit, but there are aspects that can make it
hard to use in time-critical applications. STL containers are safer to use than manually
allocated memory, as they grow automatically as data is added to them and release their resources
when destructed. The downside for real-time developers is that this flexibility is achieved by dynamically
allocating from the heap; a big no-no when trying to achieve deterministic response. This is especially
problematic when using them as local variables within a function.

The usual workarounds include:

  • Using plain old arrays
  • Ensuring that containers are created at startup and have the desired capacity reserved
  • Declaring the containers as static
  • Creating a custom memory pool

None of these solutions really solves the problem of using STL containers as local variables within functions.
Creating containers at startup means that several functions will have to share a common workspace.
This creates coupling between functions and cannot be used in a multi-threaded application.
The same goes for static variables. Reserving capacity only works for containers that provide reserve(), such as std::vector and std::string.
A custom memory pool would work, but requires that the allocation algorithm is efficient
and that allocators are supplied one per thread so as to avoid the use of locks.
The use of plain old arrays means that you miss out on all the useful things that drew you to the STL in the first place.

So what can you do?

The solution

Allocation from the stack has a very low overhead and this would seem to be the obvious
place for fast memory allocation.

The creators of the STL were very forward thinking when they designed the containers
and the definitions allow the allocation of resources from sources other than the default.

Take, for example, the definition of std::vector.

template <class Type, class Allocator = allocator<Type>>
class vector;

The template definition shows that an allocator other than the default can be supplied
to the container. The same applies to the other allocator-aware containers in the STL, although the position of the allocator parameter differs between them.
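As a rough illustration of where the allocator goes for different containers, the sketch below plugs a minimal allocator into std::vector and std::set. The names counting_allocator, allocation_count and count_allocations are hypothetical and are not part of the article's code; the allocator simply forwards to operator new and assumes a C++11 library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <set>
#include <vector>

// Hypothetical helper: a process-wide count of allocate() calls.
inline int &allocation_count()
{
  static int count = 0;
  return count;
}

// Hypothetical minimal allocator (C++11 style) that forwards to operator new.
template <typename T>
struct counting_allocator
{
  typedef T value_type;

  counting_allocator() {}

  template <typename U>
  counting_allocator(const counting_allocator<U> &) {}

  T *allocate(std::size_t n)
  {
    ++allocation_count();
    return static_cast<T *>(::operator new(n * sizeof(T)));
  }

  void deallocate(T *p, std::size_t)
  {
    ::operator delete(p);
  }
};

// These allocators hold no state, so any two of them are interchangeable.
template <typename T, typename U>
bool operator ==(const counting_allocator<T> &, const counting_allocator<U> &) { return true; }

template <typename T, typename U>
bool operator !=(const counting_allocator<T> &, const counting_allocator<U> &) { return false; }

// The allocator is the second template parameter of std::vector...
typedef std::vector<int, counting_allocator<int>> counted_vector;

// ...but the third parameter of std::set, after the comparison functor.
typedef std::set<int, std::less<int>, counting_allocator<int>> counted_set;

// Returns how many allocations the two containers made between them.
inline int count_allocations()
{
  allocation_count() = 0;
  {
    counted_vector v;
    v.push_back(1);

    counted_set s;
    s.insert(2);

    assert(v.size() == 1 && s.size() == 1);
  }
  return allocation_count();
}
```

Calling count_allocations() should report at least two allocations, one for the vector's storage and one for the set's node; some library implementations make additional bookkeeping allocations.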

How do you create a custom allocator?

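There is a defined set of functions that an allocator class must provide. The list below is the full C++03 interface; since C++11, std::allocator_traits supplies defaults for most of these members.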

pointer address(reference x) const
const_pointer address(const_reference x) const
pointer allocate(size_type n, const_pointer cp = 0)
void deallocate(pointer p, size_type n)
size_type max_size() const
void construct(pointer p, const value_type &x)
void destroy(pointer p)

pointer, const_pointer, const_reference, reference and size_type are typedef’d within the allocator class.

address:    Returns the address of the supplied object.
allocate:   Allocates the memory for ‘n’ objects. Pointer cp is ignored.
deallocate: Releases the previously allocated resource.
max_size:   Returns the maximum size that the container may grow to.
construct:  Constructs an instance of the object.
destroy:    Releases the resources owned by the object pointed to.
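The split between allocate/deallocate and construct/destroy can be sketched in isolation. The helper names below (construct_at_address, destroy_at_address, round_trip_length) are hypothetical; the sketch takes its raw storage from operator new purely to keep the example short, whereas the allocators in this article take it from an internal array.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper mirroring an allocator's construct():
// placement new runs the copy constructor but allocates nothing.
template <typename T>
void construct_at_address(T *p, const T &value)
{
  new (p) T(value);
}

// Hypothetical helper mirroring an allocator's destroy():
// the destructor runs but no storage is released.
template <typename T>
void destroy_at_address(T *p)
{
  p->~T();
}

// Walks through the four steps a container performs for each element
// and returns the length of the string that was constructed.
inline std::size_t round_trip_length(const std::string &text)
{
  void *raw = ::operator new(sizeof(std::string));    // allocate
  std::string *p = static_cast<std::string *>(raw);

  construct_at_address(p, text);                      // construct
  const std::size_t length = p->size();
  destroy_at_address(p);                              // destroy

  ::operator delete(raw);                             // deallocate

  assert(length == text.size());
  return length;
}
```

Keeping the steps separate is what allows a container to hold capacity for more objects than it has actually constructed.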

The allocator must also provide a ‘rebind’ structure to allow the allocator to be applied to
internal container objects.
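To see why ‘rebind’ is needed, consider a node-based container: the user supplies an allocator for int, but the container actually has to allocate nodes. The sketch below uses hypothetical types (list_node, tiny_allocator) that stand in for a real container's internals, and shows that rebinding, either directly or through std::allocator_traits, yields an allocator for the node type.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for the node type a list-like container stores internally.
template <typename T>
struct list_node
{
  list_node *next;
  T          value;
};

// Hypothetical allocator reduced to the parts relevant to rebinding.
template <typename T>
struct tiny_allocator
{
  typedef T value_type;

  template <typename U>
  struct rebind
  {
    typedef tiny_allocator<U> other;
  };
};

// What the user asks for: an allocator of int.
typedef tiny_allocator<int> user_allocator;

// What a node-based container needs: an allocator of nodes.
typedef user_allocator::rebind<list_node<int>>::other node_allocator;

// The same rebinding expressed through std::allocator_traits (C++11).
typedef std::allocator_traits<user_allocator>::rebind_alloc<list_node<int>> traits_node_allocator;

// Returns true if both routes produce an allocator for the node type.
inline bool rebind_yields_node_allocator()
{
  const bool direct     = std::is_same<node_allocator, tiny_allocator<list_node<int>>>::value;
  const bool via_traits = std::is_same<traits_node_allocator, node_allocator>::value;

  assert(direct == via_traits);
  return direct && via_traits;
}
```

Note that a rebound allocator is a different type with its own storage, which matters for allocators that carry a buffer inside themselves.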

The easiest way to allocate memory on the stack is to declare an array, and this is the technique that the following allocators use.

The basic fixed allocator

This allocator will work for most STL containers, although std::string and std::vector benefit from
a simplified model.

The fixed allocator works by declaring an array of char whose size is a multiple of the object size
plus whatever extra is required to ensure correct alignment. Any pointer returned will be correctly aligned for the object type.
It is essentially a memory pool local to the allocator.
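The alignment step can be sketched on its own. The names align_up, aligned_slots and is_aligned_for below are hypothetical, and the sketch assumes that std::uintptr_t is available and that alignments are powers of two; it pads the array by the alignment minus one, which is the worst-case adjustment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: rounds 'p' up to the next address that is a multiple
// of 'alignment', which must be a power of two.
inline char *align_up(char *p, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

  const std::uintptr_t mask    = alignment - 1;
  const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);

  return reinterpret_cast<char *>((address + mask) & ~mask);
}

// Hypothetical fixed-capacity storage for N objects of type T.
template <typename T, std::size_t N>
struct aligned_slots
{
  // Room for N objects plus the worst-case padding needed to align the first.
  char raw[(N * sizeof(T)) + std::alignment_of<T>::value - 1];

  // Address of slot 'index' (index < N), correctly aligned for T.
  T *slot(std::size_t index)
  {
    char *start = align_up(raw, std::alignment_of<T>::value);
    return reinterpret_cast<T *>(start + (index * sizeof(T)));
  }
};

// Returns true if 'p' is a multiple of the alignment of T.
template <typename T>
bool is_aligned_for(const void *p)
{
  return (reinterpret_cast<std::uintptr_t>(p) % std::alignment_of<T>::value) == 0;
}
```

Because sizeof(T) is always a multiple of the alignment of T, aligning the first slot is enough to align every slot that follows it.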

When a container requests memory from the allocator it does so by asking for ‘n’ objects.
The allocate function will scan the internal array and try to find ‘n’ contiguous free elements.
If this cannot be done then a std::bad_alloc is thrown. The algorithm works using ‘first fit’ rather than ‘best fit’.
This may not be the most efficient in terms of use of space in the buffer, but is fairly fast in operation.
Searches always begin from the first free item.

When objects are destroyed their slot is marked as free again.
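The bookkeeping described above can be tried out separately from the memory handling. The slot_map class below is a hypothetical, simplified sketch: it tracks only the ‘in use’ flags and returns slot indices, and it returns N on failure where the allocator would throw std::bad_alloc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical slot map: tracks which of N slots are in use, nothing more.
template <std::size_t N>
class slot_map
{
public:

  slot_map()
    : p_first_free(in_use)
  {
    std::fill(in_use, in_use + N, false);
  }

  // Finds the first run of 'n' free slots at or after the first free slot,
  // marks it as used and returns its index. Returns N if no run is found.
  std::size_t acquire(std::size_t n)
  {
    bool *const p_end = in_use + N;
    bool *p_run = std::search_n(p_first_free, p_end, n, false);

    if (p_run == p_end)
    {
      return N;
    }

    std::fill(p_run, p_run + n, true);

    // If the run started at the first free slot, look for the next one.
    if (p_run == p_first_free)
    {
      p_first_free = std::find(p_run + n, p_end, false);
    }

    return static_cast<std::size_t>(p_run - in_use);
  }

  // Marks 'n' slots starting at 'index' as free again.
  void release(std::size_t index, std::size_t n)
  {
    assert(index + n <= N);

    std::fill(in_use + index, in_use + index + n, false);

    // Later searches must start no later than the slots just released.
    if (in_use + index < p_first_free)
    {
      p_first_free = in_use + index;
    }
  }

private:

  bool  in_use[N];
  bool *p_first_free;
};
```

With eight slots, acquiring three and then two, releasing the first three, and then asking for four fails even though six slots are free in total, because no four of them are contiguous. This fragmentation is the price of the simple first-fit search.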

An example of use would be…

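#include <functional>
#include <set>
#include "fixed_allocator.h"

// Create a set of int with a capacity of 100 elements.
// The allocator is the third template parameter of std::set, after the comparison functor.
std::set<int, std::less<int>, fixed_allocator<int, 100>> data;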

The code for fixed allocator

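#include <algorithm>   // std::fill, std::find, std::search_n
#include <cstddef>     // std::size_t, std::ptrdiff_t
#include <iterator>    // std::distance
#include <new>         // std::bad_alloc, placement new
#include <type_traits> // alignment_of

template <typename T, const std::size_t MAX_SIZE>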
class fixed_allocator
{
private:

  static const bool FREE   = false;
  static const bool IN_USE = true;

public:

  typedef T                  value_type;
  typedef value_type *       pointer;
  typedef const value_type * const_pointer;
  typedef value_type &       reference;
  typedef const value_type &  const_reference;
  typedef std::size_t        size_type;
  typedef std::ptrdiff_t     difference_type;

  //*********************************************************************
  // rebind
  //*********************************************************************
  template<typename U>
  struct rebind
  {
      typedef fixed_allocator<U, MAX_SIZE> other;
  };

  //*********************************************************************
  // Constructor
  //*********************************************************************
  fixed_allocator()
    : p_first_free(in_use)
  {
    initialise();
  }

  //*********************************************************************
  // Copy constructor
  //*********************************************************************
  fixed_allocator(const fixed_allocator &rhs)
    : p_first_free(in_use)
  {
    initialise();
  }

  //*********************************************************************
  // Templated copy constructor
  //*********************************************************************
  template<typename U>
  fixed_allocator(const fixed_allocator<U, MAX_SIZE>&rhs)
      : p_first_free(in_use)
  {
    initialise();
  }

  //*********************************************************************
  // Destructor
  //*********************************************************************
  ~fixed_allocator()
  {
  }

  //*********************************************************************
  // address
  //*********************************************************************
  pointer address(reference x) const
  {
    return (&x);
  }

  //*********************************************************************
  // address
  //*********************************************************************
  const_pointer address(const_reference x) const
  {
    return (&x);
  }

  //*********************************************************************
  // allocate
  // Allocates from the internal array.
  //*********************************************************************
  pointer allocate(size_type n, const_pointer cp = 0)
  {
    // Pointers to the 'in_use' flags.
    bool *p_first     = p_first_free;
    bool *const p_end = &in_use[MAX_SIZE];

    // 'Find first fit' allocation algorithm, starting from the first free slot.
    // If n == 1 then we already have the free slot address or p_end.
    if (n == 1)
    {
      // No space left?
      if (p_first == p_end)
      {
        throw std::bad_alloc();
      }

      // Mark the slot as 'in use'
      *p_first = IN_USE;
    }
    else
    {
      // Search for a big enough range of free slots.
      p_first = std::search_n(p_first, p_end, static_cast<long>(n), FREE);

      // Not enough space found?
      if (p_first == p_end)
      {
        throw std::bad_alloc();
      }

      // Mark the range as 'in use'
      std::fill(p_first, p_first + n, IN_USE);
    }

    // Update the 'first free' pointer if necessary.
    if (p_first == p_first_free)
    {
      // Find the next free slot or p_end
      p_first_free = std::find(p_first + n, p_end, FREE);
    }

    // Return the memory allocation.
    const size_t offset = std::distance(in_use, p_first) * sizeof(value_type);

    return (reinterpret_cast<pointer>(&p_buffer[offset]));
  }

  //*********************************************************************
  // deallocate
  // Clears the 'in_use' flags for the deallocated items.
  //*********************************************************************
  void deallocate(pointer p, size_type n)
  {
    // Find the start of the range.
    size_t index = std::distance(p_buffer, reinterpret_cast<char *>(p)) / sizeof(value_type);

    bool *p_first = &in_use[index];

    // Mark the range as 'free'.
    if (n == 1)
    {
      *p_first = FREE;
    }
    else
    {
      std::fill(p_first, p_first + n, FREE);
    }

    // Update the 'first free' pointer if necessary.
    if (p_first < p_first_free)
    {
      p_first_free = p_first;
    }
  }

  //*********************************************************************
  // max_size
  // Returns the maximum size that can be allocated in total.
  //*********************************************************************
  size_type max_size() const
  {
      return (MAX_SIZE);
  }

  //*********************************************************************
  // construct
  // Constructs an item.
  //*********************************************************************
  void construct(pointer p, const value_type &x)
  {
    // Placement 'new'
    new (p)value_type(x);
  }

  //*********************************************************************
  // destroy
  // Destroys an item.
  //*********************************************************************
  void destroy(pointer p)
  {
    // Call the destructor.
    p->~value_type();
  }

private:

  enum
  {
    ALIGNMENT = std::alignment_of<T>::value - 1 // The alignment of the buffer - 1
  };

  //*********************************************************************
  // initialise
  // Initialises the internal allocation buffers.
  //*********************************************************************
  void initialise()
  {
    // Ensure alignment.
    p_buffer = reinterpret_cast<char *>((reinterpret_cast<size_t>(&buffer[0]) + ALIGNMENT) & ~ALIGNMENT);

    // Mark all slots as free.
    std::fill(in_use, in_use + MAX_SIZE, FREE);
  }

  // Disabled operator.
  void operator =(const fixed_allocator &);

  // The allocation buffer. Ensure enough space for correct alignment.
  char buffer[(MAX_SIZE * sizeof(value_type)) + ALIGNMENT + 1];

  // Pointer to the first valid location in the buffer after alignment.
  char *p_buffer;

  // The flags that indicate which slots are in use.
  bool in_use[MAX_SIZE];

  // Pointer to the first free slot.
  bool *p_first_free;
};

//*********************************************************************
// operator ==
// Equality operator.
//*********************************************************************
template<typename T, const size_t MAX_SIZE>
inline bool operator ==(const fixed_allocator<T, MAX_SIZE> &,
                        const fixed_allocator<T, MAX_SIZE> &)
{
  return (false);
}

//*********************************************************************
// operator !=
// Inequality operator.
//*********************************************************************
template<typename T, const size_t MAX_SIZE>
inline bool operator !=(const fixed_allocator<T, MAX_SIZE> &,
                        const fixed_allocator<T, MAX_SIZE> &)
{
  return (true);
}

The equality operator always returns false because, unlike instances of the standard allocator, two fixed allocators are never equivalent: one allocator cannot release the resources allocated by another.

Most modern STL implementations will check this and take the appropriate action.

Fixed block allocator

In the case of std::vector and std::string the previous allocator is not a particularly good choice.
They benefit from an allocator that understands that these containers' elements are stored contiguously.

The block allocator described below also uses an array as its memory pool, but does not need to search for free blocks.
The implementation will either use a single array or swap between two. This is entirely dependent on whether the type stored has a trivial destructor or not.

If a type does not have a trivial destructor then an increase in size above the capacity will require the existing elements
to be copied to the alternate array before the destructors are called. Unfortunately there appears to be no way around this, as the behaviour is built into the containers, other than calling ‘reserve’ with the maximum size up front, which ensures that no change in capacity will occur.

Types with trivial destructors may use a single array.
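The choice between one and two buffers, and the way allocations alternate between them, can be sketched as follows. The names buffer_count and next_buffer are hypothetical, and the sketch assumes the C++11 trait std::is_trivially_destructible.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper: how many buffers a block allocator for T would need.
// One is enough when destroying an element is a no-op; otherwise the old
// elements must stay alive in one buffer while they are copied to the other.
template <typename T>
struct buffer_count
{
  enum
  {
    value = std::is_trivially_destructible<T>::value ? 1 : 2
  };
};

// Hypothetical helper: the index of the buffer the next allocation comes from.
inline int next_buffer(int current, int number_of_buffers)
{
  assert(number_of_buffers > 0);
  return (current + 1) % number_of_buffers;
}
```

With a single buffer next_buffer always yields 0, so every allocation returns the same block; with two buffers the index alternates between 0 and 1 on each allocation.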

The code for fixed_block_allocator

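#include <cstddef>     // std::size_t, std::ptrdiff_t
#include <new>         // std::bad_alloc, placement new
#include <type_traits> // type traits used to select the number of buffers

template <typename T, const std::size_t MAX_SIZE>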
class fixed_block_allocator
{
public:
  typedef T                  value_type;
  typedef value_type *       pointer;
  typedef const value_type * const_pointer;
  typedef value_type &       reference;
  typedef const value_type & const_reference;
  typedef std::size_t        size_type;
  typedef std::ptrdiff_t     difference_type;

  enum
  {
    // The number of buffers to use. Varies according to the type.
    NUMBER_OF_BUFFERS = std::is_trivially_destructible<T>::value ? 1 : 2
  };

  //*********************************************************************
  // rebind
  //*********************************************************************
  template<typename U>
  struct rebind
  {
      typedef fixed_block_allocator<U, MAX_SIZE> other;
  };

  //*********************************************************************
  // Constructor
  //*********************************************************************
  fixed_block_allocator()
      : buffer_id(0)
  {
      initialise();
  }

  //*********************************************************************
  // Copy constructor
  //*********************************************************************
  fixed_block_allocator(const fixed_block_allocator &rhs)
      : buffer_id(0)
  {
      initialise();
  }

  //*********************************************************************
  // Templated copy constructor
  //*********************************************************************
  template<typename U>
  fixed_block_allocator(const fixed_block_allocator<U, MAX_SIZE> &rhs)
      : buffer_id(0)
  {
      initialise();
  }

  //*********************************************************************
  // Destructor
  //*********************************************************************
  ~fixed_block_allocator()
  {
  }

  //*********************************************************************
  // address
  //*********************************************************************
  pointer address(reference x) const
  {
      return (&x);
  }

  //*********************************************************************
  // address
  //*********************************************************************
  const_pointer address(const_reference x) const
  {
      return (&x);
  }

  //*********************************************************************
  // allocate
  // Allocates from the internal array.
  // If storage cannot be allocated then std::bad_alloc() is thrown.
  //*********************************************************************
  pointer allocate(size_type     n,
                   const_pointer cp = 0)
  {
      // Just too big?
      if (n > MAX_SIZE)
      {
          throw std::bad_alloc();
      }

      // Get the next buffer.
      buffer_id = (buffer_id + 1) % NUMBER_OF_BUFFERS;

      // Always return the beginning of the buffer.
      return (reinterpret_cast<pointer>(p_buffer[buffer_id]));
  }

  //*********************************************************************
  // deallocate
  // Does nothing.
  //*********************************************************************
  void deallocate(pointer   p,
                  size_type n)
  {
  }

  //*********************************************************************
  // max_size
  // Returns the maximum size that can be allocated in total.
  //*********************************************************************
  size_type max_size() const
  {
      return (MAX_SIZE);
  }

  //*********************************************************************
  // construct
  // Constructs an item.
  //*********************************************************************
  void construct(pointer          p,
                 const value_type &x)
  {
      new (p)value_type(x);
  }

  //*********************************************************************
  // destroy
  // Destroys an item.
  //*********************************************************************
  void destroy(pointer p)
  {
      p->~value_type();
  }

private:

  enum
  {
      ALIGNMENT = std::alignment_of<T>::value - 1 // The alignment of the buffers - 1
  };

  //*********************************************************************
  // initialise
  // Initialises the internal allocation buffers.
  //*********************************************************************
  void initialise()
  {
      // Ensure alignment.
      for (int i = 0; i < NUMBER_OF_BUFFERS; ++i)
      {
          p_buffer[i] = reinterpret_cast<char *>((reinterpret_cast<size_t>(&buffer[i][0]) + ALIGNMENT) & ~ALIGNMENT);
      }
  }

  // Disabled operator.
  void operator =(const fixed_block_allocator &);

  // The allocation buffers. Ensure enough space for correct alignment.
  char buffer[NUMBER_OF_BUFFERS][(MAX_SIZE * sizeof(value_type)) + ALIGNMENT + 1];

  // Pointers to the first valid locations in the buffers after alignment.
  char *p_buffer[NUMBER_OF_BUFFERS];

  // The index of the currently allocated buffer.
  int buffer_id;
};

//*********************************************************************
// operator ==
// Equality operator.
//*********************************************************************
template<typename T, const size_t MAX_SIZE>
inline bool operator ==(const fixed_block_allocator<T, MAX_SIZE> &,
                        const fixed_block_allocator<T, MAX_SIZE> &)
{
    return (false);
}

//*********************************************************************
// operator !=
// Inequality operator.
//*********************************************************************
template<typename T, const size_t MAX_SIZE>
inline bool operator !=(const fixed_block_allocator<T, MAX_SIZE> &,
                        const fixed_block_allocator<T, MAX_SIZE> &)
{
    return (true);
}

Caveats

Yes, there are some downsides, but not too many.

  • Some containers will try to allocate more elements than you may expect.
    std::string will probably require one extra for a terminating zero for use when c_str() is called.
    std::deque will allocate a fixed number of blocks every time its capacity is increased. Your max size may need to be a multiple of this.
    Only experiment will tell what the situation is for your STL implementation.
  • std::swap will always involve a copy as there are no handy pointers to exchange.
  • Move semantics (rvalue references) will also not apply for the same reason.
  • If non-equivalent allocators are not supported in your STL implementation then std::list’s splice will not work.
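One way to run the experiment mentioned in the first caveat is to wrap the heap in an allocator that records the largest request it sees. The sketch below is an assumption-laden illustration: probing_allocator, largest_request and probe_string are hypothetical names, the allocator forwards to operator new, and a C++11 library is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: the largest element count passed to allocate() so far.
inline std::size_t &largest_request()
{
  static std::size_t largest = 0;
  return largest;
}

// Hypothetical minimal allocator that records request sizes and forwards to operator new.
template <typename T>
struct probing_allocator
{
  typedef T value_type;

  probing_allocator() {}

  template <typename U>
  probing_allocator(const probing_allocator<U> &) {}

  T *allocate(std::size_t n)
  {
    if (n > largest_request())
    {
      largest_request() = n;
    }
    return static_cast<T *>(::operator new(n * sizeof(T)));
  }

  void deallocate(T *p, std::size_t)
  {
    ::operator delete(p);
  }
};

template <typename T, typename U>
bool operator ==(const probing_allocator<T> &, const probing_allocator<U> &) { return true; }

template <typename T, typename U>
bool operator !=(const probing_allocator<T> &, const probing_allocator<U> &) { return false; }

typedef std::basic_string<char, std::char_traits<char>, probing_allocator<char>> probed_string;

// Builds a string of 'count' characters and returns the largest number of
// elements the string asked its allocator for while doing so.
inline std::size_t probe_string(std::size_t count)
{
  largest_request() = 0;

  probed_string s(count, 'x');
  assert(s.size() == count);

  return largest_request();
}
```

For a 100 character string the reported figure should be more than 100 on common implementations, which gives a concrete lower bound for the MAX_SIZE of a stack allocator used with strings of that length. The same probe can be pointed at std::deque or any other container.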
