Portability of the STL remove_if Algorithm (or The Lack of)

Environment: CPP, STL, Portable

The standard of C++ [ISO98] does not specify all the issues about writing portable code. There are some issues that are left to the implementation of the compiler and library. All of those issues should be handled carefully to make your code portable and this makes writing portable code challenging as well as fun.

The same situation may arise in case of using standard template library. The standards specify the specification of an algorithm but not the implementation detail of the algorithm, so it may be different on different platforms. One such situation may arise in the remove_if algorithm in the standard template library [JOS99].

Let's take a look at the first simple example of the usage of the remove_if algorithm. This algorithm requires two forward iterators and one predicate.

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)

According the Standard of C++, "It is unspecified whether any global function in the C++ Standard Library is defined as inline." So, it may or may not be inline on different platforms but this is not our immediate problem.

A predicate can be a function object or a function pointer. The behavior of the remove_if algorithm is slightly different between the function object and function pointer, although it shouldn't be. A function object can store the state in a member variable; however, a function pointer cannot. In case of a function pointer predicate, the function can maintain its state by taking a static variable to simulate the behavior of function object, but you can make two objects of the function object store two different states and it is not possible in case of simple functions. And this is the slight difference that may create different behaviors on different implementations.

Here is the example to show the usage of a function pointer as a predicate. The Predicate function is used to remove the third element in the vector.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

bool Remove3rd(int iVal)
{
  return 3 == iVal;
}

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          Remove3rd);

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(),
       ostream_iterator<int>(cout, " "));

  return 0;
}

The output of this program is:

1 2 4 5 6 7 8 9

This is the exactly same output, which we need. Now, try to make the same program with a function object as a predicate instead of a function pointer.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

template <typename T>
class Remove3rd
{
public:
  bool operator () (T iValue)
  {
    return 3 == iValue;
  }
};

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          Remove3rd<int>());

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, " "));

  return 0;
}

The output of this program is also the same. Now, let's the advantage of the function object and try to make it more generalized. Currently, we can remove only the third element from the vector, but with the help of one member variable, we can make it configurable. And, with the help of one more member variable, we can store the count of how many times the operator()() function is being called.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

template <typename T>
class RemoveNth
{
private:
  T m_tValue;
  T m_iCount;

public:
  RemoveNth(T p_tValue) : m_tValue(p_tValue), m_iCount(0) { }
  bool operator () (T)
  {
    return ++m_iCount == m_tValue;
  }
};

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          RemoveNth<int>(3));

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(), ostream_iterator<int>>(cout, " "));

  return 0;
}

What will be the output of this program? The output of this program depends on the implementation of the remove_if algorithm. You might get the output:

1 2 4 5 6 7 8 9

or you might get this output:

1 2 4 5 7 8 9

The implementation of remove_if might call the find_if algorithm first to search an element in the container, which satisfies the predicate. Its implementation might be something like this:

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)
{
  // remove each satisfying _Pred
  _First = find_if(_First, _Last, _Pred);

  if (_First == _Last)
    return (_First);    // empty sequence, all done
  else
  {
    // nonempty sequence, worth doing
    _FwdIt _First1 = _First;
    return (remove_copy_if(++_First1, _Last, _First, _Pred));
  }
}

And the signature of the find_if algorithm may be something like this:

template<class _InIt, class _Pr> inline
_InIt find_if(_InIt _First, _InIt _Last, _Pr _Pred)

Take a look at the last parameter of this function predicate; it is passed by value, not by reference, in this function and this is exactly required by the standard. Therefore, its ( thepredicate's) copy is created, which is passed in the find_if algorithm. So, this predicate is different than what is passed in the remove_if algorithm. In short, one copy of the predicate is responsible for removing the third element and the other copy of the predicate is responsible for removing the sixth element. Note that 9 is not removed from the list; this means it doesn't remove all elements whose position is a multiple of the third.

On the other hand, if the implementation of the remove_if algorithm is something like this, its behavior is different.

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)
{
  // remove each satisfying _Pred
  for (; _First != _Last; ++_First)
    if (_Pred(*_First))
      break;

  if (_First == _Last)
    return (_First);    // empty sequence, all done
  else
  {
    // nonempty sequence, worth doing
    _FwdIt _First1 = _First;
    return (remove_copy_if(++_First1, _Last, _First, _Pred));
  }
}

Then, the output of the program is something like this:

1 2 4 5 6 7 8 9

That is equal to the function pointer predicate. The other possible solution is to pass the predicate by reference in the find_if algorithm, but this is not what the standard requires. It is not specified in the standard how to implement the remove_if algorithm; therefore, vendors of the library are free to choose any method.

References

  1. [ISO98] International Standard Programming Language C++ ISO/ICE 14882 1998
  2. [JOS99] The C++ Standard Library: A Tutorial and Reference. Nicolai M. Josuttis 1999


About the Author

Zeeshan Amjad

C++ Developer at Bechtel Corporation. zamjad.wordpress.com