Portability of the STL remove_if Algorithm (or The Lack of)

Environment: CPP, STL, Portable

The standard of C++ [ISO98] does not specify all the issues about writing portable code. There are some issues that are left to the implementation of the compiler and library. All of those issues should be handled carefully to make your code portable and this makes writing portable code challenging as well as fun.

The same situation may arise in case of using standard template library. The standards specify the specification of an algorithm but not the implementation detail of the algorithm, so it may be different on different platforms. One such situation may arise in the remove_if algorithm in the standard template library [JOS99].

Let's take a look at the first simple example of the usage of the remove_if algorithm. This algorithm requires two forward iterators and one predicate.

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)

According the Standard of C++, "It is unspecified whether any global function in the C++ Standard Library is defined as inline." So, it may or may not be inline on different platforms but this is not our immediate problem.

A predicate can be a function object or a function pointer. The behavior of the remove_if algorithm is slightly different between the function object and function pointer, although it shouldn't be. A function object can store the state in a member variable; however, a function pointer cannot. In case of a function pointer predicate, the function can maintain its state by taking a static variable to simulate the behavior of function object, but you can make two objects of the function object store two different states and it is not possible in case of simple functions. And this is the slight difference that may create different behaviors on different implementations.

Here is the example to show the usage of a function pointer as a predicate. The Predicate function is used to remove the third element in the vector.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

bool Remove3rd(int iVal)
{
  return 3 == iVal;
}

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          Remove3rd);

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(),
       ostream_iterator<int>(cout, " "));

  return 0;
}

The output of this program is:

1 2 4 5 6 7 8 9

This is the exactly same output, which we need. Now, try to make the same program with a function object as a predicate instead of a function pointer.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

template <typename T>
class Remove3rd
{
public:
  bool operator () (T iValue)
  {
    return 3 == iValue;
  }
};

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          Remove3rd<int>());

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, " "));

  return 0;
}

The output of this program is also the same. Now, let's the advantage of the function object and try to make it more generalized. Currently, we can remove only the third element from the vector, but with the help of one member variable, we can make it configurable. And, with the help of one more member variable, we can store the count of how many times the operator()() function is being called.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

template <typename T>
class RemoveNth
{
private:
  T m_tValue;
  T m_iCount;

public:
  RemoveNth(T p_tValue) : m_tValue(p_tValue), m_iCount(0) { }
  bool operator () (T)
  {
    return ++m_iCount == m_tValue;
  }
};

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          RemoveNth<int>(3));

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(), ostream_iterator<int>>(cout, " "));

  return 0;
}

What will be the output of this program? The output of this program depends on the implementation of the remove_if algorithm. You might get the output:

1 2 4 5 6 7 8 9

or you might get this output:

1 2 4 5 7 8 9

The implementation of remove_if might call the find_if algorithm first to search an element in the container, which satisfies the predicate. Its implementation might be something like this:

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)
{
  // remove each satisfying _Pred
  _First = find_if(_First, _Last, _Pred);

  if (_First == _Last)
    return (_First);    // empty sequence, all done
  else
  {
    // nonempty sequence, worth doing
    _FwdIt _First1 = _First;
    return (remove_copy_if(++_First1, _Last, _First, _Pred));
  }
}

And the signature of the find_if algorithm may be something like this:

template<class _InIt, class _Pr> inline
_InIt find_if(_InIt _First, _InIt _Last, _Pr _Pred)

Take a look at the last parameter of this function predicate; it is passed by value, not by reference, in this function and this is exactly required by the standard. Therefore, its ( thepredicate's) copy is created, which is passed in the find_if algorithm. So, this predicate is different than what is passed in the remove_if algorithm. In short, one copy of the predicate is responsible for removing the third element and the other copy of the predicate is responsible for removing the sixth element. Note that 9 is not removed from the list; this means it doesn't remove all elements whose position is a multiple of the third.

On the other hand, if the implementation of the remove_if algorithm is something like this, its behavior is different.

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)
{
  // remove each satisfying _Pred
  for (; _First != _Last; ++_First)
    if (_Pred(*_First))
      break;

  if (_First == _Last)
    return (_First);    // empty sequence, all done
  else
  {
    // nonempty sequence, worth doing
    _FwdIt _First1 = _First;
    return (remove_copy_if(++_First1, _Last, _First, _Pred));
  }
}

Then, the output of the program is something like this:

1 2 4 5 6 7 8 9

That is equal to the function pointer predicate. The other possible solution is to pass the predicate by reference in the find_if algorithm, but this is not what the standard requires. It is not specified in the standard how to implement the remove_if algorithm; therefore, vendors of the library are free to choose any method.

References

  1. [ISO98] International Standard Programming Language C++ ISO/ICE 14882 1998
  2. [JOS99] The C++ Standard Library: A Tutorial and Reference. Nicolai M. Josuttis 1999


About the Author

Zeeshan Amjad

C++ Developer at Bechtel Corporation. zamjad.wordpress.com

Comments

  • What the article shows

    Posted by Legacy on 01/12/2004 12:00am

    Originally posted by: Yves

    I think that the article shows one thing very clearly:
    "Do not rely on unspecified behaviour."

    I checked through the standard (latest version with amendments) and could find nothing that would indicate that you could not use predicates with a state in STL algorithms. It's not the internal state of the predicate that's at fault here, but the fact that it assumes that a) it will not be copied and b) the fact that remove_if will check each element from left to right. Anyways, regarding your question, I've seen that all implementations I've come across first call find_if, so they all produce the 1 2 4 5 7 8 9 sequence. This with VC6 and VC7 Dinkumware, STLPort and gcc's implementation of STL.

    This said, it seems like a very bad idea to use a predicate like the one in the article for the obvious reason that the implementation is not specified by the standard. For bidirectional iterators, remove_if could just as well start at the end of the sequence e.g. without violating the requirements. If the article shows one thing it's that the only thing you can rely on in STL is the specified behaviour. If it's not specified, it can be implemented in any way and so be not portable across different STL implementations.

    So in essence, it's not that remove_if is not portable, but that the predicate Zeeshan uses is non-portable. It relies on a particular behaviour that is not specified.

    Reply
  • Doesn't the result "1 2 4 5 7 8 9" violate the standard then?

    Posted by Legacy on 01/09/2004 12:00am

    Originally posted by: kevinHall

    It seems to me that the version of remove_if that produces "1 2 4 5 7 8 9" would be violating the standard then. Is this a correct assessment?

    Also, what STL version produced the "1 2 4 5 7 8 9" result? Is it some popular STL version? Would it happen to be the original STL that ships with MSVC 6.0?

    Reply
Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • Wednesday, September 24, 2014 8:00 AM - 9:00 AM PDT According to a recent Forrester Research report, many companies are choosing low-code platforms over traditional programming platforms, due to the speed with which low-code apps can be assembled and tested. With customer-facing applications on the rise, traditional programming platforms simply can't keep up with the "short schedules and rapid change cycles" required to develop these applications. Check out this upcoming webinar and join Clay Richardson from …

  • Java developers know that testing code changes can be a huge pain, and waiting for an application to redeploy after a code fix can take an eternity. Wouldn't it be great if you could see your code changes immediately, fine-tune, debug, explore and deploy code without waiting for ages? In this white paper, find out how that's possible with a Java plugin that drastically changes the way you develop, test and run Java applications. Discover the advantages of this plugin, and the changes you can expect to see …

Most Popular Programming Stories

More for Developers

Latest Developer Headlines

RSS Feeds