Portability of the STL remove_if Algorithm (or The Lack of)

Environment: CPP, STL, Portable

The standard of C++ [ISO98] does not specify all the issues about writing portable code. There are some issues that are left to the implementation of the compiler and library. All of those issues should be handled carefully to make your code portable and this makes writing portable code challenging as well as fun.

The same situation may arise in case of using standard template library. The standards specify the specification of an algorithm but not the implementation detail of the algorithm, so it may be different on different platforms. One such situation may arise in the remove_if algorithm in the standard template library [JOS99].

Let's take a look at the first simple example of the usage of the remove_if algorithm. This algorithm requires two forward iterators and one predicate.

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)

According the Standard of C++, "It is unspecified whether any global function in the C++ Standard Library is defined as inline." So, it may or may not be inline on different platforms but this is not our immediate problem.

A predicate can be a function object or a function pointer. The behavior of the remove_if algorithm is slightly different between the function object and function pointer, although it shouldn't be. A function object can store the state in a member variable; however, a function pointer cannot. In case of a function pointer predicate, the function can maintain its state by taking a static variable to simulate the behavior of function object, but you can make two objects of the function object store two different states and it is not possible in case of simple functions. And this is the slight difference that may create different behaviors on different implementations.

Here is the example to show the usage of a function pointer as a predicate. The Predicate function is used to remove the third element in the vector.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

bool Remove3rd(int iVal)
{
  return 3 == iVal;
}

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          Remove3rd);

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(),
       ostream_iterator<int>(cout, " "));

  return 0;
}

The output of this program is:

1 2 4 5 6 7 8 9

This is the exactly same output, which we need. Now, try to make the same program with a function object as a predicate instead of a function pointer.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

template <typename T>
class Remove3rd
{
public:
  bool operator () (T iValue)
  {
    return 3 == iValue;
  }
};

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          Remove3rd<int>());

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, " "));

  return 0;
}

The output of this program is also the same. Now, let's the advantage of the function object and try to make it more generalized. Currently, we can remove only the third element from the vector, but with the help of one member variable, we can make it configurable. And, with the help of one more member variable, we can store the count of how many times the operator()() function is being called.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

template <typename T>
class RemoveNth
{
private:
  T m_tValue;
  T m_iCount;

public:
  RemoveNth(T p_tValue) : m_tValue(p_tValue), m_iCount(0) { }
  bool operator () (T)
  {
    return ++m_iCount == m_tValue;
  }
};

int main()
{
  vector<int> vec;

  for (int iIndex = 1; iIndex < 10; ++iIndex)
  {
    vec.push_back(iIndex);
  }

  vector<int>::iterator iter_ = remove_if(vec.begin(), vec.end(),
                                          RemoveNth<int>(3));

  vec.erase(iter_, vec.end());

  copy(vec.begin(), vec.end(), ostream_iterator<int>>(cout, " "));

  return 0;
}

What will be the output of this program? The output of this program depends on the implementation of the remove_if algorithm. You might get the output:

1 2 4 5 6 7 8 9

or you might get this output:

1 2 4 5 7 8 9

The implementation of remove_if might call the find_if algorithm first to search an element in the container, which satisfies the predicate. Its implementation might be something like this:

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)
{
  // remove each satisfying _Pred
  _First = find_if(_First, _Last, _Pred);

  if (_First == _Last)
    return (_First);    // empty sequence, all done
  else
  {
    // nonempty sequence, worth doing
    _FwdIt _First1 = _First;
    return (remove_copy_if(++_First1, _Last, _First, _Pred));
  }
}

And the signature of the find_if algorithm may be something like this:

template<class _InIt, class _Pr> inline
_InIt find_if(_InIt _First, _InIt _Last, _Pr _Pred)

Take a look at the last parameter of this function predicate; it is passed by value, not by reference, in this function and this is exactly required by the standard. Therefore, its ( thepredicate's) copy is created, which is passed in the find_if algorithm. So, this predicate is different than what is passed in the remove_if algorithm. In short, one copy of the predicate is responsible for removing the third element and the other copy of the predicate is responsible for removing the sixth element. Note that 9 is not removed from the list; this means it doesn't remove all elements whose position is a multiple of the third.

On the other hand, if the implementation of the remove_if algorithm is something like this, its behavior is different.

template<class _FwdIt, class _Pr> inline
_FwdIt remove_if(_FwdIt _First, _FwdIt _Last, _Pr _Pred)
{
  // remove each satisfying _Pred
  for (; _First != _Last; ++_First)
    if (_Pred(*_First))
      break;

  if (_First == _Last)
    return (_First);    // empty sequence, all done
  else
  {
    // nonempty sequence, worth doing
    _FwdIt _First1 = _First;
    return (remove_copy_if(++_First1, _Last, _First, _Pred));
  }
}

Then, the output of the program is something like this:

1 2 4 5 6 7 8 9

That is equal to the function pointer predicate. The other possible solution is to pass the predicate by reference in the find_if algorithm, but this is not what the standard requires. It is not specified in the standard how to implement the remove_if algorithm; therefore, vendors of the library are free to choose any method.

References

  1. [ISO98] International Standard Programming Language C++ ISO/ICE 14882 1998
  2. [JOS99] The C++ Standard Library: A Tutorial and Reference. Nicolai M. Josuttis 1999


About the Author

Zeeshan Amjad

C++ Developer at Bechtel Corporation. zamjad.wordpress.com

Comments

  • What the article shows

    Posted by Legacy on 01/12/2004 12:00am

    Originally posted by: Yves

    I think that the article shows one thing very clearly:
    "Do not rely on unspecified behaviour."

    I checked through the standard (latest version with amendments) and could find nothing that would indicate that you could not use predicates with a state in STL algorithms. It's not the internal state of the predicate that's at fault here, but the fact that it assumes that a) it will not be copied and b) the fact that remove_if will check each element from left to right. Anyways, regarding your question, I've seen that all implementations I've come across first call find_if, so they all produce the 1 2 4 5 7 8 9 sequence. This with VC6 and VC7 Dinkumware, STLPort and gcc's implementation of STL.

    This said, it seems like a very bad idea to use a predicate like the one in the article for the obvious reason that the implementation is not specified by the standard. For bidirectional iterators, remove_if could just as well start at the end of the sequence e.g. without violating the requirements. If the article shows one thing it's that the only thing you can rely on in STL is the specified behaviour. If it's not specified, it can be implemented in any way and so be not portable across different STL implementations.

    So in essence, it's not that remove_if is not portable, but that the predicate Zeeshan uses is non-portable. It relies on a particular behaviour that is not specified.

    Reply
  • Doesn't the result "1 2 4 5 7 8 9" violate the standard then?

    Posted by Legacy on 01/09/2004 12:00am

    Originally posted by: kevinHall

    It seems to me that the version of remove_if that produces "1 2 4 5 7 8 9" would be violating the standard then. Is this a correct assessment?

    Also, what STL version produced the "1 2 4 5 7 8 9" result? Is it some popular STL version? Would it happen to be the original STL that ships with MSVC 6.0?

    Reply
Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • On-demand Event Event Date: December 18, 2014 The Internet of Things (IoT) incorporates physical devices into business processes using predictive analytics. While it relies heavily on existing Internet technologies, it differs by including physical devices, specialized protocols, physical analytics, and a unique partner network. To capture the real business value of IoT, the industry must move beyond customized projects to general patterns and platforms. Check out this webcast and join industry experts as …

  • On-demand Event Event Date: October 29, 2014 It's well understood how critical version control is for code. However, its importance to DevOps isn't always recognized. The 2014 DevOps Survey of Practice shows that one of the key predictors of DevOps success is putting all production environment artifacts into version control. In this webcast, Gene Kim discusses these survey findings and shares woeful tales of artifact management gone wrong! Gene also shares examples of how high-performing DevOps …

Most Popular Programming Stories

More for Developers

RSS Feeds