dcsimg

WEBINAR:
On-Demand

Application Modernization: What Is It and How to Get Started


Introduction

Ever since I started working with Python, that has gotten me into a lot thinking how to redesign my libraries to be pythonic, if I were to implement them from scratch again. In this first article of the series, I want to introduce Python's wonderful and intuitive operators for working with set algebra into the C++ world. These operators are nothing more than syntatic-sugar to reduce the amount of code to write.

C++ Reference: Table of Python Set Operators
Figure 1: C++ Reference: Table of Python Set Operators

Set Intersection

C++ Reference: Set intersection
Figure 2: C++ Reference: Set intersection

Commutative

set_intersection is an algorithm that produces a set of elements that are common to both sets. It is commutative, meaning that even if the two sets switched places, the algorithm returns the same result.

void intersection_example()
{
   std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
   std::vector<int> v2{ 5, 7, 9,10 };
   std::sort(v1.begin(), v1.end());
   std::sort(v2.begin(), v2.end());
   std::vector<int> v_intersection;
   std::set_intersection(v1.begin(), v1.end(), v2.begin(),
      v2.end(),
   std::back_inserter(v_intersection));
   for (int n : v_intersection) std::cout << n << ' ';
}

Output

5 7

This is the example using the & operator to do an intersection.

void intersection_example()
{
   std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
   std::vector<int> v2{ 5, 7, 9,10 };
   std::vector<int> v_intersection = s(v1) & s(v2);
   for (int n : v_intersection)
      std::cout << n << ' ';
}

I skip showing the output of the operator example because it is the same.

s is a function, not a class. If s is to be a class, to instantiate it, a container type would have to be specified (see below).

std::vector<int> v_intersection = s<std::vector
   <int>>(v1) & s<std::vector<int>>(v2);

To make use of automatic type deduction, s has to be a function that does nothing but return the wrapper class.

#include <algorithm>
#include <iterator>
template<typename T>
struct wrapper
{
   wrapper(T& container) : cont(container) {} T& cont; };
   template<typename T>
   wrapper<T> s(T& s_cont)
   {
      return wrapper<T>(s_cont);
}

The & operator function checks whether to sort the container. Because std::sort only works with random access iterators, we cannot use this function with an STL list and slist, which have non-random access iterators. In my 15 years of work, I have not seen a single use of the list in any codebase.

template<typename T> T operator&(wrapper<T>& left,
   wrapper<T>& right)
{
   T& c1 = left.cont;
   T& c2 = right.cont;
   if (!std::is_sorted(c1.begin(), c1.end()))
      std::sort(c1.begin(), c1.end());
   if (!std::is_sorted(c2.begin(), c2.end()))
      std::sort(c2.begin(), c2.end());
   T v_intersection; std::set_intersection(c1.begin(), c1.end(),
      c2.begin(), c2.end(), std::back_inserter(v_intersection));
   return std::move(v_intersection);
}

All set algorithm preconditions require the ranges to be sorted. Therefore, this is a sorted check.

Set Union

C++ Reference: Set union
Figure 3: C++ Reference: Set union

Commutative

set_union is an algorithm that produces a set of elements from both sets. For the elements appearing in the intersection, it always picks them from the first set, not the second set.

void union_example()
{
   std::vector<int> v1 = { 1, 2, 3, 4, 5 };
   std::vector<int> v2 = { 3, 4, 5, 6, 7 };
   std::sort(v1.begin(), v1.end());
   std::sort(v2.begin(), v2.end());
   std::vector<int> dest1;
   std::set_union(v1.begin(), v1.end(), v2.begin(), v2.end(),
      std::back_inserter(dest1));
   for (const auto &i : dest1)
   {
      std::cout << i << ' ';
   }
   std::cout << '\n';
}

Output

1 2 3 4 5 6 7

The code required to write this is much less. Therefore, the code is more concise.

void union_example()
{
   std::vector<int> v1 = { 1, 2, 3, 4, 5 };
   std::vector<int> v2 = { 3, 4, 5, 6, 7 };
   std::vector<int> dest1 = s(v1) | s(v2);
   for (int n : dest1) std::cout << n << ' ';
}

The | operator is almost similar to the & operator except that the algorithm is different.

template<typename T> T operator|(wrapper<T>& left,
   wrapper<T>& right)
{
   T& c1 = left.cont; T& c2 = right.cont;
   if (!std::is_sorted(c1.begin(), c1.end())) std::sort(c1.begin(),
      c1.end());
   if (!std::is_sorted(c2.begin(), c2.end())) std::sort(c2.begin(),
      c2.end());
   T dest1; std::set_union(c1.begin(), c1.end(), c2.begin(),
      c2.end(),
   std::back_inserter(dest1));
   return std::move(dest1);
}

Set Difference

C++ Reference: Set difference
Figure 4: C++ Reference: Set difference

Non-Commutative

set_difference returns the elements in first set that is not in the second set; it is represented by the minus operator in Python. For obvious reasons, the result is different when the arguments have swapped place. set_difference is non-commutative, just as the minus operation.

void set_difference_example()
{
   std::vector<int> v1{ 1, 2, 5, 5, 5, 9 };
   std::vector<int> v2{ 2, 5, 7 };
   std::sort(v1.begin(), v1.end());
   std::sort(v2.begin(), v2.end());
   std::vector<int> diff;
   std::set_difference(v1.begin(), v1.end(), v2.begin(), v2.end(),
      std::inserter(diff, diff.begin()));
   for (auto i : v1) std::cout << i << ' ';
   std::cout << "minus "; for (auto i : v2)
   std::cout << i << ' ';
   std::cout << "is: ";
   for (auto i : diff) std::cout << i << ' ';
   std::cout << '\n';
}

Output

1 2 5 5 5 9 minus 2 5 7 is: 1 5 5 9

This is an example of the minus operator.

void set_difference_example()
{
   std::vector<int> v1{ 1, 2, 5, 5, 5, 9 };
   std::vector<int> v2{ 2, 5, 7 };
   std::vector<int> diff = s(v1) - s(v2);
   for (auto i : v1) std::cout << i << ' ';
   std::cout << "minus ";
   for (auto i : v2) std::cout << i << ' ';
   std::cout << "is: ";
   for (auto i : diff) std::cout << i << ' ';
   std::cout << '\n'; }

The code for the minus operator is shown below.

template<typename T> T operator-(wrapper<T>& left,
   wrapper<T>& right)
{
   T& c1 = left.cont; T& c2 = right.cont;
   if (!std::is_sorted(c1.begin(), c1.end())) std::sort(c1.begin(),
      c1.end());
   if (!std::is_sorted(c2.begin(), c2.end())) std::sort(c2.begin(),
      c2.end());
   T diff; std::set_difference(c1.begin(), c1.end(), c2.begin(),
      c2.end(), std::back_inserter(diff));
   return std::move(diff);
}

Set Symmetric Difference

C++ Reference: Set symmetric difference
Figure 5: C++ Reference: Set symmetric difference

Commutative

set_symmetric_difference computes the elements in either set, but not both.

void set_symmetric_difference_example()
{
   std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
   std::vector<int> v2{ 5, 7, 9,10 };
   std::sort(v1.begin(), v1.end());
   std::sort(v2.begin(), v2.end());
   std::vector<int> v_symDifference;
   std::set_symmetric_difference( v1.begin(), v1.end(),
      v2.begin(), v2.end(), std::back_inserter(v_symDifference));
   for (int n : v_symDifference) std::cout << n << ' ';
}

Output

1 2 3 4 6 8 9 10

set_symmetric_difference is represented by a logical exclusive or operator.

void set_symmetric_difference_example()
{
   std::vector<int> v1{ 1,2,3,4,5,6,7,8 };
   std::vector<int> v2{ 5, 7, 9,10 };
   std::vector<int> v_symDifference = s(v1) ^ s(v2);
   for (int n : v_symDifference) std::cout << n << ' ';
}

The code for logical exclusive or operator is shown next.

template<typename T> T operator^(wrapper<T>& left,
   wrapper<T>& right)
{
   T& c1 = left.cont; T& c2 = right.cont;
   if (!std::is_sorted(c1.begin(), c1.end())) std::sort(c1.begin(),
      c1.end());
   if (!std::is_sorted(c2.begin(), c2.end())) std::sort(c2.begin(),
      c2.end());
   T v_symDifference;
   std::set_symmetric_difference(c1.begin(), c1.end(),
      c2.begin(), c2.end(), std::back_inserter(v_symDifference));
   return std::move(v_symDifference);
}

Superset and Subset

C++ Reference: std::includes
Figure 6: C++ Reference: std::includes

Non-Commutative

STL includes can be used to find out whether a set is a superset (returns a Boolean). To check if it is a subset, just switch the two sets.

void is_superset_example()
{
   std::vector<char> v1{ 'a', 'b', 'c', 'f', 'h', 'x' };
   std::vector<char> v2{ 'a', 'b', 'c' };
   std::vector<char> v3{ 'a', 'c' };
   std::vector<char> v4{ 'g' };
   std::vector<char> v5{ 'a', 'c', 'g' };
   std::sort(v1.begin(), v1.end());
   std::sort(v2.begin(), v2.end());
   std::sort(v3.begin(), v3.end());
   std::sort(v4.begin(), v4.end());
   std::sort(v5.begin(), v5.end());
   for (auto i : v1) std::cout << i << ' ';
   std::cout << "\nincludes:\n" <<
   std::boolalpha; for (auto i : v2)
      std::cout << i << ' ';
   std::cout << ": " <<
   std::includes(v1.begin(), v1.end(), v2.begin(),
      v2.end()) << '\n';
   for (auto i : v3) std::cout << i << ' ';
   std::cout << ": " << std::includes(v1.begin(),
      v1.end(), v3.begin(), v3.end()) << '\n';
   for (auto i : v4) std::cout << i << ' ';
   std::cout << ": " << std::includes(v1.begin(),
      v1.end(), v4.begin(), v4.end()) << '\n';
   for (auto i : v5) std::cout << i << ' ';
   std::cout << ": " << std::includes(v1.begin(),
      v1.end(), v5.begin(), v5.end()) << '\n';
   auto cmp_nocase = [](char a, char b)
   {
      return std::tolower(a) < std::tolower(b);
   };
   std::vector<char> v6{ 'A', 'B', 'C' };
   for (auto i : v6) std::cout << i << ' ';
   std::cout << ": (case-insensitive) " <<
      std::includes(v1.begin(), v1.end(), v6.begin(), v6.end(),
      cmp_nocase) << '\n';
}

Output

a b c f h x includes: a b c : true a c : true g : false a c g :
   false A B C : (case-insensitive) true

The >= operator example is below. The <= operator example is not shown in this article.

void is_superset_example()
{
   std::vector<char> v1{ 'a', 'b', 'c', 'f', 'h', 'x' };
   std::vector<char> v2{ 'a', 'b', 'c' };
   std::vector<char> v3{ 'a', 'c' };
   std::vector<char> v4{ 'g' };
   std::vector<char> v5{ 'a', 'c', 'g' };
   for (auto i : v1) std::cout << i << ' ';
   std::cout << "\nincludes:\n" << std::boolalpha;
   for (auto i : v2) std::cout << i << ' ';
   std::cout << ": " << (s(v1) >= s(v2)) << '\n';
   for (auto i : v3) std::cout << i << ' ';
   std::cout << ": " << (s(v1) >= s(v3)) << '\n';
   for (auto i : v4) std::cout << i << ' ';
   std::cout << ": " << (s(v1) >= s(v4)) << '\n';
   for (auto i : v5) std::cout << i << ' ';
   std::cout << ": " << (s(v1) >= s(v5)) << '\n';
   auto cmp_nocase = [](char a, char b)
   {
      return std::tolower(a) < std::tolower(b);
   };
   std::vector<char> v6{ 'A', 'B', 'C' };
   for (auto i : v6) std::cout << i << ' ';
   std::cout << ": (case-insensitive) " <<
      std::includes(v1.begin(), v1.end(), v6.begin(), v6.end(),
   cmp_nocase) << '\n';
}

The user cannot opt for use of a custom comparator in the >= and <= overloaded operators at the moment, as shown in the case-insensitive example. In this situation, includes have to be called directly.

// Returns true if left is superset of right?
template<typename T>
bool operator>=(wrapper<T>& left, wrapper<T>& right)
{
   T& c1 = left.cont; T& c2 = right.cont;
   if (!std::is_sorted(c1.begin(), c1.end())) std::sort(c1.begin(),
      c1.end());
   if (!std::is_sorted(c2.begin(), c2.end())) std::sort(c2.begin(),
      c2.end());
   return std::includes( c1.begin(), c1.end(), c2.begin(),
      c2.end());
}
// Returns true if left is subset of right?
template<typename T>
bool operator<=(wrapper<T>& left,
   wrapper<T>& right)
{
   T& c1 = left.cont; T& c2 = right.cont;
   if (!std::is_sorted(c1.begin(), c1.end())) std::sort(c1.begin(),
      c1.end());
   if (!std::is_sorted(c2.begin(), c2.end())) std::sort(c2.begin(),
      c2.end());
   return std::includes( c2.begin(), c2.end(), c1.begin(),
      c1.end());
}

"I Have No Use for All These!"

Before you are quick to exclaim that you have no use for these set algorithms, I woud like to show to you a typical selection example where you can use this. Imagine you are writing a subject enrollment Web site for college students. On the form, there are currently selected subjects which the student added, and the available subject drop-down which the student can pick. It makes sense to remove the subject from the available drop-down after addition because you do not want the student to accidentally add the same subject twice. One way to compute leftover subjects available for selection is to just subtract the selected set from the complete set of subjects with the minus operator introduced in this article.

Article source code is hosted at GitHub.



Comments

  • There are no comments yet. Be the first to comment!

  • You must have javascript enabled in order to post comments.

Leave a Comment
  • Your email address will not be published. All fields are required.

Most Popular Programming Stories

More for Developers

RSS Feeds

Thanks for your registration, follow us on our social networks to keep up-to-date