Changes to the C++ Compiler in Microsoft Visual Studio 2010

In a previous article I wrote about some of the changes to VC++ in Visual Studio 2010: the new build system, multi-targeting, and the new IntelliSense.
There is more in store for VC++ in this new version, and this article focuses on the changes to the C++ compiler. I already wrote an article dedicated to lambda expressions, but the purpose of this one is to cover all of these changes.

C++0x will be the next standard for C++. It was code-named 0x because it was supposed to be finished by 2009 at the latest. We are now in 2010 and it is certain that the standard won’t be approved this year either. Boris Jabes joked that they will probably use hex values, and he is probably right.
However, a set of features has already been approved, and some of them were included in the C++ compiler in Visual Studio 2010. These include lambda expressions, auto, decltype, rvalue references and static_assert. I will write about all of them, without going very deep.
For additional reading, I suggest the MSDN library or the VC++ Team blog.

static_assert

static_assert checks whether an expression holds true at compile time. If the expression is false, a custom error message is displayed and the compilation fails. If the expression is true, the declaration has no effect.
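
As a minimal sketch of both outcomes, the following snippet places a static_assert at namespace scope and inside a class template. FixedBuffer and buffer_size are hypothetical names invented for this illustration.

```cpp
// Holds on every conforming compiler, so this declaration has no effect.
static_assert(sizeof(long) >= sizeof(int), "long must be at least as wide as int");

// Hypothetical fixed-size buffer; the assertion rejects non-positive sizes.
template <int N>
struct FixedBuffer
{
   static_assert(N > 0, "FixedBuffer size must be positive");
   char data[N];
};

// FixedBuffer<16> compiles; instantiating FixedBuffer<0> would stop the
// build and display the message above.
int buffer_size()
{
   return static_cast<int>(sizeof(FixedBuffer<16>));
}
```

The assertion inside the template is only evaluated when the template is instantiated, so the error appears at the point where an invalid size is actually used.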

This feature is especially powerful together with the type traits classes introduced in the tr1 namespace in Visual Studio 2008.

In the following example I create a comparison template function, that is used later to compare values.

template <typename T>
bool CompareNumbers(T v1, T v2)
{
   return v1 > v2;
}

int main()
{
   bool ret1 = CompareNumbers(1, 20);
   bool ret2 = CompareNumbers("b", "a");

   return 0;
}

The problem is that I don’t want this function to be used for anything other than integral types (maybe because it doesn’t make sense for other types, even if they have an operator > defined).
But the code compiles and runs, whereas I would actually like a compilation error for the call to CompareNumbers where I passed strings.

Adding a static_assert check will generate a compilation error for the second call to the function.

#include <type_traits>

template <typename T>
bool CompareNumbers(T v1, T v2)
{
   static_assert(std::tr1::is_integral<T>::value, "Type is not numeric");
   return v1 > v2;
}

The compiler reports the following error for the second call:

1>d:mariusvc++cpp0xcpp0x.cpp(62): error C2338: Type is not numeric
1>          d:mariusvc++trainningscpp0xcpp0x.cpp(75) : see reference to function template instantiation 'bool CompareNumbers<const char*>(T,T)' being compiled
1>          with
1>          [
1>              T=const char *
1>          ]

auto

The auto keyword existed before, but in the previous version of the standard it was a storage class specifier that was almost never needed in practice. The new version has given it a new purpose: the keyword is used to deduce the type of a declared variable from its initialization expression.
The initialization expression can be an assignment, a direct initialization or an operator new expression. However, the auto keyword is just a placeholder, not a type, and cannot be used with sizeof or typeid. Here are some examples:

auto i = 13;        // i is int
auto s = "marius";  // s is const char* (a string literal does not deduce to std::string)
auto p = new foo(); // p is foo*
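
The deduced types can be checked at compile time. The sketch below combines decltype with std::is_same for that purpose; the empty foo struct, the variable names and the function name are assumptions made only for this illustration.

```cpp
#include <string>
#include <type_traits>

struct foo {};   // stand-in for the foo type used in the examples above

int auto_deduction_examples()
{
   auto i   = 13;                     // int
   auto s   = "marius";               // const char*, not std::string
   auto str = std::string("marius");  // std::string, because the initializer is one
   auto p   = new foo();              // foo*

   const int limit = 10;
   auto copy       = limit;           // int: the top-level const is dropped
   const auto& ref = limit;           // const int&: qualifiers can be added explicitly

   static_assert(std::is_same<decltype(i), int>::value, "i should be int");
   static_assert(std::is_same<decltype(s), const char*>::value, "s should be const char*");
   static_assert(std::is_same<decltype(str), std::string>::value, "str should be std::string");
   static_assert(std::is_same<decltype(p), foo*>::value, "p should be foo*");
   static_assert(std::is_same<decltype(copy), int>::value, "copy should be int");
   static_assert(std::is_same<decltype(ref), const int&>::value, "ref should be const int&");

   delete p;
   return i + ref;   // 13 + 10
}
```

Note that auto follows the same rules as template argument deduction, which is why the const on limit disappears unless it is written out.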

Auto is pretty useful, for instance, for avoiding writing long types in for statements.

vector<int> numbers;
generate_n(back_inserter(numbers), 10, rand);

for(vector<int>::const_iterator it = numbers.begin(); it != numbers.end(); ++it)
{
   cout << *it << endl;
}

The for loop can now be written simply as this:

for(auto it = numbers.begin(); it != numbers.end(); ++it)
{
   cout << *it << endl;
}

lambda expressions

As I mentioned in the beginning of the article I already wrote an article dedicated solely to the lambda expressions in C++.
I will not reiterate all the points from that article; instead I will try a short overview on lambdas.

A lambda function is a function object whose type is implementation dependent; its type name is only available to the compiler. The lambda expression is composed of several parts:

  • lambda-introducer: this is the part that tells the compiler a lambda function is following. Inside the square brackets a capture-list can be provided; this is used for capturing variables from the scope in which the lambda is created.
  • lambda-parameter-declaration: used for specifying the parameters of the lambda function.
  • lambda-return-type-clause: used for indicating the type returned by the lambda function. This is optional, because most of the time the compiler can infer the type. There are cases when this is not possible and then the type must be specified. For a lambda such as [](int n) -> bool { return n % 2 == 0; }, the return type clause (-> bool) is not necessary.
  • compound-statement: this is the body of the lambda.

Let’s see a lambda:

vector<int> numbers;
generate_n(back_inserter(numbers), 10, rand);

for_each(numbers.begin(), numbers.end(), [](int n) {cout << n << endl;});

Here [] is the lambda introducer, (int n) is the lambda parameter declaration, and {cout << n << endl;} is the lambda compound statement. There is no return type clause, because the return type is inferred automatically by the compiler.
There are cases when the compiler cannot deduce the return type and then it must be specified explicitly. A lambda expression is a syntactic shortcut for a functor. The code above is equivalent to:

class functor_lambda
{
public:
   void operator()(int n) const
   {
      cout << n << endl;
   }
};

vector<int> numbers;
generate_n(back_inserter(numbers), 10, rand);

for_each(numbers.begin(), numbers.end(), functor_lambda());

Lambdas can capture variables from their scope by value, by reference, or both in any combination. In the example above, no variable was captured; this is a stateless lambda. On the other hand, a lambda that captures variables is said to have state.
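
A short sketch of a lambda with state follows. It captures one variable by value and another by reference; sum_above and its parameter names are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <vector>

// Sums the elements that are greater than threshold.
int sum_above(const std::vector<int>& values, int threshold)
{
   int sum = 0;

   // threshold is captured by value (a copy is stored in the lambda),
   // sum is captured by reference (the lambda modifies the local variable).
   std::for_each(values.begin(), values.end(),
      [threshold, &sum](int n) { if (n > threshold) sum += n; });

   return sum;
}
```

Capturing sum by reference is what lets the result escape the lambda; a by-value capture would only change the lambda's private copy, and would not even compile here without the mutable keyword.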

Lambdas are very useful when used together with algorithms, and in many cases they make writing function objects by hand unnecessary.
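
Here is a sketch of a lambda passed to an algorithm where the return type has to be written explicitly, because the two return statements yield different types. The function name halves is an assumption made for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Returns each input value divided by two, as a double.
std::vector<double> halves(const std::vector<int>& values)
{
   std::vector<double> result(values.size());

   std::transform(values.begin(), values.end(), result.begin(),
      [](int n) -> double {
         if (n % 2 == 0)
            return n / 2;     // int expression, converted to double
         return n / 2.0;      // double expression
      });

   return result;
}
```

Without the -> double clause the compiler has no single type to deduce from the body, so the explicit return type is required here.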

rvalue references

Writing about rvalue references doesn’t make too much sense after Stephan T. Lavavej already explained everything on the VC++ Team’s blog, in a post that I consider the ultimate guide to lvalues and rvalues, rvalue references and move semantics.
The only thing I will attempt is to give a very short summary of the topic. You must read Stephan’s post to get a grip on it.

In C++ there are two types of expressions: lvalues and rvalues. lvalues name objects that persist beyond the expression in which they are used; rvalues name temporaries that are destroyed at the end of the expression in which they were created.
lvalues are observable from different parts of a program; rvalues are temporaries and are not visible outside the expression in which they were defined.
An operator like operator+ for string takes two const references to string (or one const reference to string and one const char*) and returns a temporary string. When we chain several such operations, like string("hello") + " world" + "!", several temporaries are created and discarded immediately.
operator+ cannot modify its parameters, because they may be lvalues that are used elsewhere. The temporary object it returns, however, is an rvalue that is not used anywhere else, so it could actually be modified without side effects.

This is where rvalue references enter the scene. They are introduced with && and bind to rvalues; in a function template, a parameter of type T&& can bind to both lvalues and rvalues through template argument deduction. They enable the implementation of move semantics and perfect forwarding.
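
To make the distinction concrete, here is a small sketch in which overload resolution picks a different function for lvalues and rvalues. The names describe, make_greeting and classify are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <utility>

// Chosen for lvalues: named objects that outlive the call.
const char* describe(const std::string&) { return "lvalue"; }

// Chosen for rvalues: temporaries, or objects explicitly cast with std::move.
const char* describe(std::string&&) { return "rvalue"; }

std::string make_greeting() { return std::string("hello"); }

// Records which overload each kind of argument selects.
std::string classify()
{
   std::string name("marius");
   std::string result;

   result += describe(name);              // named variable
   result += ',';
   result += describe(make_greeting());   // temporary returned by value
   result += ',';
   result += describe(std::move(name));   // lvalue cast to an rvalue
   return result;
}
```

Note that std::move does not move anything by itself; it only casts its argument so that the rvalue overload becomes eligible.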

Move semantics enable transferring resources from a temporary object to another object. This is possible because temporary objects (i.e. rvalues) are not referred to anywhere outside the expression in which they live. To implement move semantics you have to provide a move constructor and optionally a move assignment operator.
The Standard Template Library was changed to take advantage of this feature. For instance, the operator+ for string was modified to use rvalue references, and can now append to a temporary operand instead of building a new string. The result is fewer memory allocations, deallocations and temporary objects, which ultimately increases speed.
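
A minimal sketch of a class with a move constructor and a move assignment operator is shown below. Buffer and its members are hypothetical names; copy assignment is left out for brevity, and a modern compiler would also let you mark the move operations noexcept.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical owner of a heap-allocated character array.
class Buffer
{
public:
   explicit Buffer(std::size_t size) : size_(size), data_(new char[size]()) {}
   ~Buffer() { delete [] data_; }

   // Copy constructor: allocates new storage and copies the contents.
   Buffer(const Buffer& other) : size_(other.size_), data_(new char[other.size_])
   {
      std::memcpy(data_, other.data_, size_);
   }

   // Move constructor: takes over the storage and leaves the source empty.
   Buffer(Buffer&& other) : size_(other.size_), data_(other.data_)
   {
      other.size_ = 0;
      other.data_ = nullptr;
   }

   // Move assignment: releases the current storage, then takes over the source's.
   Buffer& operator=(Buffer&& other)
   {
      if (this != &other)
      {
         delete [] data_;
         size_ = other.size_;
         data_ = other.data_;
         other.size_ = 0;
         other.data_ = nullptr;
      }
      return *this;
   }

   std::size_t size() const { return size_; }
   const char* data() const { return data_; }

private:
   std::size_t size_;
   char*       data_;
};
```

The important detail is that the moved-from object is left in a state where its destructor is still safe to run: deleting a null pointer is a no-op.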

Another classic example of move semantics involves operations on sequences like vector or list. A vector allocates memory for a given number of objects. You can add elements to it and no reallocation is done until the full capacity is reached. But when that happens, the vector has to reallocate memory. In this case it allocates a new, larger chunk, copies all the existing content, and then releases the previous memory.
When an insertion operation needs to copy one element, several things happen: a new element is created, its copy constructor is called, and then the old element is destroyed. With move semantics, the allocation of a new element and its copy is no longer necessary; the existing element can be moved directly.
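
The difference can be observed by counting constructor calls. In this sketch, Tracked and fill are hypothetical names; the vector reserves its capacity up front so that the counters reflect only the two insertions and not a reallocation.

```cpp
#include <utility>
#include <vector>

// Hypothetical element type that counts how it is constructed.
struct Tracked
{
   static int copies;
   static int moves;

   Tracked() {}
   Tracked(const Tracked&) { ++copies; }
   Tracked(Tracked&&)      { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves  = 0;

void fill(std::vector<Tracked>& v)
{
   v.reserve(2);               // no reallocation during the two insertions below

   Tracked t;
   v.push_back(t);             // t is an lvalue: the element is copy-constructed
   v.push_back(std::move(t));  // cast to an rvalue: the element is move-constructed
}
```

After a call to fill on an empty vector, each counter has been incremented exactly once.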

A second scenario where rvalue references are helpful is perfect forwarding. The forwarding problem occurs when a generic function takes references as parameters and then needs to forward these parameters to another function. If a generic function takes a parameter of type const T& and needs to call a function that takes T&, it can’t do that. So you need an overloaded generic function.
What rvalue references enable is having one single generic function that takes arbitrary arguments and then forwards them to another function.

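#include <utility>   // std::forward

struct foo
{
   foo(int&) {}          // requires a modifiable lvalue
};

struct bar
{
   bar(const int&) {}    // accepts lvalues and rvalues
};

// A&& binds to both lvalues and rvalues; std::forward passes the
// argument on with its original value category.
template <typename T, typename A>
T* make(A&& arg)
{
   return new T(std::forward<A>(arg));
}

int main()
{
   int x = 42;
   auto f = make<foo>(x);    // A is int&, calls foo(int&)
   auto b = make<bar>(42);   // A is int, calls bar(const int&)

   delete f;
   delete b;

   return 0;
}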
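
The role of std::forward becomes visible when the target function is overloaded on the value category. In the following sketch, sink, relay_plain and relay_forward are hypothetical names introduced only for this comparison.

```cpp
#include <string>
#include <utility>

const char* sink(const std::string&) { return "lvalue overload"; }
const char* sink(std::string&&)      { return "rvalue overload"; }

// Inside the function, arg has a name and is therefore an lvalue,
// even when the caller passed a temporary.
template <typename T>
const char* relay_plain(T&& arg)
{
   return sink(arg);
}

// std::forward<T> restores the value category the caller used.
template <typename T>
const char* relay_forward(T&& arg)
{
   return sink(std::forward<T>(arg));
}
```

With a temporary argument, relay_plain ends up in the lvalue overload while relay_forward reaches the rvalue overload; with a named variable, both call the lvalue overload.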

decltype operator

This is used to yield the type of an expression. Its primary purpose is for generic programming, in conjunction with auto, for return types of generic functions where the type depends on the arguments of the function. Here are several examples:

int i = 42;          // decltype(i) yields int
const int&& f();     // decltype(f()) yields const int&&
struct foo {int i;}; // decltype(a.i) yields int (a being an object of type foo)
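
These results can be verified with static_assert and std::is_same. The sketch below assumes a standard-conforming compiler; point, make_value and decltype_examples are hypothetical names. It also shows that an extra pair of parentheses changes the result, because the operand is then treated as an lvalue expression rather than as a name.

```cpp
#include <type_traits>

struct point { int x; };    // hypothetical type with one member
const int&& make_value();   // declared only: decltype never evaluates its operand

int decltype_examples()
{
   int i = 42;
   point p = { 7 };

   static_assert(std::is_same<decltype(i), int>::value, "name of a variable");
   static_assert(std::is_same<decltype((i)), int&>::value, "parenthesized lvalue");
   static_assert(std::is_same<decltype(make_value()), const int&&>::value, "function call");
   static_assert(std::is_same<decltype(p.x), int>::value, "member access");
   static_assert(std::is_same<decltype((p.x)), int&>::value, "parenthesized member access");

   decltype(i) sum = i + p.x;   // sum is declared as int
   return sum;
}
```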

It can be used together with auto to declare a late-specified return type, using the alternative function declaration syntax, which is the following (terms in square brackets indicate optional parts):

auto function_name([parameters]) [const] [volatile] -> decltype(expression) [throw] {function_body};

In general, the expression used with decltype here should match the expression used in the return statement. Here is an example:

struct Feet
{
   double value;
   explicit Feet(double val):value(val){}
};

struct Meters
{
   double value;
   explicit Meters(double val):value(val){}
};

ostream& operator<<(ostream& os, const Feet& f)
{
   os << f.value << "ft";
   return os;
}

ostream& operator<<(ostream& os, const Meters& m)
{
   os << m.value << "m";
   return os;
}

Feet operator+(const Feet& f1, const Feet& f2)
{
   return Feet(f1.value + f2.value);
}

Meters operator+(const Meters& m1, const Meters& m2)
{
   return Meters(m1.value + m2.value);
}

Feet operator+(const Feet& f, const Meters& m)
{
   return Feet(f.value + m.value*3.2808);
}

Meters operator+(const Meters& m, const Feet& f)
{
   return Meters(m.value + f.value*0.30480);
}

template <typename T1, typename T2>
auto Plus(T1&& v1, T2&& v2) -> decltype(forward<T1>(v1) + forward<T2>(v2))
{
   return forward<T1>(v1) + forward<T2>(v2);
}

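int main()
{
   // Values chosen to be consistent with the output shown below.
   Feet f1(10.0), f2(20.0);
   Meters m1(5.0), m2(1.0);

   cout << Plus(f1, f2) << endl;
   cout << Plus(m1, m2) << endl;
   cout << Plus(f1, m1) << endl;
   cout << Plus(m2, f2) << endl;

   return 0;
}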

The result of the execution is:

30ft
6m
26.404ft
7.096m

In this example, the function Plus takes two arguments that can be of possibly different types. If we pass two arguments of the same type, then the return type is obviously that same type.
However, if we pass arguments of two different types, then the result type depends on the arguments. In our case, when the first argument is Feet and the second is Meters, then the result type must be Feet.
But if the first argument is Meters and the second is Feet, then the result type must be Meters. This is where auto and decltype are of help. Otherwise, the Plus function and its calls would have to look like this:

template <typename T, typename T1, typename T2>
T Plus(T1&& v1, T2&& v2)
{
   return forward<T1>(v1) + forward<T2>(v2);
}

int main()
{
   // Sample values: 10 ft, 20 ft, 5 m and 1 m.
   Feet f1(10.0), f2(20.0);
   Meters m1(5.0), m2(1.0);

   cout << Plus<Feet>(f1, f2) << endl;
   cout << Plus<Meters>(m1, m2) << endl;
   cout << Plus<Feet>(f1, m1) << endl;
   cout << Plus<Meters>(m2, f2) << endl;

   return 0;
}

Obviously, anyone would prefer the former, because you don’t have to explicitly specify the type of the return value. With auto and decltype it can be inferred by the compiler.
