C++/CLI Primer

Introduction

This article is not an extensive guide to mastering the C++/CLI programming language; rather, it is a quick start that offers an unmanaged C++ programmer an easier way to enter the world of managed programming while still sticking to C++. I hope it also proves useful to a C#, VB.NET, or other purely managed programmer who wants to program in C++/CLI, where the two programming worlds merge to offer the most powerful environment for programming.

Comparisons of C++/CLI with C#, VB.NET, or other .NET languages are rare in this article; where they are made, the aim is not to win arguments but to show the differences and to understand and appreciate the gotchas and subtleties. There are absolutely no references in this article to the (obsolete) Managed C++. So, jump in!

Words Of Agreement

The word "unmanaged" in the broader sense encompasses any and all technologies (Win32, COM, ...) and programming languages (C++, VB, Pascal, ...) that predate .NET. The word "managed" refers to the .NET technology itself and to those programming languages that support programming on the .NET platform. The words "object" and "instance" are used interchangeably for a managed object.

.NET is the programming technology, platform, and standard. The CLR (Common Language Runtime) is the implementation of .NET: it is the runtime engine for which programming languages generate IL (intermediate language code), and it acts as the virtual processor that executes the IL generated by the various languages available on the .NET platform. C++/CLI is one of those languages (the superior one), and this article in its entirety is an attempt to start learning it.

In this article, C++ means ANSI/ISO C++ (originally conceived by Bjarne Stroustrup). It is for programming in the unmanaged world and cannot be used to program on the .NET platform. C++/CLI is not the same language, as the article discusses in more detail later; it must be considered an entirely different language that includes the features and facilities of ANSI/ISO C++ as a subset. Likewise, unmanaged programming here means programming in C++, although the same points apply to any other unmanaged programming language.

Unmanaged Programming Brief

You have to reap what you sow. In C++ (the unmanaged world), if you allocate memory with new/malloc, it is your responsibility to deallocate it with delete/free. Forgetting to deallocate the memory after you are done with it results in memory leaks. The compiler is tightly bound to the underlying operating system and hardware, and programs use the APIs exposed by the underlying OS.
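
The allocate/deallocate pairing can be made visible with a small native C++ sketch. Buffer, its liveCount counter, and AllocateAndRelease are hypothetical names introduced only for this illustration; the counter simply tracks how many instances are alive.

```cpp
#include <cassert>

// Hypothetical type for illustration: counts how many instances are alive.
struct Buffer
{
    static int liveCount;
    Buffer()  { ++liveCount; }
    ~Buffer() { --liveCount; }
};
int Buffer::liveCount = 0;

// Allocates one Buffer on the native heap and releases it again.
// Returns the change in the number of live instances (0 means no leak).
int AllocateAndRelease()
{
    int before = Buffer::liveCount;
    Buffer* buf = new Buffer();                // you sow...
    assert(Buffer::liveCount == before + 1);   // the object is alive
    delete buf;                                // ...you reap; omit this and it leaks
    return Buffer::liveCount - before;
}
```

If the delete line were removed, the function would return 1 instead of 0, and nothing in native C++ would ever reclaim that object.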

Managed Programming Brief

Programming in the managed world involves the programming language used, the libraries (called the Base Class Library), and the CLR itself. The BCL is the gateway to the platform on which the program will be executed. It provides all the APIs for programming, organized under namespaces that correspond to the service intended: file system, memory, network, user interface, processes and threads, and so forth. One of the several facilities of managed programming is automatic memory management: you allocate as you wish, and deallocation is taken care of automatically by the CLR through a process called Garbage Collection.

Types in the managed world are entities that hold information and on which operations are carried out by calling methods. Each type is unique. To use a type, you create instances of it and work with them. Types (and their associated operations) are packaged and deployed as assemblies. An assembly is the unit of deployment and the building block of a CLR-based application. An assembly is versioned, and the version is part of its identity. An assembly plays the role that the dynamic link library plays in the unmanaged world; physically, assemblies are themselves dynamic link libraries or executables. Types packaged in an assembly are accessible from outside according to the accessibility marked on the type. For instance, a class marked public is accessible from outside, and so are its methods that are marked public.

What Is C++/CLI?

I know that might sound like a boring start, but C++/CLI needs a formal introduction. ANSI/ISO C++ is a language for native programming on Windows. .NET is a new platform that, unlike earlier ones, is not tied to the hardware. It has its own execution engine, a virtual processor, which is the CLR. Whereas C++ generates an executable for the target platform, the managed programming languages generate IL (Intermediate Language) code for the CLR. Programming languages must be compliant with the CLI (Common Language Infrastructure) and the CTS (Common Type System) to be used for programming in the managed world.

ANSI/ISO C++ cannot be used to program on the .NET platform because it is not compliant with the CLI/CTS. Hence C++/CLI: a new language (as C++ was to C) invented to program on the .NET platform. Although the syntax, grammar, and some of the rules are the same as in C++, it must not be considered just an extension of C++. C++ does happen to be a subset of C++/CLI, but that was not the ultimate intent of the invention.

C++/CLI is a secular programming language; this means it can be used for managed or unmanaged or mixed programming. Hence, legacy code that cannot be ported to the .NET platform (using C# or any other .NET language of choice) in a short time span can be ported easily with C++/CLI. Also, any new code in such legacy C++ projects can be written as pure managed code. It also bridges the gap for the pure managed languages that otherwise are handicapped in using unmanaged code without C++/CLI. So, your C# project can now use your complex algorithms or bunch of hi-fi utilities written in ANSI C++ just with a C++/CLI wrapper over them.

Types and Object Creation

There are three kinds of data types in C++/CLI: reference, value, and native. Native types are those that already exist in C++: int, float, class, struct, and so on. An instance of a native type is allocated on the stack when declared as a local variable. When created dynamically (using the new keyword), it is allocated on the heap, and it is the responsibility of the programmer to delete the allocated instance. As a C++ programmer, you are well aware of the consequences if you fail to delete. So scary... Memory leaks!!!

Value Types and Reference Types are a part of the managed world. They behave as the CLI dictates, the prime doctrine being to have a common base type—(System::Object). Following are the methods exposed by System::Object:

Method Name       Return Type        Accessibility
Equals            bool               public
GetType           System::Type^      public
ToString          System::String^    public
GetHashCode       int                public
Finalize          void               protected
MemberwiseClone   System::Object^    protected
ReferenceEquals   bool               public static

A quick look shows that these are the operations needed to treat any type abstractly, which is why every managed type derives from System::Object.

Value types are derived from System::ValueType, which is in turn derived from System::Object. Value types are normally allocated on the stack, although there are times when they are transported to the heap; I will deal with that later. But the very nature of a value type is to be allocated on the stack. All primitive types and user-defined value structs are value types. They bear certain similarities to the primitive native types, but they are not the same.

Primitive Types Mapping

(List not extensive)

Data Type             Type Name          Keyword
Integer               System::Int32      int
Double                System::Double     double
Character [2 bytes]   System::Char       wchar_t
Character [1 byte]    System::Byte       unsigned char
Boolean               System::Boolean    bool

Following is the way primitive value types are declared and used:

void InSomeMethod()
{
   // You may also use the int keyword instead.
   System::Int32 oddNumber = 1;
   CallAnotherMethod(oddNumber);

   // You may also use the wchar_t keyword instead.
   System::Char character = L'A';
   CallMethod3(character);
}

Following is the way user defined value types are declared and used:

value struct DateTimeInfo
{
   private: System::Int32 Year;
   private: System::Byte Month;
   private: System::Byte Date;

   // NOTE: Cannot declare default ctor in structs.
   public: DateTimeInfo(int year, unsigned char mon, unsigned char date)
   {
      // Store the supplied values in the fields.
      Year = year;
      Month = mon;
      Date = date;
   }

   public: int GetYear() { return this->Year; }
   public: int GetMonth() { return static_cast<int>(this->Month); }
   public: int GetDate() { return static_cast<int>(this->Date); }
};    // Well, you know how to use it !!!

Classes are the reference types. Instances of reference types are never allocated on the stack. They are always allocated on the heap, and this is not the same heap where your native types are allocated. It is a different area called the managed heap, and your native types and code have no idea about, or direct reach into, this area. So how do you allocate on the managed heap? By using the new keyword? If so, how would new know where to allocate? The answer is a newer keyword called gcnew: new allocates on the native heap, and gcnew allocates on the managed heap. It is too early for full examples and code snippets, but consider this for now:

reference_data_type^ objRef =
   gcnew reference_data_type(appropriate ctor arguments);

This is the conventional way of creating a managed object in C++/CLI. As said above, the instance is created on the managed heap. The accessor for that instance (objRef above) lives on the stack and is called the object reference or handle. The C++/CLI convention is to call them handles, but I am going to call them object references, which is the term widely used in the managed world. The term must not in any way be confused with the C++ reference; in the rest of the article, the word reference means the managed object reference unless explicitly distinguished.

The instance cannot be accessed without an object reference. In essence, object references are address holders, but they are not like native pointers. Object references are type aware, polymorphic, and exhibit the type's behavior. Unlike pointers, they are not just addresses: they cannot be cast to any type desired, and they cannot be moved by incrementing or decrementing the address. They are much more intelligent address holders than native pointers.

There can be more than one reference to an instance on the managed heap, because assigning one object reference to another is a shallow copy. This may be alarming news to native C++ programmers. With native pointers, two pointers to the same heap object mean that if the object is deleted through one of them, the other is left dangling: the typical C++ programmer's nightmare, often learnt the hard way.

Turn of the century... GC!!! Yes, there can be more than one object reference referring to the same object on the heap. The managed programming model does not expect the programmer to reclaim memory. The programmer is not required to write code such as delete objRef to deallocate and return the memory he allocated. Spare our poor programmers. The CLR is very smart and reclaims memory through a process called Garbage Collection. The Garbage Collector reclaims only those instances that are not reachable, that is, instances for which you have lost all object references (like objRef above). If an object reference goes out of scope or is assigned nullptr, the instance it was referring to can no longer be reached through that reference. So, for an instance's memory to be reclaimed by the GC, there must be no outstanding references to it.

This is the most compelling feature of .NET. Programmers are now free of the burden of writing code to delete the memory they allocate, a tough schooling they have gone through over many years of programming. But beware: too much freedom results in chaos. Even with the GC, memory has to be allocated wisely. Undisciplined allocation, on the grounds that you are not responsible for deallocation, will result in poor application performance. This is one of the fundamental differences between the native and managed worlds; all the other concepts and rules build on reference types, object references, and garbage collection. Besides that, there is a subtle thing to be aware of: the GC is responsible only for deallocating memory, not for releasing resources. For instance, if you have opened a connection to a database, the GC is not responsible for closing the connection; it is responsible only for reclaiming the memory allocated for the database object.
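
For a native C++ programmer, the closest everyday analogy to several references keeping one instance alive is std::shared_ptr. The sketch below is only an analogy, and that is an assumption of this example: shared_ptr uses reference counting and destroys the object deterministically, whereas the CLR's garbage collector traces reachability and reclaims memory at a time of its own choosing. Tracked and CountAliveAfterFirstRelease are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type for illustration: counts how many instances are alive.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Two owners share one instance; returns how many instances are alive
// after only the first owner has let go.
int CountAliveAfterFirstRelease()
{
    std::shared_ptr<Tracked> first = std::make_shared<Tracked>();
    std::shared_ptr<Tracked> second = first;   // shallow copy: same instance
    assert(first.get() == second.get());

    first.reset();           // comparable to assigning nullptr to one reference
    return Tracked::alive;   // 'second' still keeps the instance reachable
}
```

The instance goes away only when the last owner (second) is gone, which mirrors the rule that an instance is eligible for reclamation only when no outstanding references remain.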

With the basics learnt in the previous sections, it is time to see stuff that works. Following is a snippet of a C++/CLI class:

Listing 1

ref class Directory
{
   // Creates an instance with the current directory path

   public: Directory(); 

   // Creates an instance with the specified directory path
   public: Directory(System::String^ filePath);

   // Assume File is another managed class
   public: File^ GetFile(System::String^ fileName);

   public: cli::array<System::String^, 1>^ GetFiles();
   public: cli::array<System::String^, 1>^
      GetFiles(System::String^ filter);
   public: System::Void DeleteFile(System::String^ fileName);

   // Imagine a few other methods....

   // Object destructor or the Dispose method
   ~Directory();

   // Finalizer
   !Directory();
};

The above class is used in the following sections to explain keywords, usage, and related concepts.

Declaring and Consuming A Managed Class

Listing 1 shows the typical way of declaring a managed class in C++/CLI. The ref keyword preceding the class keyword marks it as a managed class, whose instances can be allocated only on the managed heap. Here is how to create an instance of the above class:

Directory^ sysDir = gcnew Directory();

The caret [^] symbol specifies that the variable sysDir is a managed handle (sorry, a reference to a managed object). sysDir is the object reference that you now can use to access the allocated object. You can call public methods and you can copy the reference to another reference variable.

Directory^ sysDir2 = sysDir;

Now, sysDir and sysDir2 both refer to the same instance. You are not required to delete the object explicitly, as you used to do in C++; the effect of calling delete on the instance (delete sysDir) is discussed in a later section. Memory reclamation is now the responsibility of the .NET runtime (GC), which is really a big relief for the programmer. Following is the way you invoke methods on the Directory instance:

File^ someFile = sysDir->GetFile("SomeFile.TXT");

Consider the following method:

System::Void UseSysDir(Directory^ dirObjRef)
{
    cli::array<System::String^, 1>^ files = dirObjRef->GetFiles();
    // Do something with files.
}

The object reference can now be passed to methods as a parameter and used the same way inside those methods. All of the references refer to the same instance on the managed heap. No copy construction is involved anywhere because no copy of the object is created; it is similar to passing pointers in C++. If you need a copy, your class can implement the System::ICloneable interface and its Clone() method. The actual depth of the copy depends on the specific object, and each inner object may or may not require a Clone method in turn. A C++ programmer who has learnt to implement copy constructors may find it very hard at first to accept passing around references to the same object, but in due course you will learn that programming with objects on the managed heap, with memory reclaimed by the garbage collector, is a different model altogether.

Consider the following code:

Directory^ CreateDirectory(System::String^ dirPath)
{
   // Some code to check...if you want
   // (just to make the method look big !!!)
   Directory^ dirObj = gcnew Directory(dirPath);
   return dirObj;
}

An instance of the Directory class is created and a reference to the allocated instance is returned. After the method returns, dirObj no longer exists and so no longer refers to the object on the heap. It is the responsibility of the calling method to grab the returned reference and preserve it so that the object is not spotted by the GC as orphaned or garbage. As long as at least one direct or indirect object reference refers to a particular object, the GC will not attempt to reclaim the memory consumed by that object.

Abstract Classes

In simple terms, decorating a managed class with the abstract keyword makes it abstract. This is a convenient way of making a class abstract without declaring any abstract (pure virtual) methods. Methods can also be decorated with the abstract keyword, in which case the containing class must be decorated the same way. Following are explanatory code snippets:

ref class AnAbstractClass abstract
{
   // ctor and other methods that have method bodies
};

or

// Cannot create instances of AnotherAbstractClass
ref class AnotherAbstractClass abstract
{

   // member declaration
   // ctor
   // methods with bodies

   public: virtual void SomeMethod(int x) abstract;

   // Making a method abstract requires the class to be decorated
   // with the abstract keyword.
};

ref class DerivedFromAnotherAbstractClass :
   public AnotherAbstractClass
{
   // This is the overriding implementation of SomeMethod.
   // The preceding virtual keyword and the override suffix keyword are
   // mandatory to denote that we intend to override SomeMethod.
   public: virtual void SomeMethod(int x) override
   {
      // impl
   }
};
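
For comparison, native C++ has no abstract keyword: a class becomes abstract only by declaring at least one pure virtual function. The following native sketch shows that counterpart; Shape, Triangle, and SidesThroughBase are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Native C++: a class is abstract only because it has a pure virtual function.
class Shape
{
public:
    virtual ~Shape() {}
    virtual int Sides() const = 0;   // pure virtual, the native way to say 'abstract'
};

class Triangle : public Shape
{
public:
    int Sides() const override { return 3; }
};

// The compiler itself confirms which class can be instantiated.
static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");
static_assert(!std::is_abstract<Triangle>::value, "Triangle can be instantiated");

int SidesThroughBase()
{
    Triangle t;
    const Shape& s = t;              // call through the abstract base
    assert(s.Sides() == t.Sides());  // same override either way
    return s.Sides();
}
```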

nullptr

A managed object reference is similar to a C++ pointer in one way: it either refers to an object or refers to nothing. When a C++ pointer is NULL, it does not point to any location in memory. Similarly, when an object reference does not point to any object, its value is nullptr. nullptr is a keyword in C++/CLI. It is not a type like int or float; it is an indication that the object reference does not refer to any object. Because it is not a type, no type operations can be done on nullptr: sizeof(nullptr), throw nullptr, and the like all result in compiler errors.

  • A nullptr can be assigned to an object reference as part of the declaration or later.

    Directory^ dirObjRef = nullptr;

  • A nullptr can be assigned explicitly even when the reference is referring to some other object.

    // dirPath: some directory path string
    Directory^ dirObjRef = gcnew Directory(dirPath);

    dirObjRef->DeleteFile("SomeFile.TXT");

    // Possible candidate for GC, if there are no other
    // references for the object
    dirObjRef = nullptr;

  • A nullptr can be compared with an object reference, but other operators (+, -, >, < etc) are not allowed.

    if (dirObjRef == nullptr) { throw some exception or as you wish.... }
    if (dirObjRef != nullptr) { .... }

  • A nullptr can be passed to methods as a parameter and can be returned from methods too.

    dirObjRef->GetFile(nullptr);

    and

    File^ GetFile(System::String^ fileName)
    {
       // ......
       // If file not found

       return nullptr;
    }

  • A nullptr can be assigned to a managed reference, interior pointer (discussed later), and a native pointer.
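
Since nullptr can also be assigned to a native pointer, the same habits (pass it, return it, compare against it) can be sketched in plain native code. Entry, FindEntry, and SizeOrMinusOne are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstring>

struct Entry { const char* name; int size; };

// Hypothetical lookup: returns a pointer to the matching entry,
// or nullptr when nothing matches.
const Entry* FindEntry(const Entry* entries, int count, const char* name)
{
    assert(count >= 0);
    if (entries == nullptr || name == nullptr)   // nullptr passed as an argument
        return nullptr;
    for (int i = 0; i < count; ++i)
        if (std::strcmp(entries[i].name, name) == 0)
            return &entries[i];
    return nullptr;                              // nullptr as a return value
}

// Comparison against nullptr is allowed; arithmetic on it is not.
int SizeOrMinusOne(const Entry* e)
{
    return (e != nullptr) ? e->size : -1;
}
```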

Boxing/Unboxing

As you saw earlier, value types are allocated on the stack. But there are times when they are present on the managed heap, for instance when a method takes a System::Object (the mother of all managed types) as a parameter for, say, printing the contents. In such cases, an object is allocated on the heap and the value of the value type is copied into it. This process is called boxing. Sample code that shows boxing follows:

int i =100;
   // Boxing
System::Object^ boxObj = safe_cast<System::Object^>(i);

or

void PrintContents(Object^ objRef)
{
   // blah, blah blah....

   Console::WriteLine("Contents: {0}", objRef->ToString());
}

void InSomeMethod()
{
   int i = 100;
   PrintContents(i);    // i is boxed
}

The opposite of boxing is called unboxing: retrieving the value from the instance on the heap and loading it into a variable on the stack.

void Unbox()
{
   // box
   System::Object^ integerObj = safe_cast<System::Object^>(100);
   // unbox
   int i = safe_cast<int>(integerObj);
}

Boxing and unboxing are applicable only for value types.

Object Destruction

This is a very fuzzy but interesting area. In C++, a destructor exists to do cleanup operations on the object before its memory is reclaimed, and the destruction of the object is deterministic. The destructor of an object allocated on the stack is called when the object goes out of scope. For an object allocated on the heap, it is called when delete is called. If you fail to call delete (after you are done with the object), the destructor is never called and the memory held by the object is not released: a memory leak. You know that entire story.
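
The deterministic order described here can be observed directly in native C++. In this minimal sketch, Noisy, g_log, and RunScopes are hypothetical names; each object appends a lowercase letter to the log when constructed and the matching uppercase letter when destroyed.

```cpp
#include <cassert>
#include <string>

std::string g_log;   // records the order of events (illustration only)

struct Noisy
{
    char id;
    explicit Noisy(char c) : id(c) { g_log += id; }            // lowercase on construction
    ~Noisy() { g_log += static_cast<char>(id - 'a' + 'A'); }   // uppercase on destruction
};

std::string RunScopes()
{
    g_log.clear();
    Noisy* heapObj = new Noisy('h');   // lives until delete is called
    {
        Noisy stackObj('s');           // destroyed at the closing brace
    }
    g_log += '|';
    assert(g_log == "hsS|");           // heap object has not been destroyed yet
    delete heapObj;                    // destructor runs here, not earlier
    return g_log;
}
```

The returned log is "hsS|H": the stack object is destroyed exactly at the end of its scope, and the heap object exactly at the delete.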

Strictly speaking, there are no destructors for managed objects because the destruction of such objects is not deterministic. The garbage collector reclaims the memory held by an object at an arbitrary point in time (and on an arbitrary thread). If that is the case, how do you do cleanup operations on the object, even though you do not care about the memory being reclaimed? There is a way.

When you are done using an object, there are two ways available for cleanup: explicit cleanup and finalization. If you know that you are done with the object and want to clean up explicitly, the .NET advice is this: implement the System::IDisposable interface on your class, call the Dispose method on the object to do the (explicit) cleanup, and leave the memory reclamation part to the GC.

The problem with the Dispose method is that it has to be called explicitly. If you fail to call it, the cleanup will not be done. There is another point in an object's lifetime when you have a last chance to do cleanup, even after you have given up all references to the object: when the object gets finalized. When the garbage collector finds an orphaned or garbage object that has a finalizer, it adds that object to a special queue (called the Finalization Queue), and another thread, the Finalizer Thread, calls the Finalize() method on each of the queued objects. This process is called finalization. What I have said about finalization is unimaginably succinct. There are many finer and more intricate details that require a separate article (or probably a book), but this much is sufficient for now. Finalize() is the last method call on an object in its lifetime; after that, the object vanishes. The Finalize method has a pet name: the finalizer. You can do your cleanup in your finalizer. Again, there is a caveat, caveat, caveat!!!

The Finalize method is called at an arbitrary point in time and on an arbitrary thread. There is no order in which the finalizers are called. If object A contains object B, it is not necessary that the finalizer for object B is called first. The order is not guaranteed. Then, what good is a finalizer? Theoretically, it is for releasing unmanaged resources that the object may contain. Unmanaged objects are not collected by GC. They exist until they are deleted. I say theoretically because you may be clever and disciplined to release them in your Dispose itself (hoping that you call Dispose, and such releasing is possible in your case).

So, what happens if you have called Dispose and the finalizer is then called as well? It might be disastrous (in your case) to clean up more than once. How do you avoid redundant cleanup? .NET programming suggests a design pattern for this situation: the Dispose pattern. The idea is to prevent the finalizer from being invoked if you have already called Dispose. The garbage collector is exposed via the System::GC class, and you can call GC::SuppressFinalize with the object reference (naturally 'this' in the current case) to suppress the finalizer. Following is the Dispose pattern implementation snippet:

ref class SomeClass : IDisposable
{
   // Ctors, and other methods
   public: void Dispose()
   {
      // Do clean up on the managed/unmanaged parts
      // of the object
      Dispose(true);
      // If Dispose is called, then suppress
      // from being finalized

      System::GC::SuppressFinalize(this);
   }

   public: void Finalize()
   {
      // If the object is getting finalized, then
      // pass false so that no managed object is
      // accessed; the order in which other managed
      // objects are finalized is not guaranteed.
      // Hence, only unmanaged cleanup, if any.
      Dispose(false);
   }

   protected: void Dispose(bool safe2FreeManaged)
   {
      // This method will be called with safe2FreeManaged =
      // true when Dispose is explicitly called,
      // in which case it is safe to access and clean up
      // managed resources.
      //
      // This method will be called with safe2FreeManaged =
      // false from the finalizer, when it may not be safe to
      // even access managed resources.

      if (safe2FreeManaged)
      {
         InternalDispose();
      }
      // Code to release unmanaged resources - COM objects etc
   }

   private: void InternalDispose()
   {
      // Actual clean up happens here !!!
   }
};

The theoretical implementation above uses C++/CLI syntax, but in practice it is done a bit differently, which is what I discuss next. Everything discussed so far is the general principle of .NET. It is the behavior exhibited by any managed object written in any language supported on the .NET platform, although each language can cloud and disguise the principle as fits the language. C++/CLI is clever and smart in this case. In C++/CLI, you are not required to explicitly derive from System::IDisposable and implement the Dispose method. Instead, the conventional destructor is the analogue of the InternalDispose method. When you implement a destructor using the conventional C++ destructor syntax (~ClassName), the compiler automatically derives the class from System::IDisposable, implements the Dispose pattern for you, and uses the destructor of the class as the cleanup code. If a class has no destructor, it is not derived from System::IDisposable and gets no Dispose pattern. C++ programmers do not have to feel the impact of the heavy discussion of the principle and pattern above; all of that was only to show what happens behind the scenes. In this respect a managed class looks the same as an unmanaged class, and the destructor is invoked when an object declared as a local variable goes out of scope or when delete is called on a reference to it. Following is the way C++/CLI takes care of implementing the Dispose pattern for you:

// This method is called by the compiler-implemented
// IDisposable::Dispose and Object::Finalize methods.

void Dispose(bool safe2FreeMgd)
{
   if(safe2FreeMgd)
   {
      try
      {
            //call the dtor code (~ClassName)
      }
      finally
      {
         // This call to the GC will suppress the class from
         // getting finalized
         GC::SuppressFinalize(this);
      }
   }
   else
   {
      // Call the finalizer code (!ClassName)
   }

   // Call BaseClass::Dispose(safe2FreeMgd);
}

Now take a look at this:

System::Void SomeMethod(/* parameters, if you need any */)
{
   Directory dirObj;
   dirObj.DeleteFile("System.TXT");
   // Call other methods you need.........
   // dirObj's dtor (or rather Dispose) is called when scope ends
}

That is another way of declaring and allocating an instance of the Directory class. It resembles conventional stack-allocated object creation in C++. The syntax is the same, but the object (referred to by dirObj) is actually allocated on the managed heap. When dirObj goes out of scope, the compiler inserts the call to the Dispose method, that is, the destructor of the class. Notice that there is no caret (^) in the declaration of dirObj, and that the members are accessed with the . (dot) operator instead of the -> operator. This gives the impression that the object is allocated on the stack. But remember, no reference type object is ever allocated on the stack. Cool, and ultimately cool. This is one of the features that provide backward compatibility for the syntax, and it shows that the language designers respect the habits of C++ programmers.

Oh...I forgot the finalizer. The finalizer is declared with a ! (instead of a ~) followed by the class name. The following code shows a finalizer:

ref class SomeClass
{
   // Don't have to say, but this is the finalizer for the class
   !SomeClass()
   {
      // Do some cleanup, only for unmanaged resources.
   }
};

And the finalizer is called only if the destructor is not called. Before I move on to the next topic, a piece of advice: do not rely on a finalizer unless it is going to save your life. Besides the heavy performance impact of a finalizer, it brings some very ill effects, which are beyond the scope of this article.
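
The rule that cleanup runs exactly once, whether it is triggered explicitly or as a last resort, can be sketched in plain native C++. This is an analogy under stated assumptions, not the .NET mechanism: Resource, Close, releaseCount, and CleanupsForOneObject are hypothetical names, and the native destructor merely plays the role of the last-chance cleanup.

```cpp
#include <cassert>

class Resource
{
public:
    static int releaseCount;   // how many times the real cleanup ran

    Resource() : released_(false) {}

    // Explicit cleanup, the counterpart of calling Dispose.
    void Close() { Release(); }

    // Last-chance cleanup; does nothing if Close was already called.
    ~Resource() { Release(); }

private:
    void Release()
    {
        if (released_) return;   // guard against redundant cleanup
        released_ = true;
        ++releaseCount;
    }
    bool released_;
};
int Resource::releaseCount = 0;

// Returns how many times cleanup ran for a single object.
int CleanupsForOneObject(bool closeExplicitly)
{
    int before = Resource::releaseCount;
    {
        Resource r;
        if (closeExplicitly) r.Close();
    }   // the destructor runs here either way
    int ran = Resource::releaseCount - before;
    assert(ran == 1);
    return ran;
}
```

The released_ flag does the job that suppressing finalization does in the managed pattern: once the explicit path has run, the fallback path becomes a no-op.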

Mixed Mode

Mixed mode programming is the absolute power of C++/CLI, and it is what makes C++/CLI the superior and mightiest of the programming languages. C++/CLI is to C++ as C++ is to C. You can do C programming in C++. In the same sense, you can do unmanaged C++ programming in C++/CLI without using any of the managed features, not even a managed class, though I will spend the rest of my life wondering why anyone would do that. You can also do pure managed programming without using any of the unmanaged practices. And you can do mixed mode programming, which means writing an application in which managed and unmanaged classes interact with each other. So, a managed object can contain or interact with an unmanaged object and vice versa.

Could you imagine the power of programming that you can unleash with C++/CLI? The simplest application of this power is porting your existing hi-fi image processing or math library written in C++ to the .NET platform. When you do not have enough budget or time to rewrite it in C# (or VB.NET, would you try that?), you can recompile your existing code with C++/CLI and write a (managed) wrapper so that it can be used by any .NET programming language. Writing a wrapper takes little effort compared with rewriting and retesting the library. Following is a managed class that interacts with an unmanaged object:

class UnmanagedClass;   // forward declaration; the class is defined below

ref class ManagedClass
{
   private: UnmanagedClass *unmgdPtr;

   public: ManagedClass(UnmanagedClass *unmgdClassPtr)
   {
      // similar to _ASSERTE(..); see nullptr usage.
      System::Diagnostics::Debug::Assert(unmgdClassPtr != nullptr);
      this->unmgdPtr = unmgdClassPtr;
   }

   public:    // Some methods that use the unmgdPtr
};

class UnmanagedClass
{
   public: UnmanagedClass()
   {
   }

   public: void SomeUnmgdMethod()
   {
   }

   // Imagine a ton of other public methods

};

Likewise, an unmanaged class can hold a managed reference as its member and invoke methods on it. But, unlike a managed class holding a pointer to an unmanaged object, it cannot hold the reference directly; instead, it is done the following way:

#include <vcclr.h>   // declares gcroot

// ManagedClass is some ref class
class UnmanagedClass
{
   private: gcroot<ManagedClass^> mgdRef;
   public: UnmanagedClass(ManagedClass^ mgdClassRef)
   {
      Debug::Assert(mgdClassRef != nullptr);
      this->mgdRef = mgdClassRef;
   }

   // Methods that use mgdRef and invoke methods:
   // mgdRef->SomeMethod();

};

gcroot is itself an unmanaged entity that has the ability to refer to managed entities. So, an instance of gcroot<ManagedClass^> can be a statically or dynamically allocated member inside the unmanaged class.

Equality and Identity

Two managed objects are said to be equal if their values are the same. System::Object's Equals method can be used to test equality. Equals is a virtual instance method and can be overridden in a derived class/struct because the equality of compound objects depends on the type. Two managed objects are said to be identical if their references point to the same object on the heap. System::Object's static ReferenceEquals method can be used to test identity.
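
For readers arriving from unmanaged C++, the same distinction can be sketched in plain ISO C++, where equality is a comparison of values and identity is a comparison of addresses. This is only an analogy, not C++/CLI: Point, AreEqual, and AreIdentical are hypothetical names made up for the illustration, standing in roughly for Equals and ReferenceEquals.

```cpp
#include <cassert>

// Hypothetical value-like type used only for this illustration.
struct Point
{
    int x;
    int y;
};

// Equality: two objects are equal when their values match.
bool AreEqual(const Point& a, const Point& b)
{
    return a.x == b.x && a.y == b.y;
}

// Identity: two references are identical when they name the same object.
bool AreIdentical(const Point& a, const Point& b)
{
    return &a == &b;
}

// Two separate objects with the same values are equal but not identical.
void CheckPoints()
{
    Point first{1, 2};
    Point second{1, 2};
    assert(AreEqual(first, second));
    assert(!AreIdentical(first, second));
    assert(AreIdentical(first, first));
}
```

Two objects can therefore be equal without being identical, while an object is always both equal and identical to itself.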

The crux of the CLI is the importance of a type of an object. Unlike unmanaged objects, managed objects know who they are right from the moment they spring to life either on the stack or heap. The type information of any type can be obtained by using the typeid operator (TypeName::typeid) and using the System::Object's GetType method for the instances. The importance of the type can be realized if you try the GetType in the constructor. You will be stunned to realize it returns the type of the instance being constructed. For instance, in the following case:

ref class SomeClass
{
   public: int X;
   public: int Y;
   public: SomeClass(int x, int y)
   {
      this->X = x;
      this->Y = y;

      Console::WriteLine("Type - {0}",
         this->GetType()->ToString());

      Method();
   }
   public: virtual void Method()
   {
      Console::WriteLine("SomeClass::Method");
   }
};

ref class SomeOtherClass : public SomeClass
{
   public: SomeOtherClass(int x, int y) : SomeClass(x, y)
   {
   }
   public: virtual void Method() override
   {
      Console::WriteLine("SomeOtherClass::Method");
   }
};

the Console::WriteLine call in the SomeClass constructor will output the type of the instance being created, not always SomeClass. That is, if an instance of SomeOtherClass is created, you will see SomeOtherClass in the output. Also, you will be thrilled to know that the virtual calls in the constructor are directed to the appropriate overrides, so Method() prints SomeOtherClass::Method in that case. This, of course, is not recommended usage and is not good discipline. It is pointed out only to illustrate the importance of a type.
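
The contrast with unmanaged objects can be checked in plain ISO C++. The sketch below mirrors SomeClass and SomeOtherClass with two hypothetical native classes (NativeBase and NativeDerived are names invented here). While the base constructor runs, a native object still has the base type, so typeid reports the base class and the virtual call stays in the base class.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical native classes mirroring SomeClass / SomeOtherClass.
class NativeBase
{
public:
    std::string calledFromCtor;   // which Method() the constructor reached
    bool wasBaseDuringCtor;       // dynamic type observed inside the constructor

    NativeBase()
    {
        // While NativeBase() runs, the object's dynamic type is NativeBase.
        wasBaseDuringCtor = (typeid(*this) == typeid(NativeBase));
        // The virtual call resolves to NativeBase::Method at this point.
        calledFromCtor = Method();
    }

    virtual ~NativeBase() {}

    virtual std::string Method() { return "NativeBase::Method"; }
};

class NativeDerived : public NativeBase
{
public:
    std::string Method() override { return "NativeDerived::Method"; }
};
```

Constructing a NativeDerived records NativeBase::Method from inside the constructor, whereas the managed SomeOtherClass above reaches its own override. Once construction has finished, the native object dispatches to NativeDerived::Method as usual.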

Declaring Properties

There is an easier and very elegant way (for the user) in C++/CLI for writing get/set methods. A Property is a getter and/or setter construct exposed on a class. The accessibility of the getter and setter of the property can be chosen as per the needs. For instance, it is possible to write a property that has a public getter but private or protected setter.

Say you have a Status class and it has a few parameters, some of which are writable, some only readable, and some both readable and writable. Following is the code snippet for the above assumption:

public ref class Status
{
   private: float pressureValue;
   private: float temperatureValue;
   private: DateTime recordDateTime;

   public: Status()
   {
      this->RecordTime = DateTime::Now;
   }

   // This property value is readable and writable
   public: property float Pressure
   {
      float get()
      {
         return this->pressureValue;
      }

      void set(float pval)
      {
         // Do checks on pval, if required
         this->pressureValue = pval;
      }
   }

   // This property value is readable, but writable only
   // by derived classes

   public: property float Temperature
   {
      float get()
      {
         return this->temperatureValue;
      }

      protected: void set(float tval)
      {
         this->temperatureValue = tval;
      }
   }

   // This property is read-only [writable within the class]
   public: property DateTime RecordTime
   {
      DateTime get()
      {
         return this->recordDateTime;
      }

      private: void set(DateTime dtval)
      {
         this->recordDateTime = dtval;
      }
   }
};

Users of the Status class write code as shown below:

ref class UserClass
{
   // Assume the Status class object (statusObject) is created in
   // the constructor or in one of the methods.
   private: Status^ statusObject;

   public: void LogPressure()
   {
      Console::WriteLine("Pressure: {0}", statusObject->Pressure);
   }
   public: void SetPressure(float pval)
   {
      statusObject->Pressure = pval;
   }
};

Properties are an elegant way to read and write data members of a class. When using properties, the client code reads as if it were accessing the data members of the class directly. Properties can be declared on a class, struct, or interface. So, they can be virtual: either get or set or both. Properties can be static too, and static applies to the property as a whole.

Besides, there is something called an Indexed property. It is essentially a property that provides an indexing operator for the class. The indexing can be multi-dimensional. For instance, consider a class named Manager that has an array of Reportee(s) as a member:

public ref class Reportee
{
   private: String^ reporteeName;
   public: Reportee(String^ name)
   {
      this->Name = name;
   }

   public: property String^ Name
   {
      String^ get()
      {
         return this->reporteeName;
      }

      private: void set(String^ name)
      {
         this->reporteeName = name;
      }
   }
};

public ref class Manager
{
   // Assume this to be populated in the ctor by reading from
   // the config

   private: cli::array<Reportee^>^ reporteeList;

   // Readable indexed property [writable only by derived classes].
   // A default indexed property must be named default. It can take
   // any type as a parameter that will be used as an index to fetch
   // the corresponding value from one of the data structures that
   // is a member of the class, like reporteeList in this class.

   public: property Reportee^ default[int]
   {
      Reportee^ get(int index)
      {
         if (index >= 0 && index < reporteeList->Length)
         {
            /*return this->reporteeList->GetValue(index);*/
            return this->reporteeList[index];
         }
         return nullptr;
      }

      protected: void set(int index, Reportee^ robj)
      {
         if (index >= 0 && index < reporteeList->Length
             && robj != nullptr)
         {
            this->reporteeList[index] = robj;
         }
      }
   }
   }

   // A readonly non-indexed property
   public: property int ReporteesCount
   {
      int get()
      {
         return this->reporteeList->Length;
      }
   }
   // Other methods
};

Users of the Manager class write code as shown below:

ref class SomeUserClass
{
   public: void LogReporteeInfo(Manager^ mgr)
   {
      for (int i = 0; i < mgr->ReporteesCount; ++i)
      {
         // Using the indexed property on Manager

         Console::WriteLine("Reportee {0}: {1}", i + 1,
                            mgr[i]->Name);
      }
   }
};

With the use of properties, method pairs like GetSomeValue() and SetSomeValue(value) are replaced by the short, sweet, and elegant obj->SomeValue and obj->SomeValue = value syntax. It is very much recommended that properties be used only for getting and setting the corresponding entity of the class, and that they avoid other unrelated operations.
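
To see what the property syntax replaces, here is a minimal ISO C++ sketch of the same access rules written as explicit Get/Set member functions. It is not C++/CLI. NativeStatus and CalibratedStatus are hypothetical names loosely modeled on the Status class above, with a public getter and setter for the pressure and a protected setter for the temperature.

```cpp
#include <cassert>

// Hypothetical native counterpart of the Status class: the same access
// rules, spelled as explicit Get/Set member functions.
class NativeStatus
{
    float pressureValue = 0.0f;
    float temperatureValue = 0.0f;

public:
    // Readable and writable by anyone.
    float GetPressure() const { return pressureValue; }
    void SetPressure(float pval) { pressureValue = pval; }

    // Readable by anyone.
    float GetTemperature() const { return temperatureValue; }

protected:
    // Writable only by derived classes.
    void SetTemperature(float tval) { temperatureValue = tval; }
};

// A derived class may use the protected setter.
class CalibratedStatus : public NativeStatus
{
public:
    void Calibrate(float tval) { SetTemperature(tval); }
};
```

Client code has to call s.SetPressure(1.5f) and s.GetPressure() here, where the property version reads and writes statusObject->Pressure directly.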

enums

The following is a typical declaration of a managed enumeration:

enum class Color
{
   Black,
   Red,
   Blue,
   Green
};

The first thing that differentiates managed enumerations from unmanaged enums is that managed enumerations must have names. Anonymous managed enums are not supported. The other important distinction is that managed enums are scoped; this means that values must be accessed by using their enclosing enum name (Color::Red, for instance), and two enums can have the same value name. The default underlying type of an enumeration is int, but it can be chosen from among the signed and unsigned integers (int, short, long), char, or bool. Following is a sample showing an enumeration whose underlying type is bool:

enum class Response : bool
{
   Positive = true,
   Negative = false,
   OK       = true,
   Cancel   = false,
   Yes,
   No
};
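
The scoping rules described above can be tried without the CLR, because ISO C++11 uses the same enum class spelling for its native scoped enums. The sketch below is plain ISO C++, so these are native enums rather than managed ones. Alert and ToInt are hypothetical names added for the illustration.

```cpp
#include <cassert>
#include <type_traits>

// Both enums may declare values named Red and Green because each value
// lives in the scope of its own enum.
enum class Color { Black, Red, Blue, Green };
enum class Alert : short { Green, Amber, Red };

// Values must be qualified with the enum name and do not convert to int
// implicitly, so the conversion is spelled out.
int ToInt(Color c) { return static_cast<int>(c); }
int ToInt(Alert a) { return static_cast<int>(a); }

static_assert(std::is_same<std::underlying_type<Color>::type, int>::value,
              "the default underlying type is int");
static_assert(std::is_same<std::underlying_type<Alert>::type, short>::value,
              "the underlying type can be chosen explicitly");

// The same value name maps to different numbers in different enums.
void CheckScopedNames()
{
    assert(ToInt(Color::Red) == 1);
    assert(ToInt(Alert::Red) == 2);
}
```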

Strings

C++ has never had a built-in string type for string literals. For instance, the type of 2 is int and the type of 's' is char, but "Hello World" is merely an array of characters (const char[12]) that is usually accessed through a const char * or, in legacy code, a char *. It is not an object of any string class. In the later years of its evolution, the language gained the efficient and easy STL, which has a std::string class for creating and managing strings. Even then, std::string is not the literal's type. So, when "Hello World" is passed as an argument for the method

int StringTest(std::string);

it requires a conversion (using the ctor). All I am trying to say is that a string literal in C++ is not a string object, unlike in C#, where System.String is the type of string literals and methods can be invoked on them directly: "Hello World".Length gives 11.
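
The point can be verified in plain ISO C++. In the sketch below, the literal's type is an array of const char and the std::string object only comes into being through the converting constructor at the call. LiteralLength is a hypothetical helper, and this StringTest returns the string length purely so that the result can be checked.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// The literal is an lvalue array of const char: 11 characters plus '\0'.
static_assert(std::is_same<decltype("Hello World"), const char (&)[12]>::value,
              "a string literal is a const char array, not a string object");

// Takes a std::string by value; a literal argument is converted through
// std::string's constructor before the body runs.
int StringTest(std::string s)
{
    return static_cast<int>(s.length());
}

// Passing the literal triggers that conversion.
int LiteralLength()
{
    return StringTest("Hello World");
}

void CheckLiteral()
{
    assert(LiteralLength() == 11);
    assert(std::strlen("Hello World") == 11);
}
```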

C++/CLI, being new and state-of-the-art, might give you hope for what was not available in the language for years. Your eager expectation, System::String as the type of string literals, has to be given up. I am sorry to disappoint you. A string literal is still not a string object. But, because C++/CLI is a dual (managed/unmanaged) programming language, there are some interesting things to be noted.

String literals in C++/CLI have the flexibility of associating themselves with (the closest) managed or unmanaged types based on the context, with managed types given priority. So, "Hello World" can be treated as System::String or const char * or char *. Learn that with an example:

int StringTest(const char *);
int StringTest(System::Object^ strObject);
int StringTest(System::String^ clrString);
int StringTest(std::string stdString);

and guess to which of the above methods will the following call bind?

StringTest("Hello World");

The above call will bind to the System::String^ overload. As I said earlier, managed types are given higher priorities in string contexts. In the absence of the System::String^ overload, the call will be bound to the overload with System::Object^ as an argument. And, the unmanaged const char * will be considered in the absence of both the managed types. Besides that, even among managed types, only those that are found closest to adopting string literals are considered; when none are found compatible, const char * overload takes precedence. And, types that require conversion (using conversion operators or explicit ctors) are the last options; this is the case with the std::string overload.
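
The last rule, that overloads needing a conversion through a constructor come last, is ordinary ISO C++ overload resolution and can be checked without the managed overloads. The sketch below keeps only the two unmanaged candidates from the list above and makes each return a tag value so the chosen overload is visible. The tag values and the BindLiteral/BindString helpers are assumptions made for the illustration.

```cpp
#include <cassert>
#include <string>

// The two unmanaged candidates, each returning a tag that identifies it.
int StringTest(const char*) { return 1; }   // needs only array-to-pointer decay
int StringTest(std::string) { return 2; }   // needs std::string's converting ctor

// A literal argument prefers the overload that needs no user-defined conversion.
int BindLiteral()
{
    return StringTest("Hello World");
}

// An argument that is already a std::string matches the std::string overload.
int BindString()
{
    return StringTest(std::string("Hello World"));
}

void CheckBinding()
{
    assert(BindLiteral() == 1);
    assert(BindString() == 2);
}
```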

So, what do you think will happen with the following line of code—compilation error, run-time error, runs fine?

int hc = "Hello World"->GetHashCode();

Guesses aside, this line of code will result in a compilation error. No, don't try to replace the -> with . (dot). It is not that. The compiler finds no context, like a method call, in which to match the string literal to an existing type; this should convince you that a string literal is not inherently a System::String. Period. All the different flavors of type matching for string literals may help you build a C++ world where "Hello World"s are one day System::String. So, try to write as much code as possible that binds to System::String.

Arrays—not [] but cli::array^

The second relief for C++ programmers is in maintaining arrays. In the past, arrays were not objects and had to be managed by the programmers themselves. So, the programmer had to be aware of the array boundaries, do range checks during access, and such things. An array type comes with C++/CLI. Any managed array is an instance of the cli::array class, which is itself a reference type. It can hold a fixed number of value or reference types; fixed means that the size of the array is determined at creation time and cannot be changed afterwards. Following are the typical ways of allocating an array of integers:

// Allocates the array of size 10 initialized with zeroes
cli::array<int>^ intArray = gcnew cli::array<int>(10);


// Allocates an array of size 10 and initializes the first five
// elements but can initialize all too
cli::array<int>^ intArray = gcnew cli::array<int>(10) { 0, 1, 2, 3, 4 };

// Allocates an array of size 10 initialized with zeroes, later
// fills them with some values
cli::array<int>^ intArray = gcnew cli::array<int>(10);
for (int i = 0; i < 10; ++i)
{
   intArray[i] = i + 1;
}

// An array of reference types
// Note the ^ for the type held by the array.

cli::array<SomeRefType^>^ arrayOfRefs =
   gcnew cli::array<SomeRefType^>(10);
for (int i = 0; i < 10; ++i)
{
   arrayOfRefs[i] = gcnew SomeRefType();
}

  • Elements that are value types are stored inline in the array; they are not boxed individually.
  • Array indices are zero based.
  • The array type has methods for accessing and manipulating the contents of the array.
  • All operations on the array are bounds checked. Any access beyond the bounds of the array results in a System::IndexOutOfRangeException.
  • Arrays are allocated only on the managed heap; an array of value types therefore keeps all its values on the heap, inside the array object.

Note: cli::array in C++/CLI is the counterpart of the System::Array type in the BCL. For dynamically growing arrays, use System::Collections::ArrayList or any of the generic collections in the BCL.

interior_ptr

The GC allocates memory contiguously. Compaction can occur (just like a disk defragmenter) when the GC reclaims memory from garbage objects. Doing so changes the addresses of the objects that survived the collection, but the GC updates the existing live references to point to the new locations. Such an update cannot be made on a native pointer. What is required is something like a native pointer, but a superset of it: it must be able to point to a native or a managed object with a seamless syntax, and it must allow all operations, arithmetic too, if it points to a native object. The dream has come true. We have the interior_ptr.

  • An interior_ptr can point to a member of a reference type, an element of a managed array, or any native object compatible with a native pointer.
  • An interior_ptr can only be declared on the stack. So, it cannot be declared as a member of a class. It can be a local variable or a method parameter.
  • A method that takes an interior_ptr, instead of an equivalent native pointer, has the advantage of a seamless syntax and works the same way.

Following is an example of using an interior_ptr:

ref class MgdClass
{
   public: int dmNumber;
};

class UnmgdClass
{
   public: int dmNumber;
};

void UserMethod()
{
   MgdClass^ objRef = gcnew MgdClass();
   interior_ptr<MgdClass^> ip1 = &objRef;
   (*ip1)->dmNumber = 100;

   interior_ptr<int> ip2 = &(objRef->dmNumber);
   *ip2 = 200;

   // ip1 and ip2 are valid even after memory compaction.

   UnmgdClass *umObjRef = new UnmgdClass();
   interior_ptr<UnmgdClass> ip3 = umObjRef;
   ip3->dmNumber = 500;

   interior_ptr<int> ip4 = &(umObjRef->dmNumber);
   *ip4 = 600;

   int num = 1000;
   interior_ptr<int> ip5 = &num;
   *ip5 = 200;

   delete umObjRef;   // the native object is not garbage collected
}

A method that takes an interior_ptr as a parameter instead of a raw pointer will have the flexibility to accept any of the interior_ptr(s) declared above. Did you experience the seamless syntax there?

Generics

Like templates for C++, so are generics for C++/CLI. But generics are a feature of the CLR, and C++/CLI has its own syntax (as do C# and VB.NET) to expose it. The prime difference between templates and generics is that templates are a compile-time feature and generics are a runtime feature. That means template classes and methods are converted, at compile time, into actual executable code (classes/methods) based on the types they are instantiated on. So, a template is just a syntax for avoiding code proliferation. At runtime, template classes and methods are not identified as declared in code; instead, they have compiler-generated names. Generics, though providing the facility of templates, are independent types themselves. A generic type is instantiated at runtime. Until then, it exists in the assembly as one among the several types. It can also be exposed for use from outside the assembly. That means, unlike templates, generic types exist even if no code in the assembly uses them at the time of compilation. Because this section is worth a book of its own, I shall conclude with an example generic class:

public ref struct Stack
{
private: System::Collections::ArrayList^ stackElements;

public: Stack(int minSize)
   {
      this->stackElements =
         gcnew System::Collections::ArrayList(minSize);
   }

public: generic<typename T> void Push(T item)
   {
      stackElements->Add(item);
   }

public: generic<typename T> T Pop()
   {
      T item = safe_cast<T>(stackElements[stackElements->Count - 1]);
      Debug::Assert(item != nullptr);

      stackElements->RemoveAt(stackElements->Count - 1);
      return item;
   }

public: property int Size
   {
      int get()
      {
         return this->stackElements->Count;
      }
   }
};

The code shown above uses generic methods. Generic classes are also possible like template classes. Client code using the Stack class:

int main()
{
   Stack^ integerStack = gcnew Stack(10);

   for (int i = 0; i < 10; ++i)
   {
      integerStack->Push(i + 1);
   }

   // Pop cannot deduce T from its arguments, so T is named explicitly.
   while (integerStack->Size > 0)
   {
      integerStack->Pop<int>();
   }

   return 0;
}

or

int main()
{
   Stack integerStack(10);

   for (int i = 0; i < 10; ++i)
   {
      integerStack.Push(i + 1);
   }

   // Pop cannot deduce T from its arguments, so T is named explicitly.
   while (integerStack.Size > 0)
   {
      integerStack.Pop<int>();
   }

   return 0;
}

What is worth mentioning is that templates and generics can co-exist. Ain't it cool? A template class can have generic classes and methods, but the other way round is not allowed. Imagine why. I stop here on generics.
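
For contrast with the generic Stack above, here is a minimal ISO C++ template sketch of the compile-time side of the story. It is not C++/CLI. NativeStack is a hypothetical name, and std::vector stands in for the ArrayList. Each instantiation is stamped out by the compiler as a separate, unrelated type, and only for the type arguments the code actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical native template stack, for contrast with the generic one.
template <typename T>
class NativeStack
{
    std::vector<T> elements;

public:
    void Push(const T& item) { elements.push_back(item); }

    // Removes and returns the most recently pushed item.
    T Pop()
    {
        assert(!elements.empty());
        T item = elements.back();
        elements.pop_back();
        return item;
    }

    std::size_t Size() const { return elements.size(); }
};

// Each instantiation is generated at compile time as its own type.
static_assert(!std::is_same<NativeStack<int>, NativeStack<long>>::value,
              "NativeStack<int> and NativeStack<long> are distinct types");
```

Here the element type is fixed per instantiation, so Pop needs no cast, whereas the generic Stack above stores objects in an ArrayList and casts them back with safe_cast.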

The Beginning

Well, there is only one way to conclude the article, and let me put it this way: C++/CLI is not uglier but mightier and superior. The syntax might be a bit wild and the concepts may be unconventional for a C++ programmer. But on the whole, the real power is unleashed by the programmer's capacity. What you saw in this article has brought you only to the doors of power programming on the .NET platform. There is a lot more, and it is endless. I hope whatever I discussed here has been useful and has kindled your interest to delve further. For that, MSDN is one of the best places that I would recommend.


