C++/CLI Primer
Introduction
This article is not an exhaustive guide to the C++/CLI programming language. Rather, it is a quick start that offers an unmanaged C++ programmer an easier way to enter the world of managed programming while still sticking to C++. I hope it also proves useful to C#, VB.NET, or other purely managed programmers who want to program in C++/CLI, where the two programming worlds merge to offer the most powerful environment for programming.
Comparisons of C++/CLI with C#, VB.NET, or other .NET languages are rare in this article. Where they do appear, they are not made to win arguments but to show the differences and to help you understand and appreciate the gotchas and subtleties. There are absolutely no references in this article to the (obsolete) Managed C++. So, jump in!
Words Of Agreement
The word "unmanaged" in the broader sense encompasses any and all technologies (Win32, COM, ...) and programming languages (C++, VB, Pascal, ...) that predate .NET. The word "managed" refers to the .NET technology itself and only those programming languages that support programming on the .NET platform. The words 'object' and 'instance' are used interchangeably for the managed object.
.NET refers to the programming technology, platform, and standard. The CLR (Common Language Runtime) is the implementation of .NET: the runtime engine that hosts the IL (intermediate language) code generated by programming languages. The CLR is the virtual processor that executes the IL generated by the various languages available on the .NET platform. C++/CLI is one of them (in my view, a superior one), and this article in its entirety is an attempt to start learning it.
For the purposes of this article, C++ means ANSI/ISO C++ (originally conceived by Bjarne Stroustrup). It is for programming in the unmanaged world and cannot be used to program on the .NET platform. C++/CLI is not the same, and the article will go into more detail about that. It must be considered an entirely different language that contains the features and facilities of ANSI/ISO C++ as a subset. For the purposes of this article, unmanaged refers to programming in C++, although the same applies to any of the other unmanaged programming languages.
Unmanaged Programming Brief
You have to reap what you sow. In C++ (the unmanaged world), if you allocate memory with new/malloc, it is your responsibility to deallocate it with delete/free. Forgetting to deallocate memory once you are done with it results in memory leaks. The compiler is tightly bound to the underlying operating system and hardware, and programs use the APIs exposed by the underlying OS.
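As a minimal sketch of that responsibility, written in standard C++ (which also serves as the native part of a C++/CLI program), the snippet below counts live instances so that the effect of an explicit delete becomes visible. The names `Tracked`, `makeTracked`, and `releasedByDelete` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocates on the native heap; from here on the caller owns the instance.
Tracked* makeTracked() { return new Tracked(); }

// Returns how many instances the explicit delete released.
int releasedByDelete() {
    Tracked* t = makeTracked();
    int whileAlive = Tracked::live;   // still alive: nothing frees it for us
    delete t;                         // new is paired with delete
    int afterDelete = Tracked::live;

    void* raw = std::malloc(64);      // malloc is paired with free
    assert(raw != nullptr);
    std::free(raw);

    return whileAlive - afterDelete;
}
```

If the `delete t;` line were left out, `Tracked::live` would stay above zero after the function returns, which is exactly the leak described above.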
Managed Programming Brief
Programming in the managed world comprises the programming language used, the libraries (called the Base Class Library), and the CLR itself. The BCL is the gateway to the platform on which the program will be executed. It provides all the APIs for programming, organized under namespaces that correspond to the service intended: file system, memory, network, user interface, processes and threads, and so forth. One of the several facilities of managed programming is automatic memory management: you allocate as you wish, and deallocation is taken care of automatically by the CLR through a process called Garbage Collection.
Types in the managed world are entities that bear information and on which operations are carried out by calling methods. Each type is unique. To use a type, you create instances of it and work with them. Types (and their associated operations) are packaged and deployed as assemblies. An assembly is the ultimate unit of deployment and is the building block of a CLR-based application. An assembly is versioned, and the version is part of its identity. An assembly plays the role that the dynamic link library plays in the unmanaged world; physically, assemblies are themselves dynamic link libraries or executables. Types packaged in an assembly are accessible from outside based on the accessibility marked for the type. For instance, a class type marked public is accessible from outside, and so are its methods that are marked public.
What Is C++/CLI?
I know that might sound like a boring start, but C++/CLI needs a formal introduction. ANSI/ISO C++ is the language for native programming, for example on Windows. .NET is a newer platform that, unlike what came before, is not tied to the hardware. It has its own execution engine, a virtual processor, which is the CLR. Whereas C++ generates an executable for the target platform, the managed programming languages generate IL (Intermediate Language) code for the CLR. Programming languages are required to be compliant with the CLI (Common Language Infrastructure) and the CTS (Common Type System) to be used for programming in the managed world.
ANSI/ISO C++ cannot be used to program on the .NET platform because it is not compliant with the CLI/CTS. Hence C++/CLI: a new language (related to C++ much as C++ is related to C) that has been invented to program on the .NET platform. Although the syntax, grammar, and some of the rules are the same as in C++, it must not be considered just an extension of C++. C++ does end up being a subset of C++/CLI, but that is not the ultimate intent of the invention.
C++/CLI is a secular programming language; this means it can be used for managed or unmanaged or mixed programming. Hence, legacy code that cannot be ported to the .NET platform (using C# or any other .NET language of choice) in a short time span can be ported easily with C++/CLI. Also, any new code in such legacy C++ projects can be written as pure managed code. It also bridges the gap for the pure managed languages that otherwise are handicapped in using unmanaged code without C++/CLI. So, your C# project can now use your complex algorithms or bunch of hi-fi utilities written in ANSI C++ just with a C++/CLI wrapper over them.
Types and Object Creation
There are three kinds of data types in C++/CLI: reference, value, and native. Native types are those that already exist in C++, say int, float, class, struct, and so on. An instance of these types is allocated on the stack when declared as a local variable. When created dynamically (using the new keyword), it is allocated on the heap, and it is the responsibility of the programmer to delete the allocated instance. As a C++ programmer, you are well aware of the consequence if you fail to delete: memory leaks. So scary!
Value Types and Reference Types are a part of the managed world. They behave as the CLI dictates, the prime doctrine being to have a common base type—(System::Object). Following are the methods exposed by System::Object:
| Method Name | Return Type | Accessibility |
|---|---|---|
| Equals | bool | public |
| GetType | System::Type^ | public |
| ToString | System::String^ | public |
| GetHashCode | int | public |
| Finalize | void | protected |
| MemberwiseClone | System::Object^ | protected |
| ReferenceEquals | bool | public static |
A quick look shows that these methods cover what is needed to treat any type abstractly, and every managed type inherits them because it is derived from System::Object.
Value Types are derived from System::ValueType, which in turn is derived from System::Object. Value types are allocated on the stack by default, although there are times when they are moved to the heap (for example, when they are boxed). I will deal with that later. But, the very nature of a value type is to get allocated on the stack. All primitive types and structs are value types. They bear certain similarities with the primitive native types, but they are not the same.
Primitive Types Mapping
(List not exhaustive)
| Data Type Name | Type | C++/CLI Keyword |
|---|---|---|
| Integer | System::Int32 | int |
| Double | System::Double | double |
| Character (2 bytes) | System::Char | wchar_t |
| Byte (1 byte) | System::Byte | unsigned char |
| Boolean | System::Boolean | bool |
Following is the way primitive value types are declared and used:
void InSomeMethod()
{
    // You may also use the int keyword instead.
    System::Int32 oddNumber = 1;
    CallAnotherMethod(oddNumber);

    // You may also use the wchar_t keyword instead.
    System::Char character = L'A';
    CallMethod3(character);
}
Following is the way user defined value types are declared and used:
value struct DateTimeInfo
{
private:
    System::Int32 Year;
    System::Byte Month;
    System::Byte Date;

public:
    // NOTE: Cannot declare a default ctor in value structs.
    DateTimeInfo(int year, unsigned char mon, unsigned char date)
    {
        Year = year;
        Month = mon;
        Date = date;
    }

    int GetYear() { return this->Year; }
    int GetMonth() { return static_cast<int>(this->Month); }
    int GetDate() { return static_cast<int>(this->Date); }
}; // Well, you know how to use it !!!
Classes are the reference types. Instances of reference types are never allocated on the stack. They are always allocated on the heap, and this heap is not the same heap where your native types are allocated. It is a different area called the managed heap, and your native types and code have no knowledge of, or direct reach into, this area. So then, how do you allocate on the managed heap? Is it by using the new keyword? If so, how would the new keyword know where to allocate? To answer these questions, there is a newer keyword called gcnew: 'new' allocates on the native heap and gcnew allocates on the managed heap. Full examples and code snippets come later; for now, just consider this general form:
reference_data_type^ objRef = gcnew reference_data_type(constructor arguments);
This is the conventional way of creating a managed object in C++/CLI. As said above, the instance is created on the managed heap. The accessor for that instance is called the object reference (objRef above) or handle, and the handle itself, when it is a local variable, is available on the stack. The C++/CLI convention is to call them handles, but I am going to call them object references, which is the term widely used in the managed world. The term object reference must not be confused with the C++ reference; in the rest of the article, the word reference means the managed object reference unless explicitly stated otherwise.
The instance cannot be accessed without the object reference. In essence, object references are address holders, but they are not like native pointers. Object references are type aware, polymorphic, and exhibit the type's behavior. Unlike pointers, they cannot be cast to any arbitrary type or moved by incrementing or decrementing the address. They are much more intelligent address holders than native pointers.
There can be more than one reference to an instance on the managed heap, because assigning one object reference to another is a shallow copy. To a native C++ programmer this sounds alarming: with two pointers to the same heap instance, deleting the instance through one of them leaves the other dangling. That is the typical C++ programmer's nightmare, often learnt the hard way.
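The native side of that worry can be sketched in standard C++. This is only an illustration of pointer aliasing, not of managed handles; `Account` and `sharesInstance` are hypothetical names invented for the example.

```cpp
#include <cassert>

// Hypothetical native type used only for this illustration.
struct Account { int balance; };

// Copying a pointer copies only the address (a shallow copy):
// both pointers end up referring to the same heap instance.
bool sharesInstance() {
    Account* first = new Account{100};
    Account* second = first;      // no new Account is created here
    second->balance = 250;        // the change is visible through 'first' too
    bool same = (first == second) && (first->balance == 250);

    delete first;                 // the one and only instance is destroyed
    first = nullptr;
    second = nullptr;             // 'second' would otherwise dangle, so reset it as well
    return same;
}
```

Nothing in the language forces the last two assignments; forgetting the second one and then using `second` is the dangling-pointer bug.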
Turn of the century: GC! Yes, there can be more than one object reference referring to the same object on the managed heap. The managed programming model does not expect the programmer to reclaim memory. The programmer is not required to write code such as delete objRef to return the memory he allocated. Spare our poor programmers. The CLR is very smart and reclaims memory through a process called Garbage Collection. The Garbage Collector reclaims only those instances that are not reachable, that is, those for which you have lost all object references (like objRef above). If an object reference goes out of scope or is assigned nullptr, the instance it referred to can no longer be reached through that reference. So, for an instance's memory to be reclaimed by the GC, there must be no outstanding references to it.
This is the most compelling feature of .NET. Programmers are now free of the burden of writing code to delete the memory they allocate, a discipline they have learnt the hard way over years of programming. But beware: too much freedom results in chaos. Even with GC, memory has to be allocated wisely. Undisciplined allocation, on the grounds that you are not responsible for deallocation, will result in poor application performance. This is one of the fundamental differences between the native and managed worlds. All the other concepts and rules are based on reference types, object references, and garbage collection. Besides that, there is a subtle thing to be aware of. The GC is responsible only for deallocating memory, not for releasing resources. For instance, if you have opened a connection to a database, the GC is not responsible for closing the connection; it is responsible only for reclaiming the memory allocated for the database object.
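The distinction between memory and resources can be sketched as follows. This is plain C++ rather than managed code, and `Connection`, `openCount`, `close`, and `openDuringQuery` are hypothetical stand-ins, not a real database API. The point it illustrates is that the resource is released by an explicit call, independently of when the object's memory is reclaimed.

```cpp
#include <cassert>

// Hypothetical stand-in for a database connection; counts open connections.
class Connection {
public:
    static int openCount;

    Connection() : open_(true) { ++openCount; }

    // Releases the resource; safe to call more than once.
    void close() {
        if (open_) {
            open_ = false;
            --openCount;
        }
    }

    // Safety net: releases the resource if the caller forgot to.
    ~Connection() { close(); }

    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

private:
    bool open_;
};
int Connection::openCount = 0;

// Returns how many connections were open while the work was in progress.
int openDuringQuery() {
    Connection c;                        // resource acquired
    int during = Connection::openCount;
    c.close();                           // released explicitly, not left to memory reclamation
    return during;
}
```

In managed code the same discipline applies: close or dispose the resource yourself rather than waiting for the collector to reclaim the object's memory.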
With the basics learnt in the previous sections, it is time to see stuff that works. Following is a snippet of a C++/CLI class:
