You would be surprised just how often the question of 'string' versus 'String' crops up in the usual places .NET developers ask questions. It arises from the use of 'string' in the C# language and the fact that IntelliSense causes confusion by suggesting 'String' when you use string-related functionality in a C# application.
What many people don't realise is that there really is no difference between them. As far as you need to be concerned as a .NET developer, you can freely interchange them as and when you feel like it.
Why, Then, Do We Have Two Strings?
To understand this, you need to understand exactly what 'string' (with a small s) represents. When we talk about the C# language, 'string' is actually just an alias for 'String', so when you declare variables in this manner:
string Name = "Peter";
You're really using a 'String' anyway. More specifically, you're actually using a .NET framework type called "System.String".
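If you'd like to see this for yourself, here is a minimal sketch. The class and method names (StringAliasDemo, AliasIsSameType, RuntimeTypeName) are made up for illustration; only typeof, GetType, and ToString come from the framework.

```csharp
using System;

public static class StringAliasDemo
{
    // True when the C# keyword and the framework type name resolve to the same runtime type.
    public static bool AliasIsSameType() => typeof(string) == typeof(String);

    // Declares a variable with the lowercase alias and reports what the runtime calls its type.
    public static string RuntimeTypeName()
    {
        string name = "Peter";
        return name.GetType().ToString();
    }
}
```

Calling StringAliasDemo.RuntimeTypeName() should hand back the framework name "System.String", even though the variable was declared with the lowercase alias.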
The reason we have an alias of 'string' in C# is that C# is an ECMA standardised language, and part of that standard states that certain primitive types must be supplied by the language compiler.
Indeed 'string' is not the only alias that's defined in the C# language specification. The following are all aliases, too:
- object is an alias for System.Object
- bool is an alias for System.Boolean
- byte is an alias for System.Byte
- sbyte is an alias for System.SByte
- short is an alias for System.Int16
- ushort is an alias for System.UInt16
- int is an alias for System.Int32
- uint is an alias for System.UInt32
- long is an alias for System.Int64
- ulong is an alias for System.UInt64
- float is an alias for System.Single
- double is an alias for System.Double
- decimal is an alias for System.Decimal
- char is an alias for System.Char
The aliases are required because C# as a language has to implement these primitive data types. Otherwise, it's not considered to be following the ECMA language specification.
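The same check works for the numeric aliases. The sketch below is illustrative only; AliasSizeDemo and its members are hypothetical names, and it relies on nothing beyond typeof and sizeof.

```csharp
using System;

public static class AliasSizeDemo
{
    // Each keyword and its framework type are one and the same runtime type.
    public static bool IntIsInt32() => typeof(int) == typeof(Int32);
    public static bool LongIsInt64() => typeof(long) == typeof(Int64);
    public static bool ShortIsInt16() => typeof(short) == typeof(Int16);

    // Sizes in bytes as fixed by the C# language: short is 2, int is 4, long is 8.
    public static int SizeOfShort() => sizeof(short);
    public static int SizeOfInt() => sizeof(int);
    public static int SizeOfLong() => sizeof(long);
}
```

Because the aliases are resolved by the compiler, there should be no runtime cost to choosing one spelling over the other.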
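There's a gotcha to be aware of, though. In C#, each alias has a fixed one-to-one mapping to a framework type, so an 'int' is always 32 bits and a 'long' is always 64 bits. In a different language, however, similarly named types may be mapped to different data types, and in some languages (C, for example) the sizes are left to the compiler and can change from platform to platform.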
Let's take a theoretical example. Imagine that you had a 16-bit embedded computer, which, unlike your large desktop machine, could not handle 64-bit numbers. On that machine, the biggest number it might be able to handle (through use of some clever maths) might be 32 bits.
A compiler for that machine, in a language that leaves type sizes up to the compiler, could realistically then map an 'int' to the equivalent of an 'Int16' and a 'long' to an 'Int32'. Meanwhile, a 'short' could actually be mapped to a 'Byte'.
Now imagine that you develop the code for this device on your big PC, and you use 'int', 'long', and 'short' as you always have done, without really thinking about it. At some point, you might assign a value to that 'long' that needs more than 32 bits to store, but your development PC will never flag that as an issue because there a 'long' is 64 bits.
When you then compile or otherwise attempt to run that code on the 16-bit system, you might then find that your code has overflow errors or fails to compile. This is especially true if the embedded 16-bit machine has an on-demand embedded compiler.
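C# itself fixes 'int' at 32 bits and 'long' at 64 bits, so you can't reproduce that mismatch directly. You can get a feel for it, though, by forcing a 64-bit value into a 32-bit variable. The following is a rough sketch under that assumption; NarrowingDemo, TryNarrow, and NarrowUnchecked are hypothetical helper names, not framework APIs.

```csharp
using System;

public static class NarrowingDemo
{
    // Tries to fit a 64-bit value into 32 bits.
    // Returns false (and sets result to 0) when the value does not fit.
    public static bool TryNarrow(long value, out int result)
    {
        try
        {
            // 'checked' makes the cast throw instead of silently discarding bits.
            result = checked((int)value);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    // Without the check, the upper 32 bits are simply thrown away.
    public static int NarrowUnchecked(long value) => unchecked((int)value);
}
```

A value such as 5,000,000,000 is perfectly happy in a 'long', yet TryNarrow reports that it doesn't fit in an 'int', and NarrowUnchecked quietly returns a different number altogether. That is roughly the kind of surprise the 16-bit machine in the example would give you.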
Going back to the actual subject in question, the 'string' type, you might find that on one platform a string is defined as a null-terminated C-style string, but on another platform it may be implemented as a Pascal-style string, limited to 255 characters with a leading length byte.
The thing to remember here is that using aliases will ensure in most cases that code written for one compiler should, in theory, compile under any other compiler that both understands the language being compiled and follows the spec. However, the output from those two compilers may be very, very different.
If you use explicit framework types such as 'System.String' and 'System.Int32', you can be sure exactly which type you are getting, with no alias in between to cause inadvertent type safety issues. However, you will possibly need to write conditional code if the code is intended to be compiled on multiple platforms.
Which Is the Correct Terminology to Use, Then?
A commonly used convention (at least where C# is concerned) is to use 'string' in variable declarations, for example:
string myString = "Hello World";
and to use the system type when referring to operators and functions within the string class, for example:
string myString = String.Concat("hello ", "World");
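Whichever spelling you pick, the call ends up in the same place. As a quick sketch (ConcatDemo and its two methods are hypothetical names used purely for illustration):

```csharp
using System;

public static class ConcatDemo
{
    // Calls Concat through the lowercase C# keyword.
    public static string ViaKeyword() => string.Concat("hello ", "World");

    // Calls the very same method through the framework type name.
    public static string ViaTypeName() => String.Concat("hello ", "World");
}
```

Both methods return "hello World", so the choice between them really is one of style rather than behaviour.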
If there's a .NET-related tip or trick you'd like to share, please feel free to drop me a message in the comments below, or reach out to me on Twitter where I can normally be found lurking around under the name @shawty_ds. Let me know your thoughts, questions, and ideas. You never know; you might just see them appear here.