.NET Back to Basics: The String Class

Last month, we went back to basics with the Int class. This month, we look at the textual equivalent, the string class.

The String and Int data types are probably the two most used data types on the .NET platform. Between them, they handle the vast majority of the data we work with.

Strings are nothing complex; they are just simple sequences of characters that make up words and sentences. The string class exists to allow us to chop up these sentences, replace parts of them, search them, and a whole bunch of other things that make handling strings of text easier for us as developers.

So, what can a string do?

The string class has a phenomenal amount of functionality in it, all of which is grouped into about three categories as follows:

  • String tests
  • Searching
  • String modification and building

We'll start with "String Tests."

When we talk about string tests, we are in fact talking about checking to see if a string is present, or if that string has a certain substring contained within it.

Many developers will instantly recognize that, in this case, an appropriate test would be:

if (myname == "shawty") { }

String, like most classes that are also basic data types, implements the equality operator, allowing you to perform simple straight-forward tests like this inline in your application code.

There are, however, a number of useful specialist tests too. Consider this fragment of code:

if (myname == null || myname == "") { }

In language terms, we are saying: IF the variable myname is null, that is, devoid of any value, OR it is empty, that is, it has no contents, then consider our decision logic to be true.

The string class makes this much easier with a static method called 'IsNullOrEmpty':

if (String.IsNullOrEmpty(myname)) { }

To me, the string method version reads much better, and makes better sense from a syntax point of view because the name of the expression tells you the exact test you're performing.
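As a rough illustration of how these "emptiness" tests behave, the sketch below runs a few sample values through 'IsNullOrEmpty' and its sibling 'IsNullOrWhiteSpace' (available from .NET 4 onwards). The NameChecks class and its Describe method are hypothetical names made up for this example, not part of the framework.

```csharp
using System;

namespace StringClass
{
   public static class NameChecks
   {
      // Hypothetical helper: reports which kind of "nothing" a value is.
      public static string Describe(string value)
      {
         if (value == null) return "null";
         if (value == "") return "empty";
         if (String.IsNullOrWhiteSpace(value)) return "whitespace only";
         return "has content";
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         string[] samples = { null, "", "   ", "shawty" };

         foreach (string sample in samples)
         {
            // IsNullOrEmpty treats "   " as content; IsNullOrWhiteSpace does not.
            Console.WriteLine("IsNullOrEmpty: {0,-5}  IsNullOrWhiteSpace: {1,-5}  ({2})",
               String.IsNullOrEmpty(sample),
               String.IsNullOrWhiteSpace(sample),
               NameChecks.Describe(sample));
         }
      }
   }
}
```

If user input that consists only of spaces should count as "nothing entered", 'IsNullOrWhiteSpace' is usually the test you want.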

Other tests are usually called as methods on the string itself. For example 'Contains':

      string myname = "shawty";

      if (myname.Contains("shaw")) { }

will pass, because 'shawty' does contain 'shaw'.

      string myname = "shawty";

      if (myname.Contains("peter")) { }

will, however, fail. Rather than looking in the string, if you're looking for prefixes or suffixes, 'StartsWith' and 'EndsWith' have your back covered.

      string myname = "Mr Peter Shaw";
      if (myname.StartsWith("Mr") &&
         myname.EndsWith("Shaw")) { }

will become true if the first two characters are equal to "Mr" AND the last four are equal to "Shaw", but won't care about any of the other contents in the string. It's also worth noting that the test IS case sensitive, so "Mr" will never match "mr". I will soon show you a way to deal with this, however.

All of these tests can also be negated, so to test if a string does NOT start with "Mr" it's as simple as prefixing the method call with an exclamation mark.

      string myname = "Mr Peter Shaw";
      if (!myname.StartsWith("Mr")) { }
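To see the prefix and suffix tests and the negation side by side, here is a small sketch. The NameTests class and its IsMrShaw method are hypothetical names invented for the illustration.

```csharp
using System;

namespace StringClass
{
   public static class NameTests
   {
      // Hypothetical helper: true when the name starts with "Mr" and ends with "Shaw".
      public static bool IsMrShaw(string name)
      {
         return name.StartsWith("Mr") && name.EndsWith("Shaw");
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         // Prefix and suffix both match, including case.
         Console.WriteLine(NameTests.IsMrShaw("Mr Peter Shaw"));

         // Lower-case "mr" does not match "Mr", so the test fails.
         Console.WriteLine(NameTests.IsMrShaw("mr Peter Shaw"));

         // Negated test: the name does NOT start with "Dr".
         Console.WriteLine(!"Mr Peter Shaw".StartsWith("Dr"));
      }
   }
}
```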

The string class also contains a 'Compare' method. Compare, in this case, has around ten overloads that operate in subtly different ways, the simplest of which is used here:

using System;

namespace StringClass
{
   class Program
   {
      static void Main(string[] args)
      {
         string stringOne = "Peter";
         string stringTwo = "shaw";
         string stringThree = "Peter";

         int compareResult = String.Compare(stringOne, stringTwo);
         Console.WriteLine("Result was : {0}", compareResult);

         compareResult = String.Compare(stringOne, stringThree);
         Console.WriteLine("Result was : {0}", compareResult);

         compareResult = String.Compare(stringTwo, stringOne);
         Console.WriteLine("Result was : {0}", compareResult);

      }
   }
}

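Like the version in the Int class, you'll get a negative number, 0, or a positive number (in practice, usually -1, 0, or 1) depending on which way the decision goes.

If the first string sorts before the second, the result is negative. If the result is 0, both strings are considered equal, and if the first string sorts after the second, you'll get a positive result. In the program above, "Peter" sorts before "shaw", so the first call gives a negative result, the second gives 0, and the third gives a positive result.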

The output should look something along the lines of Figure 1.

[Figure 1: The output from String.Compare]
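Because the documentation only promises "less than zero", "zero", or "greater than zero", it's safer to test the sign of the result rather than the exact values -1 and 1. A minimal sketch of that idea follows; the SortOrder class and its Describe method are hypothetical names used only for this example.

```csharp
using System;

namespace StringClass
{
   public static class SortOrder
   {
      // Hypothetical helper: turns the sign of String.Compare into words.
      public static string Describe(string first, string second)
      {
         int result = String.Compare(first, second);

         if (result < 0) return "before";
         if (result > 0) return "after";
         return "the same as";
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         Console.WriteLine("Peter sorts {0} shaw", SortOrder.Describe("Peter", "shaw"));
         Console.WriteLine("Peter sorts {0} Peter", SortOrder.Describe("Peter", "Peter"));
         Console.WriteLine("shaw sorts {0} Peter", SortOrder.Describe("shaw", "Peter"));
      }
   }
}
```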

Personally, I find it a bit weird doing comparisons this way, but it does have one advantage that the standard checks don't have.

This is best explained if you look at this MSDN page.

[Figure 2: The MSDN docs for Compare]

You'll see I've highlighted the Boolean options available on some of the overloads. If you look at the descriptions, you'll see the entries I've highlighted all state that they can be told to ignore the case of the strings being compared.

The other parameters, of which there are many (and for which I would encourage you to read the MSDN docs fully), allow you to use things like culture-specific information (so strings are compared using the sorting rules of a particular language or region) or to start the comparison from a given offset in the source or target strings.
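As a quick sketch of the case-insensitive options, the program below compares "Peter" with "PETER" three ways: the plain two-string overload, the overload with the Boolean ignoreCase flag, and the overload that takes a StringComparison value. The CompareDemo class and SameIgnoringCase method are hypothetical names for this example.

```csharp
using System;

namespace StringClass
{
   public static class CompareDemo
   {
      // Hypothetical helper: true when the two strings differ only by case.
      public static bool SameIgnoringCase(string first, string second)
      {
         return String.Compare(first, second,
            StringComparison.OrdinalIgnoreCase) == 0;
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         string stringOne = "Peter";
         string stringTwo = "PETER";

         // Default overload: case matters, so the strings are not equal.
         Console.WriteLine(String.Compare(stringOne, stringTwo) == 0);

         // Boolean overload: true means "ignore case".
         Console.WriteLine(String.Compare(stringOne, stringTwo, true) == 0);

         // StringComparison overload wrapped in the helper above.
         Console.WriteLine(CompareDemo.SameIgnoringCase(stringOne, stringTwo));
      }
   }
}
```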

It's very easy to put together your own extra methods which emulate 'StartsWith', 'EndsWith', and 'Contains' but which are case insensitive. The key to getting it right is just a little experimentation.
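One possible way to do it, sketched below, is a small set of extension methods built on the StringComparison overloads of 'StartsWith', 'EndsWith', and 'IndexOf'. The StringExtensions class and the three ...IgnoreCase method names are my own inventions for this example, and the choice of OrdinalIgnoreCase is an assumption; a culture-aware comparison may suit your data better.

```csharp
using System;

namespace StringClass
{
   public static class StringExtensions
   {
      // Case-insensitive version of Contains, built on IndexOf.
      public static bool ContainsIgnoreCase(this string source, string value)
      {
         return source.IndexOf(value, StringComparison.OrdinalIgnoreCase) >= 0;
      }

      // Case-insensitive version of StartsWith.
      public static bool StartsWithIgnoreCase(this string source, string value)
      {
         return source.StartsWith(value, StringComparison.OrdinalIgnoreCase);
      }

      // Case-insensitive version of EndsWith.
      public static bool EndsWithIgnoreCase(this string source, string value)
      {
         return source.EndsWith(value, StringComparison.OrdinalIgnoreCase);
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         string myname = "Mr Peter Shaw";

         Console.WriteLine(myname.StartsWithIgnoreCase("mr"));
         Console.WriteLine(myname.EndsWithIgnoreCase("SHAW"));
         Console.WriteLine(myname.ContainsIgnoreCase("peter"));
      }
   }
}
```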

Searching within a string generally comes in two flavours. The first is via the 'IndexOf', 'LastIndexOf', and the 'Substring' methods.

Some of you will undoubtedly be sitting there thinking, "What? How do IndexOf and Substring constitute searching?"

Well, technically, many of you might be correct in thinking that IndexOf is really a "String Test" and that Substring is, well, modification. When you use them together, however, you use one to find the start of your search, and the second to extract it.

This, however, is purely an academic point. Grouping them this way makes sense to me, and because I never use one without the other, it's my preferred way of working.

So how do you search…? Quite easily.

using System;

namespace StringClass
{
   class Program
   {
      static void Main(string[] args)
      {
         string stringOne = "Peter 'Shawty' Shaw";
         string stringToSearchFor = "Shawty";

         // Search for the variable rather than repeating the literal
         int searchPosition = stringOne.IndexOf(stringToSearchFor, 0);
         string foundString = stringOne.Substring(searchPosition,
            stringToSearchFor.Length);

         Console.WriteLine("Found {0} at position {1}",
            foundString, searchPosition);

      }
   }
}

stringOne is the string to search in, and we use IndexOf on that string to find the index of the string to search for. We then use Substring, starting at the position where the search string was found and running for the length of the search string. Bear in mind that IndexOf returns -1 when nothing is found, and passing -1 to Substring will throw an exception.
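A slightly more defensive take on the same idea might look like the sketch below, which checks for the "not found" result of -1 before calling Substring and also shows 'LastIndexOf'. The Searcher class and Extract method are hypothetical names for this example, and the use of StringComparison.Ordinal is my assumption to keep the search a plain character-by-character match.

```csharp
using System;

namespace StringClass
{
   public static class Searcher
   {
      // Hypothetical helper: returns the matched text, or null when
      // the search string does not occur in the source.
      public static string Extract(string source, string searchFor)
      {
         int position = source.IndexOf(searchFor, StringComparison.Ordinal);

         if (position < 0)
         {
            return null;   // Substring would throw if handed -1
         }

         return source.Substring(position, searchFor.Length);
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         string stringOne = "Peter 'Shawty' Shaw";

         // IndexOf finds the first occurrence, LastIndexOf the last one.
         Console.WriteLine("First 'Shaw' at {0}",
            stringOne.IndexOf("Shaw", StringComparison.Ordinal));
         Console.WriteLine("Last 'Shaw' at {0}",
            stringOne.LastIndexOf("Shaw", StringComparison.Ordinal));

         Console.WriteLine("Extracted: {0}",
            Searcher.Extract(stringOne, "Shawty") ?? "(not found)");
         Console.WriteLine("Extracted: {0}",
            Searcher.Extract(stringOne, "Fred") ?? "(not found)");
      }
   }
}
```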

The second method we have of searching through strings is to use regular expressions. The Regex class, however, is a complete topic in itself, so we'll devote an entire post to that at a later date.

That brings us to the final group of functionality, "String Modification."

Because C# supports the + operator on strings, you can easily join two strings using a + as follows:

string name = "Peter " + "Shaw";

Which will result in 'Peter Shaw' in one string. You can also use the Concat method.

string name = String.Concat("Peter ", "Shaw");

Concat doesn't offer much over the plus operator for a handful of strings. It has fixed overloads for up to four strings, plus one that accepts an array of any length, so neither approach has a practical limit. Where it does throw a lifeline, however, is with string lists.

using System;
using System.Collections.Generic;

namespace StringClass
{
   class Program
   {
      static void Main(string[] args)
      {

         List<string> myStrings = new List<string>
         {
            "Peter ",
            "'Shawty' ",
            "Shaw ",
            "With DOT-NET ",
            "Nuts & Bolts"
         };

         string text = String.Concat(myStrings);

         Console.WriteLine(text);

      }
   }
}

Which should give you:

[Figure 3: Output from string list program]

When used this way, the list length can be practically endless, allowing you to get a large list of strings and combine them to one string.
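For completeness, Concat also has an overload that takes a parameter array, so it isn't restricted to a fixed number of individual strings either. The sketch below passes six strings and compares the result with the + version; the Joiner class and ConcatAll method are hypothetical names for this example.

```csharp
using System;

namespace StringClass
{
   public static class Joiner
   {
      // Hypothetical helper: forwards any number of strings to String.Concat.
      public static string ConcatAll(params string[] parts)
      {
         return String.Concat(parts);
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         // Six individual arguments, handled by the array overload.
         string viaConcat = String.Concat(
            "Peter ", "'Shawty' ", "Shaw ", "With ", "DOT-NET ", "Nuts & Bolts");

         string viaPlus =
            "Peter " + "'Shawty' " + "Shaw " + "With " + "DOT-NET " + "Nuts & Bolts";

         Console.WriteLine(viaConcat);
         Console.WriteLine(viaConcat == viaPlus);
         Console.WriteLine(Joiner.ConcatAll("a", "b", "c", "d", "e", "f"));
      }
   }
}
```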

String.Format comes in handy when you want to use a template string and insert substrings into placeholders. You've already seen this done elsewhere in this post.

using System;

namespace StringClass
{
   class Program
   {
      static void Main(string[] args)
      {

         string stringOne = "Peter";
         string stringTwo = "'Shawty'";
         string stringThree = "Shaw";
         string stringFour = "DOT-NET";
         string stringFive = "Nuts & Bolts";

         // A regular string literal cannot span lines, so keep it on one
         string text = String.Format(
            "My name is {0} {1} {2}, welcome to {3} {4}",
            stringOne,
            stringTwo,
            stringThree,
            stringFour,
            stringFive);

         Console.WriteLine(text);

      }
   }
}

Many other system routines accept strings in this "format token" manner, for example the Console.Write and WriteLine methods, which means you don't have to use String.Format directly.
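As a small illustration, the sketch below passes format tokens straight to Console.WriteLine, reuses the same placeholder twice, and uses the optional alignment component ({0,-10} pads on the right, {0,10} pads on the left). The Greeter class and Welcome method are hypothetical names for this example.

```csharp
using System;

namespace StringClass
{
   public static class Greeter
   {
      // Hypothetical helper: a placeholder can be used more than once.
      public static string Welcome(string name, string site)
      {
         return String.Format(
            "My name is {0}, welcome to {1}. Thanks for reading, from {0}!",
            name, site);
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         string name = "Shawty";
         string site = "Nuts & Bolts";

         // Built with String.Format inside the helper.
         Console.WriteLine(Greeter.Welcome(name, site));

         // Format tokens handed directly to WriteLine.
         Console.WriteLine("My name is {0}, welcome to {1}", name, site);

         // Alignment: left-justified, then right-justified, in 10 characters.
         Console.WriteLine("[{0,-10}] [{0,10}]", name);
      }
   }
}
```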

String.Join, like Concat, can take a list of strings, but this time it joins them using a known separator:

using System;
using System.Collections.Generic;

namespace StringClass
{
   class Program
   {
      static void Main(string[] args)
      {

         List<string> myStrings = new List<string>
         {
            "Peter ",
            "'Shawty' ",
            "Shaw ",
            "With DOT-NET ",
            "Nuts & Bolts"
         };

         string text = String.Join("|", myStrings);

         Console.WriteLine(text);

      }
   }
}

As you can see, the first parameter to this is the Pipe character. When you run the program, you should see that all the items in the list have been concatenated with a pipe character between them.

[Figure 4: All string items have been concatenated, with a pipe character added]
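Note that the items in the list above carry trailing spaces, and Join keeps them, so each pipe is preceded by a space. If that's not what you want, one option is to trim each item first, as in the sketch below. The ListJoiner class and JoinTrimmed method are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

namespace StringClass
{
   public static class ListJoiner
   {
      // Hypothetical helper: trims each item, then joins with the separator.
      public static string JoinTrimmed(string separator, IEnumerable<string> items)
      {
         List<string> trimmed = new List<string>();

         foreach (string item in items)
         {
            trimmed.Add(item.Trim());
         }

         return String.Join(separator, trimmed);
      }
   }

   class Program
   {
      static void Main(string[] args)
      {
         List<string> myStrings = new List<string>
         {
            "Peter ",
            "'Shawty' ",
            "Shaw "
         };

         // Untrimmed: the trailing spaces stay in front of each pipe.
         Console.WriteLine(String.Join("|", myStrings));

         // Trimmed: items are joined with nothing but the pipe between them.
         Console.WriteLine(ListJoiner.JoinTrimmed("|", myStrings));
      }
   }
}
```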

There's much more the string class can do, but for now I've hit the word limit on this month's post. We may have to revisit this again soon.

Have Fun!

Shawty



About the Author

Peter Shaw

As an early adopter of IT back in the late 1970s to early 1980s, I started out with a humble little 1KB Sinclair ZX81 home computer. Within a very short space of time, this small 1KB machine became a 16KB Tandy TRS-80, followed by an Acorn Electron and, eventually, after going through many other different machines, a 4MB, ARM-powered Acorn A5000. After leaving school and getting involved with DOS-based PCs, I went on to train in many different disciplines in the computer networking and communications industries. After returning to university in the mid-1990s and gaining a Bachelor of Science in Computing for Industry, I now run my own consulting business in the northeast of England called Digital Solutions Computer Software, Ltd. I advise clients at both a hardware and software level in many different IT disciplines, covering a wide range of domain-specific knowledge—from mobile communications and networks right through to geographic information systems and banking and finance.

Comments

  • Ghost

    Posted by Wyze Wildfire on 02/15/2016 07:50am

    "considered different" vs. "relative position in the sort order" Thank you for the breakdown of information. I noticed that in the middle of your article you wrote the following: "Like the version in the int class, you'll get -1, 0, or a 1 depending on which way the decision goes. If the second string is considered different, -1 will be the result. If the result is 0, both strings are identical, and if the first string is considered the different one, you'll get a 1." This one little piece of information bugged me as lacking definition in that it is not explained what defines the decision as to which one is "considered different" as you yourself put it. So I ran some tests, and I believe a more accurate statement would be which one is returned first alphabetically or as Microsoft has stated in their Description of this specific use of this specific method on the page that you have referenced and linked to, "Compares two specified String objects and returns an integer that indicates their relative position in the sort order." So a more clear statement would be: "If the first string would be returned first in an alphabetical sort order then the returning integer = -1 whereas if the first string would be returned second in an alphabetical sort order then the resulting integer returned = 1 For anybody else curious about the defining logic as to how which string is "considered different". Thanks again for the article.
