.NET Back to Basics: The StringBuilder Class

Last month, we saw the 'String' class in the Back to Basics column. A close relative to the 'String' class is the 'StringBuilder' class, and, as its name suggests, it's designed mostly for the purpose of constructing large strings.

If you've been around .NET for some time now, you'll know there is no end of posts from all manner of folk arguing over which is faster, string or StringBuilder. Personally, I think both have their strengths and weaknesses, and neither is faster in every situation. And no, I'm not about to start using a high-resolution timer to show a difference of a few milliseconds :-)

Where I think the string builder has its advantages is in creating large bodies of complex or structured text, such as "Rich Text" or even HTML documents. StringBuilder also tends to be more efficient, and make better use of memory when dealing with considerably large blocks of text, such as complete documents.

StringBuilder is also a better choice for string modification, especially if you're expecting those modifications to be many and large. When you modify strings using the standard string class, for every modification you make, you create a new instance of the string with the modifications in place.

Even though this might not seem like a big deal, if you're performing high-speed modifications inside loops and the like, you could in theory use far more memory than you need to.

A string builder makes its edits inside its own internal buffer instead of creating a new string for every change, so you avoid that trail of intermediate copies. The buffer grows as needed, and the string builder manages it for you.
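To make the difference concrete, here is a minimal sketch that builds the same comma-separated text both ways. The class and method names (ConcatVersusBuilder, WithString, WithBuilder) are my own inventions for illustration, and the sketch only shows the two styles side by side; it makes no claim about timings.

```csharp
using System;
using System.Text;

public static class ConcatVersusBuilder
{
    // Builds "0,1,2,..." with plain string concatenation.
    // Every += produces a brand new string instance.
    public static string WithString(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            if (i > 0) result += ",";
            result += i;
        }
        return result;
    }

    // Builds the same text with one StringBuilder,
    // which appends into its own internal buffer.
    public static string WithBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            if (i > 0) sb.Append(',');
            sb.Append(i);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WithString(5));   // prints 0,1,2,3,4
        Console.WriteLine(WithBuilder(5));  // prints 0,1,2,3,4
    }
}
```

Both methods return identical text; the only difference is how many intermediate strings get created along the way.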

Where string builder fails in comparison to a standard string class is in operations such as searching and indexing.

So, What Can the StringBuilder Do?

Well, let's start with what it does best: building strings.

Create a new console project and make sure you have the following code within it in your Program.cs file.

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {
         StringBuilder sb = new StringBuilder();

         Console.WriteLine("Current StringBuilder length {0}",
            sb.Length);
         Console.WriteLine("Maximum StringBuilder capacity {0}",
            sb.MaxCapacity);
         Console.WriteLine("Current StringBuilder capacity " +
            "(Amount available before SB will resize itself) {0}",
            sb.Capacity);

      }
   }
}

Run the application, and you should see the following output:

Figure 1: Output from our first StringBuilder code

String builder starts with room for just 16 characters (not bytes) when you first create it with the default constructor, and from that point on it roughly doubles its capacity whenever you exceed the current level.

Add the following two lines to the last bit of code (just after the 3rd console write statement):

         sb.Append("12345678901234567");
         Console.WriteLine("Current StringBuilder capacity " +
            "(Amount available before SB will resize itself) {0}",
            sb.Capacity);

Then, run your program again.

Figure 2: String builder has doubled its memory

You can see that the class has doubled its capacity, because we put 17 characters into it, one more than the original 16 could hold. This can be a bit of a nuisance, especially if you're working with a large string, exceed the size by 1, and then end up allocating twice the amount of memory you need.

You are, however, guaranteed to be able to use that extra space without penalty, unlike a traditional string which in the same circumstance may end up creating several duplications of the same large amount of memory before the Garbage Collector gets a chance to tidy things up for you.
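If you already have a rough idea of how big the final text will be, you can sidestep most of the resizing by passing a starting capacity to the constructor. The following is only a sketch: the figure of 1000 characters and the names PresizedBuilder and Create are assumptions of mine for illustration.

```csharp
using System;
using System.Text;

public static class PresizedBuilder
{
    // Creates a builder with room for 'expectedLength' characters up front.
    public static StringBuilder Create(int expectedLength)
    {
        return new StringBuilder(expectedLength);
    }

    public static void Main()
    {
        StringBuilder sb = Create(1000);
        Console.WriteLine("Capacity before appending: {0}", sb.Capacity);

        // Append 500 'x' characters; this fits in the space we asked for.
        sb.Append('x', 500);
        Console.WriteLine("Length after appending:   {0}", sb.Length);
        Console.WriteLine("Capacity after appending: {0}", sb.Capacity);
    }
}
```

As long as you stay under the capacity you asked for, the builder should have no reason to grow.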

The primary means of adding to a string builder is the "Append" method and, like many .NET base classes, it has a large number of overloads.

Figure 3: The overloads available for the Append method

Every basic data type you can think of has an overload, which means you don't have to first convert your data with "ToString" before being able to append it to your string builder instance, as the following code shows.

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {
         StringBuilder sb = new StringBuilder();

         bool myName = true;
         string theName = "Peter Shaw";
         int myAge = 21;

         sb.Append("It is ");
         sb.Append(myName);
         sb.Append(" that my name is ");
         sb.Append(theName);
         sb.Append(" and that I am ");
         sb.Append(myAge);
         sb.Append(" years of age :-)");

         Console.WriteLine(sb);

      }
   }
}

Which should give you the following:

Figure 4: String builder can append many different formats without conversion

As you can imagine, this simplifies a lot of code very quickly, and very efficiently removes the need to do lots of "ToString" calls. I'll admit, when it comes to building Database statements, this is worth its weight in gold.
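One more convenience worth knowing: each "Append" call returns the string builder itself, so the calls can be chained into a single expression. Here is a short sketch of the same sentence written that way; the AppendChaining class and its Describe method are names I made up for the example.

```csharp
using System;
using System.Text;

public static class AppendChaining
{
    // Builds the sentence in one chained expression.
    // Each Append returns the same StringBuilder instance.
    public static string Describe(bool isTrue, string name, int age)
    {
        return new StringBuilder()
            .Append("It is ")
            .Append(isTrue)
            .Append(" that my name is ")
            .Append(name)
            .Append(" and that I am ")
            .Append(age)
            .Append(" years of age :-)")
            .ToString();
    }

    public static void Main()
    {
        // prints: It is True that my name is Peter Shaw and that I am 21 years of age :-)
        Console.WriteLine(Describe(true, "Peter Shaw", 21));
    }
}
```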

Next up is "AppendFormat" and things get even easier now.

AppendFormat uses the familiar {0} {1} string notation to add placeholders to a string, allowing you to supply the values as extra arguments at the end of the call.

Our previous code, for example, can now be re-written as:

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {
         StringBuilder sb = new StringBuilder();

         bool myName = true;
         string theName = "Peter Shaw";
         int myAge = 21;

         sb.AppendFormat("It is {0} that my name is {1} and that " +
            "I am {2} years of age :-)", myName, theName, myAge);
         Console.WriteLine(sb);

      }
   }
}

I don't know about you, but reducing seven lines of code to one line definitely works for me.
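The placeholders also accept the usual .NET format specifiers, which is handy when numbers need a fixed layout. The sketch below is my own example rather than anything from the listings above: it passes CultureInfo.InvariantCulture so the decimal separator doesn't depend on the machine's regional settings, and the values 12.5 and 255 are arbitrary.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AppendFormatSpecifiers
{
    // Formats a total to two decimal places and a flag value
    // as four-digit upper-case hexadecimal.
    public static string Build(double total, int flags)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendFormat(CultureInfo.InvariantCulture,
            "Total: {0:F2} ({1:X4})", total, flags);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build(12.5, 255));  // prints Total: 12.50 (00FF)
    }
}
```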

The last "Append" function before we move on is "AppendLine" and that does exactly what it says on the box. It appends the given string, then adds a newline terminator, so

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {
         StringBuilder sb = new StringBuilder();

         sb.AppendLine("Hello");
         sb.AppendLine("World");

         Console.WriteLine(sb);

      }
   }
}

Will give you

Figure 5: AppendLine appends a new line character
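The terminator that "AppendLine" adds is the platform's default one, the same value you get from Environment.NewLine, and calling it with no argument appends just the terminator. A small sketch to confirm that (AppendLineDemo is a name I've picked for illustration):

```csharp
using System;
using System.Text;

public static class AppendLineDemo
{
    public static string Build()
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("Hello");
        sb.AppendLine();          // no argument: just the line terminator
        sb.AppendLine("World");
        return sb.ToString();
    }

    public static void Main()
    {
        string expected = "Hello" + Environment.NewLine
            + Environment.NewLine
            + "World" + Environment.NewLine;

        // True when AppendLine used Environment.NewLine as its terminator.
        Console.WriteLine(Build() == expected);
    }
}
```

Note that the text already ends with a newline, so printing it with Console.WriteLine adds one more blank line after it.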

If you want to erase your entire string and start again, use the "Clear" method:

sb.Clear();
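This makes it easy to reuse one string builder for several pieces of text instead of creating a new instance each time. A minimal sketch, with the class and method names being my own:

```csharp
using System;
using System.Text;

public static class ClearAndReuse
{
    // Builds two separate strings from one StringBuilder instance.
    public static string[] BuildTwo()
    {
        StringBuilder sb = new StringBuilder();

        sb.Append("first");
        string one = sb.ToString();   // take a copy before clearing

        sb.Clear();                   // Length is now 0
        sb.Append("second");
        string two = sb.ToString();

        return new[] { one, two };
    }

    public static void Main()
    {
        foreach (string s in BuildTwo())
        {
            Console.WriteLine(s);
        }
    }
}
```

Because "ToString" hands back its own copy of the text, the first string is unaffected when the builder is cleared.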

As well as appending, you also can insert values and strings at defined positions by using the "Insert" method, and "Insert," like "Append," has an overload for all the simple common data types.

We could have written our previous example like so:

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {
         StringBuilder sb = new StringBuilder();

         bool myName = true;
         string theName = "Peter Shaw";
         int myAge = 21;

         sb.Append("It is  that my name is  and that I " +
            "am  years of age :-)");

         sb.Insert(6, myName);
         sb.Insert(27, theName);
         sb.Insert(52, myAge);

         Console.WriteLine(sb);

      }
   }
}

Which gives this output:

Figure 6: Using raw insert calls

However, if you start playing with the offsets, you'll quickly find that it's very difficult to keep track of the ones you need to use. In the preceding example, I worked out the positions I needed before inserting any data; then, as soon as I inserted the "myName" Boolean variable, I had to go back and add 4 (the length of "True") to all of the remaining ones to account for the increased string size.
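One way to dodge the bookkeeping, offered here as a sketch rather than a recommendation, is to insert from right to left. An insert only shifts the text that comes after it, so if you start with the last position, the earlier offsets stay exactly as you first worked them out. The offsets 6, 23 and 38 below are the positions of the three gaps in the untouched template; the class and method names are mine.

```csharp
using System;
using System.Text;

public static class ReverseInsert
{
    public static string Build(bool isTrue, string name, int age)
    {
        StringBuilder sb = new StringBuilder(
            "It is  that my name is  and that I am  years of age :-)");

        // Work from the last gap to the first, so each insert
        // leaves the offsets to its left unchanged.
        sb.Insert(38, age);
        sb.Insert(23, name);
        sb.Insert(6, isTrue);

        return sb.ToString();
    }

    public static void Main()
    {
        // prints: It is True that my name is Peter Shaw and that I am 21 years of age :-)
        Console.WriteLine(Build(true, "Peter Shaw", 21));
    }
}
```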

Luckily, the string builder has your back covered here. If you add a marker to your string, similar to how the "AppendFormat" marker works, you easily can do the following:

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {
         StringBuilder sb = new StringBuilder();

         bool myName = true;
         string theName = "Peter Shaw";
         int myAge = 21;

         sb.Append("It is {0} that my name is {1} " +
            "and that I am {2} years of age :-)");

         sb.Replace("{0}", myName.ToString());
         sb.Replace("{1}", theName);
         sb.Replace("{2}", myAge.ToString());

         Console.WriteLine(sb);

      }
   }
}

Unfortunately, "Replace" only works with "string" and "char" arguments. This means that doing things this way dictates that you have to go back to the "ToString" method on most data types.
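Like "Append", "Replace" returns the string builder, so the three calls can be chained. The sketch below uses names of my own choosing and comes with one caveat: every occurrence of a marker is replaced, so a value that itself contains a later marker such as "{2}" would have that marker replaced too.

```csharp
using System;
using System.Text;

public static class ReplaceChaining
{
    public static string Build(bool isTrue, string name, int age)
    {
        return new StringBuilder(
                "It is {0} that my name is {1} and that I am {2} years of age :-)")
            .Replace("{0}", isTrue.ToString())
            .Replace("{1}", name)
            .Replace("{2}", age.ToString())
            .ToString();
    }

    public static void Main()
    {
        // prints: It is True that my name is Peter Shaw and that I am 21 years of age :-)
        Console.WriteLine(Build(true, "Peter Shaw", 21));
    }
}
```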

You can, however, iterate over the "Chars" property (in C#, this is the indexer, so you write sb[i]). This will allow you to create loop structures along the length of the string builder's contents, which you could in theory use to find the "index" of a given marker.

You then could use "Remove" on the string builder class, followed by "Insert" to insert the replacement string.
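As a rough sketch of that idea, the helper below walks the builder with its indexer to find a marker, then removes it and inserts the replacement. IndexOf and ReplaceFirst are hypothetical helpers I've written for this example; StringBuilder itself has no such methods, and this simple scan makes no attempt to be fast.

```csharp
using System;
using System.Text;

public static class BuilderSearch
{
    // Returns the position of the first occurrence of 'marker'
    // in the builder, or -1 if it is not present.
    public static int IndexOf(StringBuilder sb, string marker)
    {
        if (marker.Length == 0) return 0;

        for (int start = 0; start <= sb.Length - marker.Length; start++)
        {
            int matched = 0;
            while (matched < marker.Length &&
                   sb[start + matched] == marker[matched])
            {
                matched++;
            }
            if (matched == marker.Length) return start;
        }
        return -1;
    }

    // Swaps the first occurrence of 'marker' for 'value', if found.
    public static void ReplaceFirst(StringBuilder sb, string marker, string value)
    {
        int position = IndexOf(sb, marker);
        if (position < 0) return;

        sb.Remove(position, marker.Length);
        sb.Insert(position, value);
    }

    public static void Main()
    {
        StringBuilder sb = new StringBuilder("It is {0} that my name is {1}");
        ReplaceFirst(sb, "{0}", "True");
        ReplaceFirst(sb, "{1}", "Peter Shaw");
        Console.WriteLine(sb);  // prints It is True that my name is Peter Shaw
    }
}
```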

There is, however, an easier way.

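Calling "ToString" on the string builder gives you a standard string containing a copy of its current contents, with all the usual search methods available. Bear in mind that each call builds a new string, so it isn't free on very large builders. This means that you can easily execute the following: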

int markerOnePosition = sb.ToString().IndexOf("{0}");

markerOnePosition then would contain the offset of your marker, which you could use with the following code:

sb.Remove(markerOnePosition, 3);
sb.Insert(markerOnePosition, myName);

A full example might look as follows:

using System;
using System.Text;

namespace StringBuilderClass
{
   class Program
   {
      static void Main()
      {

         StringBuilder sb = new StringBuilder();

         bool myName = true;
         string theName = "Peter Shaw";
         int myAge = 21;

         sb.Append("It is {0} that my name is {1} " +
            "and that I am {2} years of age :-)");

         int markerOnePosition = sb.ToString().IndexOf("{0}");
         sb.Remove(markerOnePosition, 3);
         sb.Insert(markerOnePosition, myName);

         int markerTwoPosition = sb.ToString().IndexOf("{1}");
         sb.Remove(markerTwoPosition, 3);
         sb.Insert(markerTwoPosition, theName);

         int markerThreePosition = sb.ToString().IndexOf("{2}");
         sb.Remove(markerThreePosition, 3);
         sb.Insert(markerThreePosition, myAge);

         Console.WriteLine(sb);

      }
   }
}

Which, as you can see, produces exactly the same output as the previous example.

Figure 7: Using raw inserts with "remove" and "string.IndexOf" to get the best of both worlds

That's all for this month. Hopefully, you can see the StringBuilder is not just a replacement for the humble string, but is designed as a class to be used alongside the string class and make string manipulation far more powerful.


About the Author

Peter Shaw

As an early adopter of IT back in the late 1970s to early 1980s, I started out with a humble little 1KB Sinclair ZX81 home computer. Within a very short space of time, this small 1KB machine became a 16KB Tandy TRS-80, followed by an Acorn Electron and, eventually, after going through many other different machines, a 4MB, ARM-powered Acorn A5000. After leaving school and getting involved with DOS-based PCs, I went on to train in many different disciplines in the computer networking and communications industries. After returning to university in the mid-1990s and gaining a Bachelor of Science in Computing for Industry, I now run my own consulting business in the northeast of England called Digital Solutions Computer Software, Ltd. I advise clients at both a hardware and software level in many different IT disciplines, covering a wide range of domain-specific knowledge—from mobile communications and networks right through to geographic information systems and banking and finance.
