Prefer std::string to char*

This article is intended for programmers who are starting out in C++ and have a background in C. It is true that C++ inherits most of C, but many C habits should be avoided: prefer new and delete to malloc and free, the C++ casts to the C ones, and STL containers to statically or dynamically allocated arrays. I will discuss a special case of the latter: preferring std::string to char*. I will take you through a list of problems with C-like character arrays and then show how much you benefit by avoiding them and using std::string.

Avoid Ill-Formed Declarations of char Arrays

When declaring a char array, many programmers do it like this:

char* name = "marius";

That might look okay at first glance, but then you might decide that the string should begin with a capital letter. The simplest way to implement it is:

name[0] = 'M';

The code might build fine but crash at run time, because modifying a string literal is undefined behaviour; what actually happens depends on the compiler and platform (with VS2005 it compiles fine but crashes at run time). The reason is that "marius" is a string literal, typically stored in a read-only data section of the program, and name is only a pointer to it. You are not allowed to change that data. Since C++11 the declaration above is not even valid C++, although many compilers still accept it with a warning. The correct declaration should have been:

const char* name = "marius";

In this case, an attempt to change a character is caught by the compiler, which reports an error because name points to constant data.

Nasty C-Like Approach

What you can do is use a char[] to define a fixed-length array of chars:

char name[] = "marius";
name[0] = 'M';

In this case, name will be an array of seven characters (including the null terminator), initialized with the string literal "marius". It is basically an in-memory copy of the literal, but one with read/write access.
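
As a minimal sketch of that point, the helper below (capitalized_copy is a hypothetical name, not something from a library) builds the array from the literal, checks its size, and modifies it. The modification is well defined because the array is a private copy of the literal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper for illustration only: copies "marius" into a local,
// writable array, capitalizes the first letter, and writes the result to out.
void capitalized_copy(char* out, std::size_t out_size)
{
   char name[] = "marius";             // 7 chars: 6 letters + '\0'
   assert(out_size >= sizeof(name));   // the caller's buffer must be big enough

   name[0] = 'M';                      // fine: name is a writable copy
   std::memcpy(out, name, sizeof(name));
}
```

Calling capitalized_copy with a 16-char buffer leaves "Marius" in that buffer; the literal itself is never touched.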

Now, imagine that you want to append the first name to the last name. You could use strcat():

char name[] = "marius";
strcat(name, " bancila");

But running this program is likely to crash, because strcat does not verify whether the destination buffer (name in this case, which holds exactly seven chars) is big enough to hold the appended string. You write beyond the array bounds and corrupt memory, which is undefined behaviour.

Of course, you can work around that by declaring a bigger array, just to make sure that it can hold all that you want. For instance, 50 chars should be enough to hold a name.

char name[50] = "marius";
strcat(name, " bancila");

That works. But you have to pray you'll not have to deal with names like Carlos Marìa Eduardo García de la Cal Fernàndez Leal Luna Delgado Galván Sanz (yes, that is a single Spanish name). You also have to consider the memory you might waste. If you declare each name as 100 chars and use 20 on average, a list of 100,000 names wastes 80 x 100,000 = 8,000,000 bytes of memory. This leads you to the next approach.
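
If you do stay with a fixed-size array, the append has to be guarded by hand. The following is a minimal sketch, assuming a hypothetical helper named append_checked; it refuses the append when the result would not fit instead of writing past the end of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends src to the null-terminated string in dest
// only if the result, including the terminator, fits in capacity chars.
// Returns true on success; on failure dest is left unchanged.
bool append_checked(char* dest, std::size_t capacity, const char* src)
{
   assert(dest != 0 && src != 0);

   const std::size_t used  = std::strlen(dest);
   const std::size_t extra = std::strlen(src);
   if (used + extra + 1 > capacity)
      return false;                            // would overflow the buffer

   std::memcpy(dest + used, src, extra + 1);   // copies the '\0' as well
   return true;
}
```

With char name[50] = "marius", the call append_checked(name, sizeof(name), " bancila") succeeds, whereas the same call on a seven-char array returns false. Every caller still has to remember to check the result, which is part of what makes this approach fragile.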

Allocating Memory Dynamically

The next step in finding the appropriate solution is to try allocating the memory dynamically.

char* name = new char[strlen("marius")+1];
strcpy(name, "marius");

In this case, you can re-allocate as much memory as you need. For example, to append the last name you allocate a larger buffer, copy both parts into it, and release the old one:

char* temp = new char[strlen(name) + strlen(" bancila") + 1];
strcpy(temp, name);
strcat(temp, " bancila");

delete [] name;
name = temp;

A lot of code needs to be written and maintained. And, things become even worse when you deal with classes (after all, I am talking about C++) and need to make deep copies.
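
To avoid repeating those steps at every call site, they could be wrapped in a function. This is only a sketch under stated assumptions: append_realloc is a hypothetical name, and the function assumes its first argument was allocated with new char[...].

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns a new[]-allocated string holding name
// followed by suffix, and releases the old buffer.
// Assumption: name was allocated with new char[...].
char* append_realloc(char* name, const char* suffix)
{
   assert(name != 0 && suffix != 0);

   char* temp = new char[std::strlen(name) + std::strlen(suffix) + 1];
   std::strcpy(temp, name);
   std::strcat(temp, suffix);

   delete [] name;   // the old buffer is no longer needed
   return temp;      // the caller owns the result and must delete [] it
}
```

The caller writes name = append_realloc(name, " bancila"); and is still responsible for the final delete []. The ownership rules live only in comments, so nothing stops a caller from passing a string literal or forgetting the cleanup.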

Ensure Correct Memory Handling in Classes

Consider the case of a Person class that needs to store the name of a person. Your first impulse is to write it like this:

class Person
{
   char* name;
};

At first glance it could be okay, but this class should provide:

  • A constructor that takes a string to initialize the name
  • A custom copy constructor to ensure a deep copy (the default copy constructor provided by the compiler makes a shallow copy; in other words, it copies the value of each member from one object to the other, which copies the pointer, not the array it points to, so two objects would end up deleting the same buffer)
  • A custom operator=
  • A destructor to clean up the dynamically allocated memory

After putting all that together, the Person class would look like this:

class Person
{
   char* name;
public:
   Person(const char* str)
   {
      name = new char [strlen(str)+1];
      strcpy(name, str);
   }

   Person(const Person& p)
   {
      // deep copy: allocate a separate buffer and copy the characters
      name = new char [strlen(p.name)+1];
      strcpy(name, p.name);
   }

   Person& operator=(const Person& p)
   {
      if(this != &p)
      {
         delete [] name;

         name = new char [strlen(p.name)+1];
         strcpy(name, p.name);
      }

      return *this;
   }

   ~Person()
   {
      delete [] name;
   }
};
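
To see what the deep copy buys you, here is a self-contained sketch. NameBuffer is a hypothetical class that mirrors the Person class above; the get and set_first members are additions made only so the copies can be inspected, and the assignment operator allocates the new buffer before releasing the old one so that the object stays valid if new throws.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class for illustration; it mirrors Person and adds two
// small members so that the result of copying can be observed.
class NameBuffer
{
   char* name;

   // allocates a separate buffer holding a copy of str
   static char* duplicate(const char* str)
   {
      char* copy = new char[std::strlen(str) + 1];
      std::strcpy(copy, str);
      return copy;
   }
public:
   NameBuffer(const char* str) : name(duplicate(str)) {}
   NameBuffer(const NameBuffer& other) : name(duplicate(other.name)) {}

   NameBuffer& operator=(const NameBuffer& other)
   {
      if (this != &other)
      {
         char* copy = duplicate(other.name);   // allocate first
         delete [] name;                       // then release the old buffer
         name = copy;
      }
      return *this;
   }

   ~NameBuffer() { delete [] name; }

   const char* get() const { return name; }

   void set_first(char ch)
   {
      assert(name[0] != '\0');   // there must be a first character
      name[0] = ch;
   }
};
```

If b is copy-constructed from a NameBuffer holding "marius" and b.set_first('M') is called, a still holds "marius" and the two objects report different buffer addresses. With a shallow copy both would share one buffer, and both destructors would try to delete it.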

std::string Saves!

The C++ Standard Library (often loosely called the STL) offers a class called std::string (a specialization of std::basic_string), which is a container that enables the use of strings as normal types, offering support for operations such as comparison and concatenation, iteration, STL algorithms, copying, and assignment. The class is defined in the header <string>.

The advantages offered by the use of std::string include:

  • Easy assignment, copying, or concatenation
    std::string name = "marius";  // initialization with assignment
    name += " bancila";           // concatenation
    std::string copy = name;      // copying
    
  • Determination of the length by using one of the length() or size() methods. The two methods are identical; the second is provided for consistency with the other STL containers.
    std::string name = "marius";
    std::cout << "length=" << name.length() << std::endl;
    std::cout << "length=" << name.size()   << std::endl;
    
  • Check for emptiness
    std::string name;
    if(name.empty())
       std::cout << "empty string";
    
  • Support for comparison
    if(name == "marius")
    {
    }
    
    if(name.compare("marius") == 0)
    {
    }
    

    The compare method performs a case-sensitive comparison with a specified string to determine whether the two strings are equal or whether one is lexicographically less than the other. Its return value has the same meaning as for strcmp(): a negative value indicates that the string the method is called on is less than the argument, a positive value indicates that it is greater, and 0 indicates that they are equal. Moreover, the overloads of the function allow the comparison of only parts of the string.

    if(name.compare(0, 3, "mar") == 0)
    {
       std::cout << "match";
    }
    
  • Overloaded operators << and >> to write and read strings to and from streams
    std::string name;
    std::cin  >> name;    // read name from console
    std::cout << name;    // write name in console
    
  • Easy access to the characters of the string
    std::string name = "marius";
    name[0] = 'M';
    name[name.length()-1] = 'S';
    
  • Iteration over the characters; this can be done by using C-like indexing or STL iterators (const iterators should be used if the string is not modified during iteration)
    std::string name = "marius";
    
    for(size_t i = 0; i < name.length(); ++i)
       std::cout << name[i];
    
    for(std::string::const_iterator cit = name.begin();
        cit != name.end(); ++cit)
       std::cout << *cit;
    
    for(std::string::iterator it = name.begin();
        it != name.end(); ++it)
       *it = toupper(*it);
    
  • Erasing parts of a string
    std::string name = "marius bancila";
    // remove everything after the 6th element
    name.erase(6, name.length() - 6);
    
  • Inserting strings or characters at specified positions
    std::string name = "marius";
    // insert at the end
    name.insert(name.length(), " bancila");
    name.insert(name.length(), 3, '!');
    
  • Inserting elements at the end of the string
    std::string name = "marius";
    name.push_back('!');
    
  • Fast exchange of two string values
    std::string firstname = "bancila";
    std::string lastname = "marius";
    firstname.swap(lastname);
    
    std::cout << firstname << ' ' << lastname << std::endl;
    
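  • const access to the internal char array buffer with the c_str() method, and non-const access through the address of the first character (the storage is guaranteed to be contiguous since C++11); this lets you use a std::string object with functions that take pointers to char (const or not):
    void print(const char* name)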
    {
       std::cout << name << std::endl;
    }
    
    std::string name = "marius";
    print(name.c_str());
    
    void makeupper(char* array, int len)
    {
       for(int i = 0; i < len; ++i)
          array[i] = toupper(array[i]);
    }
    
    std::string name = "marius";
    makeupper(&name[0], name.length());
    
  • Use of STL algorithms
    std::string name = "marius";
    // make string upper
    std::transform(name.begin(), name.end(), name.begin(),
                   toupper);
    
    std::string name = "marius";
    // sort the characters ascending
    std::sort(name.begin(), name.end());
    
    std::string name = "marius";
    // reverse the string
    std::reverse(name.begin(), name.end());
    
    bool iswhitespace(char ch)
    {
       return  ch == ' ' || ch == '\t' || ch == '\v' ||
               ch == '\r' || ch == '\n';
    }
    
    std::string name = " marius  ";
    // remove white spaces
    std::string::iterator newend = std::remove_if(name.begin(),
       name.end(), iswhitespace);
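    // remove_if only moves the kept characters to the front;
    // erase everything from the new logical end to the real end
    name.erase(newend, name.end());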
    
  • Building strings by the use of std::stringstream, available from the header <sstream> (stringstream is not the focus of this article, so I'll only show you a simple example).
    std::stringstream strbuilder;
    strbuilder << "1 + 1 = " << 1+1;
    std::string str = strbuilder.str();
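
Two of the items in the list above are easy to get subtly wrong, so here is a minimal sketch of both. Removing whitespace relies on what is commonly called the erase-remove idiom: std::remove_if only moves the kept characters to the front and returns the new logical end, so the range from there to end() still has to be erased. The helper names is_whitespace, strip_whitespace, and compare_sign are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical predicate: true for the whitespace characters of interest
bool is_whitespace(char ch)
{
   return ch == ' ' || ch == '\t' || ch == '\v' ||
          ch == '\r' || ch == '\n';
}

// Hypothetical helper: returns a copy of text with all whitespace removed
std::string strip_whitespace(std::string text)
{
   // erase-remove idiom: erase the whole tail left behind by remove_if
   text.erase(std::remove_if(text.begin(), text.end(), is_whitespace),
              text.end());

   // sanity check: no whitespace character is left
   assert(std::find_if(text.begin(), text.end(), is_whitespace) == text.end());
   return text;
}

// Hypothetical helper: reduces the result of compare() to -1, 0, or 1,
// because only the sign of the return value is meaningful
int compare_sign(const std::string& a, const std::string& b)
{
   const int result = a.compare(b);
   return (result > 0) - (result < 0);
}
```

For example, strip_whitespace(" marius  ") yields "marius", and compare_sign("abc", "abd") yields -1. Passing only the iterator returned by remove_if to erase would remove a single character and leave the rest of the tail in place.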
    

Go back to the example with the Person class. If you replace the char* with a std::string, all you have to do is provide a constructor. The compiler takes care of the rest: the copy constructor, assignment operator, and destructor it generates work member by member. This is enough because copying the name member invokes the copy constructor or operator= of std::string, which correctly copies the characters, and the destructor of std::string releases the memory.

class Person
{
   std::string name;
public:
   // initialize the member directly in the initializer list
   Person(const std::string& str) : name(str)
   {
   }
};

Person p1("marius"); 
// works because std::string has a constructor that takes a const
// char*

Person p2("bancila");
p1 = p2;
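
A minimal sketch that makes the effect of those compiler-generated operations visible follows. PersonWithString, get_name, append, and demo_copy are hypothetical names added only so the copies can be inspected.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of Person with an accessor and a mutator added;
// copy constructor, operator= and destructor are compiler-generated
class PersonWithString
{
   std::string name;
public:
   PersonWithString(const std::string& str) : name(str) {}

   const std::string& get_name() const { return name; }
   void append(const std::string& suffix) { name += suffix; }
};

// Shows that copies are independent even though no copy code was written
void demo_copy()
{
   PersonWithString p1("marius");
   PersonWithString p2 = p1;        // compiler-generated copy constructor
   p2.append(" bancila");

   assert(p1.get_name() == "marius");           // p1 is unaffected
   assert(p2.get_name() == "marius bancila");

   p1 = p2;                         // compiler-generated operator=
   assert(p1.get_name() == "marius bancila");
}
```

No destructor, copy constructor, or assignment operator appears in the class, yet each object owns its own characters.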

Conclusion

This article is not documentation or a tutorial on the use of std::string, but a plea for its use. Using std::string instead of C-like arrays makes your code cleaner, more natural, and easier to read and maintain. You don't have to care about dynamic allocation of memory because everything is done inside the string class. You can leave aside unnecessary details (such as memory handling) and concentrate on the important aspects of programming. Just go for it!



About the Author

Marius Bancila

Marius Bancila is a Microsoft MVP for VC++. He works as a software developer for a Norwegian-based company. He is mainly focused on building desktop applications with MFC and VC#. He keeps a blog at www.mariusbancila.ro/blog, focused on Windows programming. He is the co-founder of codexpert.ro, a community for Romanian C++/VC++ programmers.
