Copy Constructors and Assignment Operators


Copy Constructors and Assignment Operators: Just Tell Me the Rules! Part I

I get asked this question sometimes from seasoned programmers who are new to C++. There are plenty of good books written on the subject, but I found no clear and concise set of rules on the Internet for those who don't want to understand every nuance of the language—and just want the facts.

Hence this article.

The purpose of copy constructors and assignment operators is easy to understand when you realize that they're always there even if you don't write them, and that they have a default behavior that you probably already understand. Every struct and class has a default copy constructor and a default assignment operator. Look at a simple use of these.

Start with a struct called Rect with a few fields:

struct Rect {
   int top, left, bottom, right;
};

Yes, even a struct as simple as this has a copy constructor and assignment operator. Now, look at this code:

1: Rect r1 = { 0, 0, 100, 200 };
2: Rect r2( r1 );
3: Rect r3;
4: r3 = r1;

Line 2 invokes the default copy constructor for r2, copying into it the members from r1. Line 4 does something similar, but invokes the default assignment operator of r3, copying into it the members from r1. The difference between the two is that the copy constructor of the target is invoked when the source object is passed in at the time the target is constructed, such as in line 2. The assignment operator is invoked when the target object already exists, such as on line 4, where r3 was already constructed on line 3.
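
One way to see which of the two gets invoked is to count the calls. The sketch below is only an illustration: Probe, its two counters, and runProbe are hypothetical names that do not appear in the article. It also shows that the form "Probe c = a;" is a construction, so it uses the copy constructor even though it is written with an equals sign.

```cpp
// Hypothetical type that counts how often each copy operation runs.
struct Probe {
   static int copies;    // number of copy-constructor calls
   static int assigns;   // number of assignment-operator calls

   Probe() {}
   Probe( const Probe& ) { ++copies; }
   Probe& operator=( const Probe& ) { ++assigns; return *this; }
};

int Probe::copies = 0;
int Probe::assigns = 0;

void runProbe() {
   Probe a;
   Probe b( a );   // target is being constructed: copy constructor
   Probe c = a;    // still a construction: copy constructor
   Probe d;        // d now exists...
   d = a;          // ...so this is the assignment operator
   (void)b;        // silence unused-variable warnings
   (void)c;
}
```

After a single call to runProbe(), Probe::copies should be 2 and Probe::assigns should be 1.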

Looking at what the default implementation produces, examine what Line 4 ends up doing:

r3.top    = r1.top;
r3.left   = r1.left;
r3.bottom = r1.bottom;
r3.right  = r1.right;
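
To check that claim, the following sketch compares the compiler-generated copies with a hand-written member-by-member copy. The helper names sameRect, assignByHand, and demoRectCopies are made up for this illustration.

```cpp
struct Rect {
   int top, left, bottom, right;
};

// Hypothetical helper: true when every field of a matches b.
bool sameRect( const Rect& a, const Rect& b ) {
   return a.top == b.top && a.left == b.left &&
          a.bottom == b.bottom && a.right == b.right;
}

// Hand-written version of what the default assignment does for Rect.
void assignByHand( Rect& target, const Rect& source ) {
   target.top    = source.top;
   target.left   = source.left;
   target.bottom = source.bottom;
   target.right  = source.right;
}

bool demoRectCopies() {
   Rect r1 = { 0, 0, 100, 200 };
   Rect r2( r1 );               // default copy constructor
   Rect r3 = { 0, 0, 0, 0 };
   r3 = r1;                     // default assignment operator
   Rect r4 = { 0, 0, 0, 0 };
   assignByHand( r4, r1 );      // manual member-by-member copy

   r1.top = 5;                  // the copies are independent of r1
   return r2.top == 0 && sameRect( r2, r3 ) && sameRect( r3, r4 );
}
```

For a struct of plain integers like this one, all three copies hold the same values, which is why nobody writes these methods by hand for such types.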

So, if the default copy constructor and assignment operators do all this for you, why would anyone implement their own? The problem with the default implementations is that a simple copy of the members may not be appropriate to clone an object. For instance, what if one of the members were a pointer that is allocated by the class? Simply copying the pointer isn't enough because now you'll have two objects that have the same pointer value, and both objects will try to free the memory associated with that pointer when they destruct. Look at an example class:

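#include <string.h>   // strlen, strcpy

class Contact {
   char* name;
   int age;
public:
   Contact( const char* inName, int inAge ) {
      // Allocate room for the characters plus the terminating null.
      name = new char[strlen( inName ) + 1];
      strcpy( name, inName );
      age = inAge;
   }
   ~Contact() {
      delete[] name;
   }
};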

Now, look at some code that uses this class:

Contact c1("Fred", 40 );
Contact c2 = c1;

The problem is, c1 and c2 will have the same pointer value for the "name" field. When c2 goes out of scope, its destructor will get called and delete the memory that was allocated when c1 was constructed (because the name field of both objects has the same pointer value). Then, when c1 destructs, it will attempt to delete the same pointer value, and a "double-free" will occur. At best, the heap will catch the problem and report an error. At worst, the same pointer value may, by then, be allocated to another object, the delete will free the wrong memory, and this will introduce a difficult-to-find bug in the code.
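
The sharing can be observed safely with a stand-in type that has no destructor, so nothing is freed twice. ShallowContact and sharesBuffer are hypothetical names used only for this sketch; the real Contact class above would crash or corrupt the heap if copied this way.

```cpp
#include <cstring>

// Hypothetical stand-in for Contact WITHOUT a destructor, so the
// shared pointer can be inspected without triggering a double free.
struct ShallowContact {
   char* name;
   int age;
};

bool sharesBuffer() {
   char* buffer = new char[5];
   std::strcpy( buffer, "Fred" );

   ShallowContact c1 = { buffer, 40 };
   ShallowContact c2 = c1;    // default copy: copies the pointer only

   bool samePointer = ( c1.name == c2.name );

   c2.name[0] = 'T';          // a write through c2...
   bool visibleInC1 = ( std::strcmp( c1.name, "Tred" ) == 0 );   // ...shows up in c1

   delete[] buffer;           // released exactly once, by hand
   return samePointer && visibleInC1;
}
```

Both objects see the same characters because there is only one buffer. Add a destructor that deletes name, as Contact does, and that single buffer gets deleted twice.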

The way you want to solve this is by adding an explicit copy constructor and an assignment operator to the class, like so:

Contact( const Contact& rhs ) {
   name = new char[strlen( rhs.name ) + 1];
   strcpy( name, rhs.name );
   age = rhs.age;
}

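Contact& operator=( const Contact& rhs ) {
   // Do everything that can throw first: allocate and fill the new buffer.
   char* tempName = new char[strlen( rhs.name ) + 1];
   strcpy( tempName, rhs.name );
   // Nothing below can throw. Copying before the delete also keeps
   // self-assignment (c1 = c1) safe.
   delete[] name;
   name = tempName;
   age = rhs.age;
   return *this;
}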

Now, the code that uses the class will function properly. Note that the difference between the copy constructor and assignment operator above is that the copy constructor can assume that fields of the object have not been set yet (because the object is just being constructed). However, the assignment operator must handle the case when the fields already have valid values. The assignment operator deletes the contents of the existing string before assigning the new string. You might ask why the tempName local variable is used, and why the code isn't written as follows instead:

delete[] name;
name = new char[strlen( rhs.name ) + 1];
strcpy( name, rhs.name );
age = rhs.age;

The problem with this code is that if the new operator throws an exception, the object will be left in a bad state because the name field would have already been freed by the previous instruction. By performing all the operations that could fail first and then replacing the fields once there's no chance of an exception occurring, the code is exception safe.
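
Putting the pieces together, a complete version of the class might look like the sketch below. The accessors getName and getAge and the function demoContactCopies are additions for this illustration and are not part of the article's class. The assignment operator here fills the temporary buffer before deleting the old one, which is an assumption about the safest ordering because it also tolerates an object being assigned to itself.

```cpp
#include <cstring>

class Contact {
   char* name;
   int age;
public:
   Contact( const char* inName, int inAge )
      : name( new char[std::strlen( inName ) + 1] ), age( inAge ) {
      std::strcpy( name, inName );
   }

   // Copy constructor: the object is brand new, so there is nothing to free.
   Contact( const Contact& rhs )
      : name( new char[std::strlen( rhs.name ) + 1] ), age( rhs.age ) {
      std::strcpy( name, rhs.name );
   }

   // Assignment: allocate and copy first, then release the old buffer.
   Contact& operator=( const Contact& rhs ) {
      char* tempName = new char[std::strlen( rhs.name ) + 1];
      std::strcpy( tempName, rhs.name );
      delete[] name;
      name = tempName;
      age = rhs.age;
      return *this;
   }

   ~Contact() { delete[] name; }

   // Hypothetical accessors, added so the copies can be inspected.
   const char* getName() const { return name; }
   int getAge() const { return age; }
};

bool demoContactCopies() {
   Contact c1( "Fred", 40 );
   Contact c2 = c1;             // copy constructor
   Contact c3( "Wilma", 38 );
   c3 = c1;                     // assignment operator

   Contact& alias = c3;
   c3 = alias;                  // self-assignment must leave c3 intact

   return c2.getName() != c1.getName()              // separate buffers
       && c3.getName() != c1.getName()
       && std::strcmp( c2.getName(), "Fred" ) == 0  // same contents
       && std::strcmp( c3.getName(), "Fred" ) == 0
       && c2.getAge() == 40 && c3.getAge() == 40;
}
```

Each object now owns its own buffer, so each destructor frees memory that no other object refers to.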

Note: The reason the assignment operator returns a reference to the object is so that code such as the following will work:

c1 = c2 = c3;
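
A small sketch of why the reference matters: the chained form is parsed as c1 = ( c2 = c3 ), so whatever the inner assignment returns becomes the right-hand side of the outer one. Celsius and demoChaining are hypothetical names chosen for this illustration.

```cpp
// Hypothetical one-field type with an explicit assignment operator.
struct Celsius {
   double degrees;

   explicit Celsius( double inDegrees ) : degrees( inDegrees ) {}

   Celsius& operator=( const Celsius& rhs ) {
      degrees = rhs.degrees;
      return *this;   // returning the target is what allows chaining
   }
};

bool demoChaining() {
   Celsius c1( 1.0 ), c2( 2.0 ), c3( 3.0 );

   c1 = c2 = c3;                    // evaluated as c1 = ( c2 = c3 )

   Celsius& result = ( c1 = c2 );   // the expression refers to c1 itself
   return c1.degrees == 3.0 && c2.degrees == 3.0 && &result == &c1;
}
```

If operator= returned void instead, the inner assignment would produce nothing for the outer one to consume, and the chained statement would not compile.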

One might think that an explicit copy constructor and assignment operator are necessary whenever a class or struct contains pointer fields. This is not the case. In the example above, the explicit methods were needed because the data pointed to by the field is owned by the object. If the pointer is a "back" (or weak) pointer, or a reference to some other object that the class is not responsible for releasing, it may be perfectly valid to have more than one object share the value in a pointer field.
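
As an illustration of that non-owning case, the sketch below uses two hypothetical types, Department and Employee, that are not from the article. Employee holds a back pointer to a Department it does not own and never deletes, so the compiler-generated copy operations are exactly what is wanted.

```cpp
// Hypothetical object that lives independently of the employees pointing at it.
struct Department {
   int code;
};

class Employee {
   const Department* dept;   // back pointer: NOT owned, never deleted here
   int id;
public:
   Employee( const Department* inDept, int inId )
      : dept( inDept ), id( inId ) {}

   const Department* department() const { return dept; }
   int getId() const { return id; }

   // No copy constructor, assignment operator, or destructor is written:
   // the defaults copy the pointer, and sharing it is the intent.
};

bool demoSharedBackPointer() {
   Department sales = { 7 };
   Employee e1( &sales, 1 );
   Employee e2 = e1;          // default copy constructor
   Employee e3( 0, 3 );
   e3 = e1;                   // default assignment operator

   return e2.department() == &sales && e3.department() == &sales
       && e3.getId() == 1;
}
```

Nothing is freed when the employees go out of scope, so there is no double free to guard against. The one thing to watch is that the Department must outlive every Employee that points to it.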

There are times when a class field actually refers to some entity that cannot be copied, or it does not make sense to be copied. For instance, what if the field were a handle to a file that it created? It's possible that copying the object might require that another file be created that has its own handle. But, it's also possible that more than one file cannot be created for the given object. In this case, there cannot be a valid copy constructor or assignment operator for the class. As you have seen earlier, simply not implementing them does not mean that they won't exist, because the compiler supplies the default versions when explicit versions aren't specified. The solution is to provide copy constructors and assignment operators in the class and mark them as private. As long as no code tries to copy the object, everything will work fine, but as soon as code is introduced that attempts to copy the object, the compiler will indicate an error that the copy constructor or assignment operator cannot be accessed.

To create a private copy constructor and assignment operator, one does not need to supply an implementation for the methods. Simply declaring them in the class definition is enough. Example:

private:
Contact( const Contact& rhs );
Contact& operator=( const Contact& rhs );
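
Here is a sketch of the technique applied to a handle-owning class. FileOwner and its integer handle are hypothetical stand-ins for a real file wrapper. The check at the end uses the <type_traits> header, which assumes a C++11 or later compiler; it is only there to confirm, without writing code that fails to compile, that outside code cannot copy the class.

```cpp
#include <type_traits>

class FileOwner {
   int handle;   // stand-in for an operating-system file handle
public:
   explicit FileOwner( int inHandle ) : handle( inHandle ) {}
   int getHandle() const { return handle; }

private:
   // Declared but deliberately never defined: copying is not supported.
   FileOwner( const FileOwner& rhs );
   FileOwner& operator=( const FileOwner& rhs );
};

// Uncommenting either line below should produce a compile error, because
// the copy operations are inaccessible from outside the class:
//    FileOwner copy( original );
//    target = original;

bool copyingIsBlocked() {
   return !std::is_copy_constructible<FileOwner>::value
       && !std::is_copy_assignable<FileOwner>::value;
}
```

With a C++11 or later compiler, the same intent can be stated directly by writing "= delete" after each of the two declarations, in which case they no longer need to be private.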

Some people wish that C++ did not provide an implicit copy constructor and assignment operator when a class doesn't define them. They declare a private copy constructor and assignment operator whenever they define a new class. That way, nobody can copy their objects unless they explicitly support such an operation. This is generally a good practice.



About the Author

Kenneth Kasajian


Comments

  • Some improvements: Check for this pointer and reduce duplicated code

    Posted by DaMagic on 09/11/2007 02:39pm

    class Contact 
    {
    private:
       char* name;
       int age;
    
    public:
       Contact( const char* inName, int inAge );
       
       Contact( const Contact& rhs );
       Contact& operator=( const Contact& rhs );
       
       virtual ~Contact(void);
    
    protected:
       // Made protected to enable derived classes to call 
       // MemberwiseClone of this base class in their own 
       // MemberwiseClone method.
       void MemberwiseClone( const Contact& rhs );
    };
    
    
    inline Contact::Contact( const Contact& rhs )
    {
       MemberwiseClone(rhs);
    }
    
    inline Contact& Contact::operator=( const Contact& rhs )
    {
       // Check if given argument is not current object,
       // otherwise returns without any changes.
       if (this != &rhs)
       {
          MemberwiseClone(rhs);
       }
    
       return *this;
    }
    
    inline void Contact::MemberwiseClone( const Contact& rhs )
    {
       ASSERT(this != &rhs);
    
       delete[] name; // Also works if name is already null.
       name = 0;
    
       name = new char[strlen( rhs.name ) + 1];
       strcpy( name, rhs.name );
    
       age = rhs.age;
    }
    
    Derived classes only clone their own additional class fields.
    Sample extract:
    
    inline ContactDerived::ContactDerived( const ContactDerived& rhs ) : Contact( rhs )
    {
       MemberwiseClone(rhs);
    }
    
    inline ContactDerived& ContactDerived::operator=( const ContactDerived& rhs )
    {
       if (this != &rhs)
       {
          // First clone data inherit from base class.
          Contact::MemberwiseClone(rhs);
    
          // Next clone own additional class fields.
          MemberwiseClone(rhs);
       }
    
       return *this;
    }

    • Some annotations

      Posted by DaMagic on 09/15/2007 09:51am

      Correct, I forgot the initialization of the class fields in the ctor, especially setting the 'name' pointer to zero. But this is a minor point. It's not good to make MemberwiseClone virtual, because this results in virtual functions calling virtual functions, which is not good coding practice. In addition, the copy-ctor cannot (or should not) call the copy-ctor of its base class within the initialization list. This is what PC-Lint, for instance, claims if you check your code against MISRA rules. I use the pattern (with a protected MemberwiseClone() method which clones only its own class fields) in every case where I need an assignment operator and copy-ctor. The pattern is very close to the one C# uses. Look at the implementation of the 'object' base class in C#. The difference is that if you want a cloneable object in C# you have to derive from the ICloneable interface and implement the Clone() method. But this is what Copy() or operator=() means in C++. Otherwise you only have a shallow copy. The performance advantage of MemberwiseClone() is that for the copy-ctor the copy-ctor of the base class is called in the initialization list (similar to inline). operator=() also calls MemberwiseClone(). But modern compilers inline functions even when they are not explicitly declared inline. If not, you can declare MemberwiseClone() as inline, which is possible here because MemberwiseClone() is not virtual.

    • Thanks, but I'm not sure I want to take this

      Posted by kasajian on 09/12/2007 08:38pm

      I am not sure I want to take this code. There are a couple of problems. 1. I think the way it's written, it won't work, because when the copy constructor calls MemberwiseClone, the object's data members haven't been initialized yet, so "delete[] name" operates on a "name" variable that will likely hold garbage data. You can fix that by clearing the data members in the copy constructor before calling MemberwiseClone. However, there's a problem with that, too -- see next item. Also, you have to make MemberwiseClone virtual. 2. Introducing this MemberwiseClone is fine for reducing maintenance cost, and I'd likely do that myself in a lot of my code, but as a general example for C++, I don't agree with it. There's an actual reason why the copy constructor and assignment operator both exist: performance. When coding the copy constructor, you know for sure that you're working with a new object, so you don't have to worry about previous data values in the fields. With the assignment operator, you do. By factoring out the common code, you're introducing a performance penalty. That penalty may not be significant in most applications, but it was significant enough for the designers of C++ to introduce both of those constructs. Otherwise, they would have just introduced a syntax element that has the "MemberwiseClone" semantics. I prefer the simplicity of the existing code, too -- I find that it better explains things to someone who doesn't quite get the nuances of how to properly code these two constructs.

  • Typos in the class Contact dtor?

    Posted by VictorN on 08/30/2007 05:49am

    What is the Contact() method in the following example:
    
    class Contact {
       char* name;
       int age;
    public:
       Contact( const char* inName, inAge ) {
       name = new char[strlen( inName ) + 1];
       strcpy( name, inName );
       age = inAge;
       }
       Contact()  {
          Delete[] name;
       }
    };
    
    Is it the default constructor? 
    Or is it a destructor written without the "tilde"?
    It seems to be the latter.
    And what is the "Delete [] name"? Shouldn't it be:
    
      ~Contact() {
          delete[] name;
       }

    • Typos

      Posted by kasajian on 08/30/2007 08:22am

      Yes, both of those are Typos. They're the result of the code being copied into a word-processor before publishing. A corrected version is currently pending review and will soon be published.
