C++ Tutorial: The Do's and Don'ts of Accessing One Element Past the End of an Array

Introduction

At first glance, the only safe option is never to touch anything outside the valid boundaries of an array. However, there are cases in which you need the address of one element past the end of an array: when traversing an array, or in algorithms that manipulate a sequence of elements. Contrary to common belief, C++ does permit you to form the address of one element past the end of an array. However, you have to do so carefully, paying attention to several important restrictions.

Sorting an Array

Accessing the address of one element past the end of an array is required more often than you think. Suppose you have an array of integers that you want to sort using the std::sort() algorithm:

  #include <algorithm> // std::sort

  int arr[5] = {3, 4, 89, 7, 0};
  std::sort(arr, arr + 5); // arr+5 marks the end of the sequence

std::sort() requires two random access iterators (recall that pointers are perfectly valid iterators): one indicating the first element of a sequence, and another indicating the end of the sequence. Notice that the end of a sequence is not the last valid element of the array, i.e., arr[4]. Rather, it's the address of one element past the last valid element, namely arr+5. Any attempt to dereference arr+5 would result in undefined behavior. However, the address arr+5 itself is valid for certain purposes.
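
The same begin/end convention works for hand-written algorithms. The following is a minimal sketch, assuming a C++11 compiler (for std::begin, std::end and std::is_sorted); sum_range and sort_demo are hypothetical names introduced for illustration only. The end pointer is compared against but never dereferenced:

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical helper: sums the half-open range [first, last).
// 'last' is only compared against, never dereferenced.
int sum_range(const int* first, const int* last)
{
    int total = 0;
    for (const int* p = first; p != last; ++p)
        total += *p;
    return total;
}

// Hypothetical helper: sorts the article's array and checks the result,
// using one-past-the-end pointers as end markers throughout.
bool sort_demo()
{
    int arr[5] = {3, 4, 89, 7, 0};
    std::sort(arr, arr + 5); // arr + 5 is the end marker
    // std::end(arr) yields the same address as arr + 5.
    return std::is_sorted(std::begin(arr), std::end(arr))
        && sum_range(arr, arr + 5) == 103;
}
```

Passing a pair of pointers rather than a pointer and a count lets the same function work on any sub-range of the array, for example sum_range(arr + 1, arr + 3).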

Valid and Invalid Operations

Taking the address of one element past the end of an array is safe and permitted, so long as you don't use that address to read or write the data to which it points. You're also not allowed to increment that pointer any further. You may, however, decrement it, and you can use it in pointer comparison expressions, as in the following example:

  for (int *p = arr; p < arr + 5; p++)
    *p = 0; // clear the array

Or even like this:

  int n = 5;
  while (n) // a more cumbersome method of clearing the array
  {
    *(arr + 5 - n) = 0;
    n--;
  }

In contrast, dereferencing the expression arr+5 is undefined:

  if (*(arr + 5)) // undefined behavior
    x++;
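
The permitted operations on a one-past-the-end pointer (comparison, subtraction, and decrementing) can be sketched as follows. The helper names element_count, last_element and clear_backwards are hypothetical, chosen only for this illustration, and last_element assumes that the range is not empty:

```cpp
#include <cstddef>

// Hypothetical helper: subtracting the first pointer from the
// one-past-the-end pointer yields the number of elements.
std::ptrdiff_t element_count(const int* first, const int* last)
{
    return last - first;
}

// Hypothetical helper: returns the last element of [first, last).
// Assumption: the range is not empty (first != last).
int last_element(const int* first, const int* last)
{
    (void)first;         // unused; kept so the range is explicit
    const int* p = last; // one past the end: must not be dereferenced yet
    --p;                 // decrementing is allowed; p now points to a valid element
    return *p;
}

// Hypothetical helper: clears the range from back to front.
// The end pointer is decremented before every write.
void clear_backwards(int* first, int* last)
{
    while (last != first)
        *--last = 0;
}
```

With the article's array, these would be called as element_count(arr, arr + 5), last_element(arr, arr + 5) and clear_backwards(arr, arr + 5).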
 

Generally speaking, the Standard allows you to use arr+5 only as a pointer, never as the value to which it's pointing. Iterators of STL containers follow the same rule:

  std::vector<int> vi;
  vi.push_back(1);
  std::vector<int>::iterator it = vi.end();
  *(--it) = 8; // OK, assigns 8 to vi[0]
  ++it; // advance to one past the last valid element
  if (it == vi.end()) // OK, comparison
    std::cout << "you have reached the end of the vector" << std::endl;
  std::cout << *it << std::endl; // undefined behavior: dereferencing end()

Summary

Taking the address of one element past the end of an array is a valid operation under certain conditions. You can use that address only in pointer arithmetic expressions that yield valid elements of the array, and in comparisons. You're not allowed to dereference it, nor can you increment the pointer any further (forming arr+6, for instance, is undefined). Notice that STL containers follow this idiom. The end() member function returns an iterator pointing to one element past the last element of the container. You may use the iterator returned from end() only in comparisons and in expressions that access valid elements of the container.
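
A common application of this rule is checking the result of a search before using it. The sketch below assumes only the standard std::find algorithm; replace_first is a hypothetical helper name used for illustration:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: replaces the first occurrence of old_value
// with new_value and reports whether a replacement took place.
bool replace_first(std::vector<int>& v, int old_value, int new_value)
{
    std::vector<int>::iterator it = std::find(v.begin(), v.end(), old_value);
    if (it == v.end()) // comparing against end() is always allowed
        return false;  // not found: 'it' must not be dereferenced
    *it = new_value;   // safe: 'it' refers to a valid element
    return true;
}
```

std::find returns its end argument when nothing matches, so the comparison against v.end() is what separates the safe dereference from the undefined one.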



Comments

  • Clarifications

    Posted by dankalev on 03/23/2011 04:33pm

    First, I stand behind every word in the article. 
    As for the comment: "You cannot *take the address of* anything that is not yours. What you can do is create a pointer with any value, and you can *compare* the value of the address of the one-past element to legitimate pointers." I don't see how you can compare two pointers without creating them first, and creating a pointer to one past the last element is exactly what my article says you can do -- so long as you do not dereference it. 
    "&arr[end+1]"  is not my recommendation. Rather, arr+n is, where n is the number of elements that arr has. 
    As for vector versus array: my vector example is meant to illustrate the idiom; it doesn't mean that vectors are arrays or vice versa. Furthermore, even if you can think of vectors that are not implemented as arrays, sort(vec.begin(), vec.end()) should still work. This is a requirement of the C++ standard. Finally, I don't see any difference between C++0x and C99 with respect to built-in arrays. The relevant section in the FCD is 5.7-5, which I'll quote here for your convenience:
    "[...]if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
    This isn't very different from 6.5.6, paragraphs 8 and 9, in the C99 standard.

  • Reference?

    Posted by alanjhd08 on 03/23/2011 07:14am

    Hi, I found a reference to a quote from Bjarne Stroustrup: "In C++, pointers and arrays are closely related. The name of an array can be used as a pointer to its initial element. Taking a pointer to the element one beyond the end of an array is guaranteed to work. This is important for many algorithms...", from the book "The C++ Programming Language".
    
    I realise the article above includes some mentions of vector rather than array, but are you sure that taking the address of the one-past element in an array is UB in C++?

  • Incorrect statements

    Posted by Crazy Eddie on 03/14/2011 12:59pm

    "Taking the address of one element past the end of an array is safe and permitted, so long as you're not using that address to read or write to the data to which the address is pointing." This is a commonly held yet false belief. You cannot *take the address of* anything that is not yours. What you can do is create a pointer with any value, and you can *compare* the value of the address of the one-past element to legitimate pointers. The type of the one-past address is the same *as-if* there was a valid array element there. The construct "&arr[end+1]" is often incorrectly seen as valid because people think, as you do, that taking the address of the one-past element is OK. It is OK in C99, but not in C++. Normally it will behave rationally, but it is technically UB and therefore can go wrong at any time.
