How to Make Fewer Errors at the Code Writing Stage, Part 1

Abstract

I’ve arrived at the source code of the widely known instant messenger, Miranda IM. Together with various plugins, this is a rather large project whose size is about 950 thousand code lines in C and C++. Like any other considerable project with a long development history, it has many errors and misprints.

Introduction

Check Miranda IM C++ code

While examining defects in various applications, I often notice irregularities. With the examples of defects found in Miranda IM, I will try to formulate some recommendations that will help you to avoid many errors and misprints already at the stage of code writing.

I used the PVS-Studio 4.14 analyzer to check Miranda IM. The Miranda IM project’s code is of a high quality and its popularity just confirms this fact. I am using this messenger myself and do not have any complaints about its quality. The project is built in

Visual Studio with the Warning Level 3 (/W3) while the amount of comments makes 20% of the whole program’s source.

1. Avoid Functions Memset, Memcpy, ZeroMemory, etc.

I will start with errors that occur when using low-level functions to handle memory such as memset, memcpy, ZeroMemory, etc.

I recommend that you avoid these functions by all means. Sure, you do not have to follow this tip literally and replace all these functions with loops, but I have seen so many errors related to using these functions that I strongly advise you to be very careful with them and use them only when it is really necessary. In my opinion, there are only two cases when using these functions is grounded:

1) Processing of large arrays, i.e. in those places where you can really benefit from an optimized function algorithm, as compared to simple looping.

2) Processing a large number of small arrays. The reason for this case also lies in performance gain.

In all the other cases, you’d better try to do without them. For instance, I believe that these functions are unnecessary in such a program as Miranda. There are no resource-intensive algorithms or large arrays in it. So, using functions memset/memcpy is determined only by the convenience of writing short code. But this simplicity is very deceptive and having saved a couple of seconds while writing the code, you will spend weeks to catch this elusive memory corruption error. Let’s examine several code samples taken from the Miranda IM project.

V512 A call of the ‘memcpy’ function will lead to a buffer overflow or underflow. tabsrmm utils.cpp 1080

typedef struct _textrangew
{
  CHARRANGE chrg;
  LPWSTR lpstrText;
} TEXTRANGEW;

const wchar_t* Utils::extractURLFromRichEdit(...)
{
  ...
  ::CopyMemory(tr.lpstrText, L"mailto:", 7);
  ...
}

Only a part of the string is copied here. The error is simple, yet it remains. Most likely, there was a string earlier consisting of ‘char’. Then they switched to Unicode strings but forgot to change the constant.

If you copy strings using functions that are designed for this purpose, this error can never occur. Imagine that this code sample was written this way:

strncpy(tr.lpstrText, "mailto:", 7);

Then the programmer would not have to change number 7 when switching to Unicode strings:

wcsncpy(tr.lpstrText, L"mailto:", 7);

I am not saying that this code is ideal. But it is much better than using CopyMemory. Consider another sample.

V568 It’s odd that the argument of sizeof() operator is the ‘& ImgIndex’ expression. clist_modern modern_extraimage.cpp 302

void ExtraImage_SetAllExtraIcons(HWND hwndList,HANDLE hContact)
{
  ...
  char *(ImgIndex[64]);
  ...
  memset(&ImgIndex,0,sizeof(&ImgIndex));
  ...
}

The programmer intended to empty the array consisting of 64 pointers here. But instead, only the first item will be emptied. The same error, by the way, can be also found in another file, thanks to our favorite Copy-Paste:

V568 It’s odd that the argument of sizeof() operator is the ‘& ImgIndex’ expression. clist_mw extraimage.c 295

The correct code must look this way:

memset(&ImgIndex,0,sizeof(ImgIndex));

By the way, taking the address from the array might additionally confuse the one who is reading the code. Taking of the address here is unreasonable and the code may be rewritten this way:

memset(ImgIndex,0,sizeof(ImgIndex));

The next sample.

V568 It’s odd that the argument of sizeof() operator is the ‘& rowOptTA’ expression. clist_modern modern_rowtemplateopt.cpp 258

static ROWCELL* rowOptTA[100];

void rowOptAddContainer(HWND htree, HTREEITEM hti)
{
  ...
  ZeroMemory(rowOptTA,sizeof(&rowOptTA));
  ...
}

Again, it is the pointer’s size that is calculated instead of the array’s size. The correct expression is “sizeof(rowOptTA)”. I suggest using the following code to clear the array:

const size_t ArraySize = 100;
static ROWCELL* rowOptTA[ArraySize];
...
std::fill(rowOptTA, rowOptTA + ArraySize, nullptr);

I got used to meeting such lines which populate the code through the copy-paste method:

V568 It’s odd that the argument of sizeof() operator is the ‘& rowOptTA’ expression. clist_modern modern_rowtemplateopt.cpp 308

V568 It’s odd that the argument of sizeof() operator is the ‘& rowOptTA’ expression. clist_modern modern_rowtemplateopt.cpp 438

You think it’s all about low-level handling of arrays? No, not quite. Read further, fear and punish those who like to use memset.

V512 A call of the ‘memset’ function will lead to a buffer overflow or underflow. clist_modern modern_image_array.cpp 59

static BOOL ImageArray_Alloc(LP_IMAGE_ARRAY_DATA iad, int size)
{
  ...
  memset(&iad->nodes[iad->nodes_allocated_size],
    (size_grow - iad->nodes_allocated_size) *
       sizeof(IMAGE_ARRAY_DATA_NODE),
    0);
  ...
}

This time, the size of copied data is calculated correctly, but the second and third arguments are swapped by mistake. Consequently, 0 items are filled. This is the correct code:

memset(&iad->nodes[iad->nodes_allocated_size], 0,
  (size_grow - iad->nodes_allocated_size) *
     sizeof(IMAGE_ARRAY_DATA_NODE));

I do not know how to rewrite this code fragment in a smarter way. To be more exact, you cannot make it smart without touching other fragments and data structures.

A question arises on how to do without memset when handling such structures as OPENFILENAME:

OPENFILENAME x;
memset(&x, 0, sizeof(x));

It’s very simple. Create an emptied structure using this method:

OPENFILENAME x = { 0 };

More by Author

Must Read