90 Errors in Open-Source Projects

Abstract

There are actually 91 errors described in the article, but number 90 looks nicer in the title. The article is intended for C/C++ programmers, but developers working with other languages may also find it interesting.

The materials collected in this article will be useful for authors of articles, books and blogs. Examples of errors can be used to demonstrate advantages of different programming styles – for instance, why you should not try to make your code shorter by writing long expressions.

Examples of errors detected in various open-source projects

We regularly check known and little-known open-source projects. We do it with the purpose to get an opportunity to write a corresponding advertisement item and test the PVS-Studio analyzer on new code. Many readers ask if we tell projects’ authors about errors. Surely. And sometimes it happens that we get a new customer after that.

All the examples of detected errors are divided into several groups. This division is rather relative. One and the same error can be referred to misprints, vulnerabilities and incorrect array handling at a time. That’s why we have arranged the errors in different categories just to show you that the analyzer can detect a wide range of various defects.

We took only a few errors from each of the projects we have checked, of course. If we describe all the detected issues, the article will turn into a reference book. Here is a list of projects we have analyzed:

Figure 1. Logos of projects we have checked

Errors of array and string handling

Errors of array and string handling are the largest class of defects in C/C++ programs. This is the price for the capability of effective low-level memory handling available to programmers. In the article we will show just a small part of these errors found by the PVS-Studio analyzer. But we think any C/C++ programmer understands how numerous and insidious they are.

Example 1. Wolfenstein 3D project. Only part of an object is cleared.

void CG_RegisterItemVisuals( int itemNum ) {
  ...
  itemInfo_t *itemInfo;
  ...
  memset( itemInfo, 0, sizeof( &itemInfo ) );
  ...
}

The error was found through the V568 diagnostic: It’s odd that the argument of sizeof() operator is the ‘&itemInfo’ expression. cgame cg_weapons.c 1467.

The sizeof() operator calculates the size of the pointer instead of the ‘itemInfo_t’ structure’s size. It is “sizeof(*itemInfo)” that must be written.

Example 2. Wolfenstein 3D project. Only part of a matrix is cleared.

ID_INLINE mat3_t::mat3_t( float src[ 3 ][ 3 ] ) {
  memcpy( mat, src, sizeof( src ) );
}

The error was found through the V511: The sizeof() operator returns size of the pointer, and not of the array, in ‘sizeof(src)’ expression. Splines math_matrix.h 94

Usually programmers expect ‘sizeof(src)’ to return the size of an array equal to “3*3*sizeof(float)” bytes. But according to the language standard, ‘src’ is just a pointer, not an array. Thus, the matrix will be copied only partly. The ‘memcpy’ function will copy 4 or 8 bytes (the pointer size) depending on whether the code is 32-bit or 64-bit.

If you want the whole matrix to be copied, you may pass a reference to the array into the function. This is the correct code:

ID_INLINE mat3_t::mat3_t( float (&src)[3][3] )
{
  memcpy( mat, src, sizeof( src ) );
}

Example 3. FAR Manager project. Only part of an array is cleared.

struct TreeItem
{
  int *Last;
  size_t LastCount;
  ...
  void Clear()
  {
    strName.Clear();
    memset(Last, 0, sizeof(Last));
    Depth=0;
  }
};

The error was found through the V579: diagnostic The memset function receives the pointer and its size as arguments. It is probably a mistake. Inspect the third argument. far treelist.hpp 66

Most likely, there is a missing operation of multiplication by the number of items being cleared, and the code must look as follows: “memset(Last, 0, LastCount * sizeof(Last));”.

Example 4. ReactOS project. Incorrect calculation of a string length.

static const PCHAR Nv11Board = "NV11 (GeForce2) Board";
static const PCHAR Nv11Chip = "Chip Rev B2";
static const PCHAR Nv11Vendor = "NVidia Corporation";

BOOLEAN
IsVesaBiosOk(...)
{
  ...
  if (!(strncmp(Vendor, Nv11Vendor, sizeof(Nv11Vendor))) &&
      !(strncmp(Product, Nv11Board, sizeof(Nv11Board))) &&
      !(strncmp(Revision, Nv11Chip, sizeof(Nv11Chip))) &&
      (OemRevision == 0x311))
  ...
}

The error was found through the V579 diagnostic: The strncmp function receives the pointer and its size as arguments. It is probably a mistake. Inspect the third argument. vga vbe.c 57

Calls of the ‘strncmp’ function in this code compare only the first several characters, not whole strings. The error here is this: the sizeof() operator, absolutely inappropriate in this situation, is used to calculate string lengths. The sizeof() operator actually calculates the pointer size instead of the number of bytes in a string.

What is the most unpleasant and insidious about this error is that this code almost works as intended. In 99% of cases, comparison of the first several characters is enough. But the remaining 1% can bring you much fun and long debugging.

Example 5. VirtualDub project. Array overrun (explicit index).

struct ConvoluteFilterData {
 long m[9];
 long bias;
 void *dyna_func;
 DWORD dyna_size;
 DWORD dyna_old_protect;
 BOOL fClip;
};

static unsigned long __fastcall do_conv(
  unsigned long *data,
  const ConvoluteFilterData *cfd,
  long sflags, long pit)
{
  long rt0=cfd->m[9], gt0=cfd->m[9], bt0=cfd->m[9];
  ...
}

The error was found through the V557 diagnostic: Array overrun is possible. The ‘9’ index is pointing beyond array bound. VirtualDub f_convolute.cpp 73

This is one of the simplest errors causing an array overrun. Index 9 is used explicitly, though the last item’s index is 8. The author probably forgot, while writing this code, that array items in C/C++ are numbered starting with zero, not one. It happens when you have to switch between different programming languages.

Example 6. CPU Identifying Tool project. Array overrun (index in a macro).

#define FINDBUFFLEN 64  // Max buffer find/replace size
...
int WINAPI Sticky (...)
{
  ...
  static char findWhat[FINDBUFFLEN] = {'\0'};
  ...
  findWhat[FINDBUFFLEN] = '\0';
  ...
}

The error was found through the V557 diagnostic: Array overrun is possible. The ’64’ index is pointing beyond array bound. stickies stickies.cpp 7947

This error is a kind of the previous one. The terminal null is written outside the array. The correct code is: “findWhat[FINDBUFFLEN – 1] = ‘\0’;”.

Example 7. Wolfenstein 3D project. Array overrun (incorrect expression).

void BotTeamAI( bot_state_t *bs ) {
  ...
  bs->teamleader[sizeof( bs->teamleader )] = '\0';
  ...
}

The error was found through the V557 diagnostic: Array overrun is possible. The ‘sizeof (bs->teamleader)’ index is pointing beyond array bound. game ai_team.c 548

Here is one more example of an array overrun when using an explicitly declared index. These samples show that such simple at first sight errors are much more widely-spread than it may seem.

The terminal null is written outside the ‘teamleader’ array. This is the correct code:

bs->teamleader[sizeof( bs->teamleader ) - 1] = '\0';

Example 8. Miranda IM project. Only part of a string is copied.

typedef struct _textrangew
{
  CHARRANGE chrg;
  LPWSTR lpstrText;
} TEXTRANGEW;

const wchar_t* Utils::extractURLFromRichEdit(...)
{
  ...
  ::CopyMemory(tr.lpstrText, L"mailto:", 7);
  ...
}

The error was found through the V512 diagnostic: A call of the ‘memcpy’ function will lead to a buffer overflow or underflow. tabsrmm utils.cpp 1080

If Unicode-strings are used, one character occupies 2 or 4 bytes (depending on the data model being used in compiler) instead of one byte. Unfortunately, programmers easily forget about it, and you can often see defects like our example in programs.

The ‘CopyMemory’ function will copy only part of the L”mailto:” string since it handles bytes, not characters. You can fix the code by using a more appropriate function for string copying or, at least, multiplying number 7 by sizeof(wchar_t).

Example 9. CMake project. Array overrun inside a loop.

static const struct {
  DWORD   winerr;
  int     doserr;
} doserrors[] =
{
  ...
};

static void
la_dosmaperr(unsigned long e)
{
  ...
  for (i = 0; i < sizeof(doserrors); i++)
  {
    if (doserrors[i].winerr == e)
    {
      errno = doserrors[i].doserr;
      return;
    }
  }
  ...
}

The error was found through the V557 diagnostic: Array overrun is possible. The value of ‘i’ index could reach 367. cmlibarchive archive_windows.c 1140, 1142

The error handler itself contains an error. The sizeof() operator returns the array size in bytes and not the number of items inside it. As a result, the program will try to search much more items than it should in the loop. This is the correct loop:

for (i = 0; i < sizeof(doserrors) / sizeof(*doserrors); i++)

Example 10. CPU Identifying Tool project. A string is printed into itself.

char * OSDetection ()
{
  ...
  sprintf(szOperatingSystem,
          "%sversion %d.%d %s (Build %d)",
          szOperatingSystem,
          osvi.dwMajorVersion,
          osvi.dwMinorVersion,
          osvi.szCSDVersion,
          osvi.dwBuildNumber & 0xFFFF);
  ...
  sprintf (szOperatingSystem, "%s%s(Build %d)",
           szOperatingSystem, osvi.szCSDVersion,
           osvi.dwBuildNumber & 0xFFFF);
  ...
}

This error was found through the V541 diagnostic: It is dangerous to print the string ‘szOperatingSystem’ into itself. stickies camel.cpp 572, 603

An attempt of formatted printing of a string into itself can lead to bad consequences. The result of executing this code depends on the input data, and you cannot predict what will happen. Most likely, the result will be a meaningless string or an Access Violation will occur.

This error can be referred to the category “code vulnerabilities”. In some programs, by feeding special data to code, you can exploit such code fragments to cause a buffer overflow or other effects an intruder needs.

Example 11. FCE Ultra project. A string gets less memory than needed.

int FCEUI_SetCheat(...)
{
  ...
  if((t=(char *)realloc(next->name,strlen(name+1))))
  ...
}

The error was found through the V518 diagnostic: The ‘realloc’ function allocates strange amount of memory calculated by ‘strlen(expr)’. Perhaps the correct variant is ‘strlen(expr) + 1’. fceux cheat.cpp 609

This error is caused by a misprint. It is the ‘name’ pointer instead of the “name+1” expression that must be the argument of the strlen() function. As a result, the realloc function allocates 2 bytes less memory than needed: one byte is lost because 1 is not added to the string length; another byte is lost because the ‘strlen’ function calculates the string length skipping the first character.

Example 12. Notepad++ project. Partial array clearing.

#define CONT_MAP_MAX 50
int _iContMap[CONT_MAP_MAX];
...
DockingManager::DockingManager()
{
  ...
  memset(_iContMap, -1, CONT_MAP_MAX);
  ...
}

The error was found through the V512 diagnostic: A call of the memset function will lead to a buffer overflow or underflow. notepadPlus DockingManager.cpp 60

That’s one more example of how the number of array items is mixed up with an array size. A multiplication by sizeof(int) is missing.

We can go on and on showing you errors of array handling we have found in various programs. But we have to stop somewhere. Let it be 12, for number 13 is considered to be unlucky.

More by Author

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Must Read