Developing Fast Thread-Safe Code

Environment: VC 5/6, Windows 9x/NT

Introduction

One of the many great features of Windows 9x and Windows NT is support for multiple threads. The power of multithreaded programming comes with a responsibility for careful design. It's up to you to ensure that the CPU can switch between threads at any moment without ill effect. For example, you might have data (such as a linked list) that can be accessed by more than one thread. Your code needs to ensure that a thread switch at the wrong moment won't leave that data in an inconsistent state. You prevent such thread-switching problems by using synchronization objects.

In Win32, there are four types of synchronization objects. For synchronizing threads in the same or different processes, the Win32 API provides mutexes, events, and semaphores. The fourth type of synchronization object, critical sections, works only for threads within the same process. The problem I’m examining involves critical sections and an operating system behaviour.

Critical Section

When multiple threads have shared access to the same data, the threads can interfere with one another. A critical section object protects a section of code from being executed by more than one thread at a time. A critical section is limited, however, to a single process and cannot be shared with other processes. The advantage of critical sections is that they are less CPU intensive than the other synchronization methods. To use a critical section to guard a particular piece of code, a thread calls EnterCriticalSection (or TryEnterCriticalSection) and passes in the address of a CRITICAL_SECTION structure. At the end of the code to be guarded, the thread calls LeaveCriticalSection, passing in the same address. If another thread calls EnterCriticalSection with the same critical section object while the first thread still owns it, the second thread is blocked until the first thread calls LeaveCriticalSection.

A critical section can also protect more than one non-contiguous section of code, as long as all such sections are protected by the same critical section object. For example, sections of code that add, delete, or modify a particular data structure can be protected by the same critical section object, so that only one of these sections of code can access the data at a time.

You also need to initialize and destroy the critical section, but these two actions need occur only once per instance of the program. Incidentally, the CRITICAL_SECTION structure that you pass to the critical section functions has to be either a global variable or in an allocated memory block; it must outlive every thread that uses it. Never try to be smart like me and declare your CRITICAL_SECTION structure as a local variable within a function.

To initialize the CRITICAL_SECTION, use the InitializeCriticalSection API. After initialization, a thread that successfully calls EnterCriticalSection is said to own the critical section. The thread continues to own the critical section until it calls LeaveCriticalSection. If a second thread comes along and calls code that's guarded by the critical section, the second thread blocks inside the call to EnterCriticalSection. The only way for the second thread to continue executing is for the first thread to give up ownership by calling LeaveCriticalSection. In this way, critical sections ensure that only one thread at a time can execute through the guarded region of code. When the critical section is no longer needed, delete it with DeleteCriticalSection.

What Is Actually Happening

A critical section is a section of code that requires exclusive access to some set of shared data before it can be executed and that is used only by the threads within a single process. A critical section is like a turnstile through which only one thread at a time may pass, working as follows:

  1. To ensure that no more than one thread at a time accesses shared data, a process’s primary thread allocates a global CRITICAL_SECTION data structure and initializes its members. A thread entering a critical section calls the Win32 function EnterCriticalSection and modifies the data structure’s members.
  2. A thread attempting to enter a critical section calls EnterCriticalSection, which checks to see whether the CRITICAL_SECTION data structure has been modified. If so, another thread is currently in the critical section and the subsequent thread is put to sleep. A thread leaving a critical section calls LeaveCriticalSection, which resets the data structure. When a thread leaves a critical section, Microsoft Windows NT or Microsoft Windows 2000 wakes up one of the sleeping threads, which then enters the critical section.

Differences from Other Synchronization Objects

Critical sections are significantly faster than mutex kernel objects, or any other kernel object. Another important feature is that, in the uncontended case, critical sections rely on the same Interlocked family of functions that you can call yourself. The Interlocked functions are implemented entirely in user mode and do not require your thread to transition from user mode to kernel mode and back again. For this reason, entering an uncontended critical section typically requires only 10 or so CPU instructions. On the flip side, when you call WaitForSingleObject and the like, you force your thread to transition to kernel mode and back. This transition typically costs around 600 CPU instructions on an x86 processor. That's a huge difference!

You're right to want to use a critical section instead of a mutex to increase your performance. Critical sections are fast, as long as there is no contention for the shared resource. As soon as a thread attempts to enter a critical section owned by another thread, the critical section degrades to using an event kernel object, requiring approximately 600 CPU instructions to enter. Because contention is rare, entering a critical section usually takes the high-speed, 10-instruction path.

A critical section object performs exactly the same function as a mutex except that critical sections may not be shared. They are visible only within a single process. Critical sections and mutexes both allow only one thread to own them at a time, but critical sections work more quickly and involve less overhead.

The functions for working with critical sections do not use the same terminology as the functions for working with mutexes, but they do roughly the same things. Instead of creating a critical section, you initialize it. Instead of waiting for it, you enter it. Instead of releasing it, you leave it. Instead of closing its handle, you delete the object.

Programming Thread-Safe Functions

Programming a thread-safe function need never be a burden. Here is an example.

In one program, a 32-bit unsigned int must be updated by various threads, with the constraint that no two threads may access the variable simultaneously.

Declare one global variable of type CRITICAL_SECTION, along with the shared counter:

CRITICAL_SECTION _csIncrementLock;

unsigned int nCount;

Initialize the CRITICAL_SECTION at startup by using the Win32 API void InitializeCriticalSection(LPCRITICAL_SECTION lpCriticalSection):

::InitializeCriticalSection(&_csIncrementLock);

Then write the function that will increment the variable:

void InterLockIncrement(unsigned int *pnVal)
{
    ::EnterCriticalSection(&_csIncrementLock);
    ++(*pnVal);
    ::LeaveCriticalSection(&_csIncrementLock);
}

Now, you can call the function from anywhere in the program without worrying about multiple threads accessing the variable at once.


// In the thread function.

InterLockIncrement(&nCount);

Before process termination, delete the CRITICAL_SECTION using the Win32 API void DeleteCriticalSection(LPCRITICAL_SECTION lpCriticalSection):

//Delete before termination.

::DeleteCriticalSection(&_csIncrementLock);

Hidden Dragon

Now, let's turn our attention to deadlock, the dreaded black hole of multithreaded programming. A deadlock occurs when two or more threads each already own a synchronization object (such as a critical section) and each needs to acquire another synchronization object to continue executing. Deadlocks inspire much fear and loathing because they're usually timing-dependent and often notoriously difficult to reproduce consistently. They're usually the result of logic flaws in your programming, so traditional debugging tools aren't much help in tracking them down.

As part of learning how to use multiple threads correctly, we programmers are supposed to constantly be on the lookout for logic flaws in our code where a deadlock can occur. When reviewing your code, always try to imagine how a thread switch at the wrong moment could throw a monkey wrench into the works.
