Click to See Complete Forum and Search --> : Problem with timeout while waiting for a mutex


Fixion
July 26th, 2007, 11:32 AM
I have a program in which multiple threads write to a single multi-line edit box. To make sure that only one thread has access to the edit box I protected it with a mutex something like this:

// create mutex upon initialization like this
m_hEditMutex = CreateMutex(NULL, FALSE, NULL);

void WriteToEdit(CString Something)
{
WaitForSingleObject(m_hEditMutex, INFINITE);

// ... write to edit control

ReleaseMutex(m_hEditMutex);
}

I don't particularly care which thread gets the handle first when the function is called simultaniously.

The problem; once in a while (seemingly random), WaitForSingleObject(...) hangs forever, like the mutex was never released. But it only seems to do this when both threads are running and actively trying to write to the edit control.

I rewrote the code so that there is a finite timout like this:

void WriteToEdit(CString Something)
{
DWORD dMutexState = WaitForSingleObject(m_hEditMutex, 1000);
std::string error;
switch(dMutexState)
{
case WAIT_ABANDONED:
error = "WAIT_ABANDONED";
break;
case WAIT_OBJECT_0:
error = "WAIT_OBJECT_0";
break;
case WAIT_TIMEOUT:
error = "WAIT_TIMEOUT";
break;
}

// write to the edit control

ReleaseMutex(m_hEditMutex);
}

From the above code if found that the mutex was timing out once in a while (1000 ms should be plenty of time). How is this possible if the above code is the only place I request ownership of the mutex? Why would it hang forever if it was released by the previous thread?

Do you guys think I have a logic error somewhere? Should I even be using a mutex for this purpose? Is it possible that this is messing with an other instance of the same program?

I don't see the possibility of the function returning before the mutex is released, but I can post the full code if needed.

S_M_A
July 26th, 2007, 12:07 PM
A longshot... do you have exception handlers in code that call WriteToEdit? It might be so that occasionally you have some exception within 'write to edit control' code, in that case ReleaseMutex won't be called.

Fixion
July 26th, 2007, 03:00 PM
A longshot... do you have exception handlers in code that call WriteToEdit? It might be so that occasionally you have some exception within 'write to edit control' code, in that case ReleaseMutex won't be called.

No, I use ReplaceSel() to write to the end of t he edit control.

It seems that WaitForSingleObject() works 100% when there is only one thread calling WriteToEdit(). But if I frequently call the function from two different threads, the program will eventually hang at WaitForSingleObject(). If I only call it when the probability of two threads accessing WriteToEdit() are low, then it doesn't hang, presumably because only one thread was accessing the function at one time.

Fixion
July 26th, 2007, 03:11 PM
The previous simplified version of the function may be a bit vague, so here's the full function:

void CDataSpy_ClientDlg::AddToCEdit(CEdit &Edit, LPCTSTR Text, bool PreMutex /*== FALSE*/)
{
if(!PreMutex)
{
#ifndef DEBUG
WaitForSingleObject(m_mMsgEditMutex, INFINITE);
#else
DWORD dMutexState = WaitForSingleObject(m_mMsgEditMutex, 1000);
std::string error;
switch(dMutexState)
{
case WAIT_ABANDONED:
error = "WAIT_ABANDONED";
break;
case WAIT_OBJECT_0:
error = "WAIT_OBJECT_0";
break;
case WAIT_TIMEOUT:
error = "WAIT_TIMEOUT";
dMutexState = ReleaseMutex(m_mMsgEditMutex);
break;
}
#endif
}

CString strLine;
unsigned int nLength = Edit.GetWindowTextLength();
// if the text in the edit control is close to the limit, delete some from the end
if(nLength > (Edit.GetLimitText() - 1000))
{
Edit.SetSel(0, 1000, true);
Edit.ReplaceSel("");
}

// add CR/LF to text if it's not the first line
if(nLength > 0)
strLine.Format(_T("\r\n%s"), Text);
else
strLine.Format(_T("%s"), Text);

// get the text length again
nLength = Edit.GetWindowTextLength();
// put the selection at the end of text
Edit.SetSel(nLength, nLength, false);
// replace the selection
Edit.ReplaceSel(strLine);

if(!PreMutex)
{
ReleaseMutex(m_mMsgEditMutex);
}
}

Here is how I use it inside both threads:

AddToCEdit(m_ceMessageLog, "blah");


One workaround, though messy, is to use a timeout and continue even if it timed out. If possible, I would like to figure out why this is happening instead.

googler
July 26th, 2007, 03:28 PM
Here's a likely cause of such a deadlock - Your accesses to Edit use another synchronization object encapsulated inside the UI components in order to gain access to UI data structures that support the CEdit control. One of your threads gains access to the implicit UI synchronization object by doing some UI related operation, and without releasing it, and then goes on to wait on m_mMsgEditMutex. The other thread, which owns m_mMsgEditMutex, is stuck on one of the operations on Edit waiting for the UI synchronization object.

What I suggest you do to diagnose this is add the line
if (dMutexState == WAIT_TIMEOUT) __debugbreak(); after the wait for m_mMsgEditMutex. Your program will stop in the debugger when a timeout occurs. Switch the focus of the debugger to your other thread. Look at the stack trace on the other thread to see which line in AddToCEdit it's stuck on, which Win32 UI function it's calling deeper in the call stack, and which UI synchronization object it's stuck on. This may help you understand which UI operation that the timeout thread did caused it to lock a UI synchronization object without releasing it.

Fixion
July 26th, 2007, 04:00 PM
I tried that a few times, the majority of the time the program was stuck here when it timed out.

// winocc.cpp
int CWnd::GetWindowTextLength() const
{
ASSERT(::IsWindow(m_hWnd) || (m_pCtrlSite != NULL));

if (m_pCtrlSite == NULL)
>>>> return ::GetWindowTextLength(m_hWnd);
else
{
CString str;

m_pCtrlSite->GetWindowText(str);
return (int)str.GetLength();
}
}

The other times it was about here:

// ... cut ...
// get the text length again
nLength = Edit.GetWindowTextLength();
// put the selection at the end of text
Edit.SetSel(nLength, nLength, false);
// replace the selection
>>>> Edit.ReplaceSel(strLine);
or
>>>> if(!PreMutex)
{
ReleaseMutex(m_mMsgEditMutex);
}
// ... cut ...

I say "about" because it was in the disassembly. Which is marked with a green triangle at either of the points shown above in the source code.

It never stopped anywhere else.

googler
July 26th, 2007, 04:17 PM
Are you calling AddToCEdit from inside a Window Procedure? Because if you are, that's the cause of the deadlock for you right there....
If you're calling AddToCEdit from inside the WinProc, that means no UI messages are getting processed while you're waiting for m_mMsgEditMutex. The function GetWindowTextLength needs to SendMessage to the WinProc to get the info. This means it waits for the message to be processed and return a value. However, since no messages are getting processed, you have a dead lock.
The solution for this is not to call AddToCEdit from inside the WinProc.

Fixion
July 26th, 2007, 04:54 PM
Are you calling AddToCEdit from inside a Window Procedure?
Umm... yes. The two threads accessing AddToCEdit are; a worker thread, and the WindProc its self (which also happens to contain AddToCEdit).

I can only think of two solutions:
1. Start a worker thread for the function that calls AddToCEdit from inside WindProc.
2. Do this:

while( WaitForSingleObject(m_mMsgEditMutex, 100) == WAIT_TIMEOUT )
{
MSG Msg;
if(GetMessage(&Msg, NULL, 0, 0) > 0)
{
TranslateMessage(&Msg);
DispatchMessage(&Msg);
}
}

Unfortunatly the second seems like a dirty way to do it. Do you know of a better or simpler way?

googler
July 26th, 2007, 05:09 PM
Do you know of a better or simpler way?No. Personally I'd move all calls to AddToCEdit out of the WinProc and instead spawn off two worker threads to make those calls. I understand this is the more complex solution, but WinProcs shouldn't be executing waits inside them because they should always remain responsive to events - remember, they also handle user input and OS paint requests. Unresponsive WinProcs often hang the entire Windows UI because the shell is trying to communicate with them.

Are you sure the 2nd solution is even allowed? - I mean, is it permitted to execute a secondary message loop inside the WinProc? The primary message loop is outside the WinProc (usually in WinMain) and dispatches the WinProc itself. So you have nested WinProc calls, which means the WinProc at least needs to be reentrant.

Arjay
July 27th, 2007, 01:57 AM
I have a program in which multiple threads write to a single multi-line edit box. To make sure that only one thread has access to the edit box I protected it with a mutex something like this:
I don't particularly care which thread gets the handle first when the function is called simultaniously.

The problem; once in a while (seemingly random), WaitForSingleObject(...) hangs forever, like the mutex was never released. But it only seems to do this when both threads are running and actively trying to write to the edit control.

From the above code if found that the mutex was timing out once in a while (1000 ms should be plenty of time). How is this possible if the above code is the only place I request ownership of the mutex? Why would it hang forever if it was released by the previous thread?

Do you guys think I have a logic error somewhere? Should I even be using a mutex for this purpose? Is it possible that this is messing with an other instance of the same program?

I don't see the possibility of the function returning before the mutex is released, but I can post the full code if needed.Chiming in late here, but there are several issues that come to mind:
1) Using WaitForSingleObject in a UI thread will cause the UI thread to freeze during the wait. If you need to wait for an object in the UI thread use MsgWaitForMultipleObjects (search this forum for how to pump messages while waiting).
2) The use of a mutex from within a process. Mutexes are great when used from threads in different processes (or when you are waiting on multiple synchronization objects), but for simple synchronization cases they are more of a chore to use than a critical section and are slower. A good rule of thumb is when synchronizing threads within a process, use a cs unless you can't. (pretty profound huh?)
3) Locking/Unlocking explicitly. Whether you call WaitForSingleObject/ReleaseMutex for a mutex or EnterCriticalSection/LeaveCriticalSection for a critical section, these explicit calls are a potential source of error. Why, because as you have found out that no matter the code path, you always need to make sure that the lock on the object is released. Sometimes that can be difficult when function return early or methods throw exceptions. Ideally, we need something that guarantees an object gets released. Enter RAII (resource allocation is initialization - see wiki). Using the RAII approach, you just lock and let a destructor worry about the unlock. More about that later.
4) It's not good to manipulate MFC window controls from secondary threads.
5) I don't want to sound preachy here, but abstract the thread part of your code from the MFC dialog part of the code. You can put all the threading and data retrieval code into a class and supply a few public thread-safe methods like start, stop, GetString( ) that you call from the dialog class. This makes it easier to debug your code (because you don't have to wade through a bunch of UI code to decipher your threading code (and vice versa).
6) This doesn't have to do with threading, but the editbox isn't really a good choice to display your text. The reason being is that it's only good for about 65K characters. Consider using a rich edit control or a listview control (but a listview won't work too well if you need to copy text).

Okay enough about commenting. I've created a sample MFC app called SimpleThread. I uses a secondary thread to make fake log entries into a string shared between two threads. For synchronization I've used a small critical section wrapper and the RAII approach to locking/unlocking. All the relevant threading code is inside a CLogMgr class.

The locking code for the shared string looks like:

// Called by Secondary Thread
void SetLogString( LPCTSTR szLog )
{
// RAII type critical section lock. Used to protect the share string resource.
// (locks on the next line and releases when the method goes out of scope)
CAutoLockT< CLockableCS > lock( &m_csLock );
m_sLog = szLog;
}


I'm not going to show the thread proc here (see the attached code), but it uses WaitForMultipleObjects that waits on a shutdown event and uses the timeout to trigger the putting a log item into the shared string. To stop the thread, a shutdown event is set which causes the thread to exit.

When an item is added to the shared string, PostMessage is used to send a user defined message to the main UI thread. We do this so the UI thread knows the shared string has changed. The message handler for this user message is:


// Message received when a log item is available (sent by secondary thread)
LRESULT CSimpleThreadDlg::OnLogReceived( WPARAM, LPARAM )
{
// Append the log string
m_sLog += m_LogMgr.GetLogString( );
m_sLog += _T("\r\n");

// Transfer the data from the log string into the edit control
// NOTE: it may be better to use a listbox or listview control
// rather than copying the whole string into the edit box each time
UpdateData( FALSE );
return 1;
}


For thread control, there's two methods exposed by CLogMgr, Start( ) and Stop( ). So the user can start and stop the thread whenever desired. Because of the abstraction within the CLogMgr code, the handling of the start button (which also functions as the stop button) becomes quite clean:


void CSimpleThreadDlg::OnBnClickedStartstop( )
{
if( ToggleStartStopButton( ) )
{
// Start the thread
m_LogMgr.Start( GetSafeHwnd( ), m_lDelay );
}
else
{
// Stop the thread
m_LogMgr.Stop( );
}
}


This code is a reasonable approach to starting, stopping a thread and sharing data between two threads in a thread-safe manner. In addition to the shared string protected by the wrapped critical section object, the interlockedExchanged api is used to provide thread safety to a couple of long values that are also shared between threads.

googler
July 27th, 2007, 03:05 AM
Do you know of a better or simpler way?Here's another solution

get rid of the mutex
the WinProc can call AddToCEdit directly
create a user-defined window message. Have the WinProc call AddToCEdit when it gets this message.
The worker thread doesn't call AddToCEdit directly, but instead posts the user-defined message to the WinProc.
What's left is to marshal the parameters through the user-defined message to AddToCEdit. You can package them in a struct, allocate the struct dynamically and pass a pointer in WParam or LParam. Then have the WinProc deallocate the struct after calling AddToCEdit.

Fixion
July 27th, 2007, 09:48 AM
Wow, thanks for all the help guys! Thanks for the example Arjay!

Let me try your suggestions, I'll probably have some more questions in a little while.