
Worker thread vs Idle-loop performance question


GeorgeCochran
April 16th, 2004, 01:48 AM
I do some consulting work in which I write custom stand-alone dialog-based apps in VC6 of the following style: The program opens with a modal dialog box in which the user enters some parameters. Then the program starts a lengthy series of mathematical calculations that takes hours and sometimes days to finish. During these calculations I put a modeless dialog box on the screen to show the user the progress in the calculation, an estimated time to completion, and two buttons that allow the user to stop the program (one to save the data to resume later, the other to just cancel the run). The progress dialog class uses a timer to update itself every 3 seconds with data from the calculation.

For example, the last such project that I finished required 8 days to run on a 3GHz P4 desktop. The point is that I’m very interested in getting the calculations running as fast as possible.

When I first started writing these programs, I naively wrote one containing the two dialog boxes and a procedure for the math, and then I let it rip. The result was a frozen computer since the progress dialog (and any other app running on the machine) didn’t get to process any messages since the math procedure was hogging all the processor time. So after studying the MS no-help files, I’ve tried three different modifications of this basic model to keep from freezing the computer.

1. A console app using MFC in which I don’t use the console. Apparently console apps generated by the AppWizard come with a built-in message loop that allows others time on the processor. The drawback is that there’s a black window on the screen that I don’t use.

2. Making the math calculation into a worker thread, and letting the main thread do the dialog boxes. By trial and error I’ve discovered that using THREAD_PRIORITY_ABOVE_NORMAL results in a frozen computer, and using a NORMAL priority results in a very sluggish feeling computer. So I’ve been setting the thread priority at BELOW_NORMAL. Also, there are a small number of variables in the app class that the two threads use to exchange data about their states (the worker thread gets access to them through a pointer that was passed to it); I’ve not worried about possible clashes with this and so far it hasn’t been a problem.

3. Instead of multi-threading, I’ve put a message loop in the math procedure that is called after each series of calculations (once in each iteration of a long for() loop), that basically implements something like idle-loop processing. I’m using the code below, which I copied out of a MS help file. The original version in the help file did not run, and I discovered by trial and error that commenting out the lines marked resulted in something that seems to work correctly.

void CMyApp::IdleLoop()
{
    // BOOL bDoingBackgroundProcessing = TRUE; (commented out)

    // while ( bDoingBackgroundProcessing ) (commented out)
    {
        MSG msg;
        // dispatch every message currently waiting in the queue
        while ( ::PeekMessage( &msg, NULL, 0, 0, PM_NOREMOVE ) )
        {
            if ( !PumpMessage( ) )   // PumpMessage returns FALSE on WM_QUIT
            {
                // bDoingBackgroundProcessing = FALSE; (commented out)
                // AfxPostQuitMessage(exitcode); (commented out)
                break;
            }
        }
        // let MFC do its idle processing
        LONG lIdle = 0;
        while ( AfxGetApp()->OnIdle( lIdle++ ) )
            ;
    }
}


So here’s my question: Which of these three models would result in the fastest code? Presumably the multi-threading model would require the processor to continually change threads, and that overhead must take some time. The Idle-Loop model has the overhead of calling the idle-loop procedure and waiting for all other messages to be processed before getting back to the math calculation. The console model presumably has some overhead too, but I don’t understand the guts of that very well. In fact, I don’t really understand the guts of worker threads and message loops either, and I don’t want to invest time in learning about what I view as administrative trivia. I want to focus my time on programming the procedure that does the math and not worry about all these multi-tasking details. Suggestions?

Andreas Masur
April 16th, 2004, 04:58 AM
...how many calculations are running at the same time? If the processing is sequential....then I would simply use one worker thread which does the calculations and updates the GUI accordingly...

simpleman
April 16th, 2004, 05:04 AM
Hi GeorgeCochran

i think one worker thread model for math calculation is the best idea.

because the main job of your application is math calculation which means CPU base job, so it does not have CPU idle time. if then i think trying to check the idle time using Idle-Loop is useless task.

main dialog job
- update the progress using SetTimer()/OnTimer()
- process the button event

i think it is the most simple and best performance for your application. ^^

good luck...

Andreas Masur
April 16th, 2004, 05:37 AM
Originally posted by simpleman
because the main job of your application is math calculation which means CPU base job, so it does not have CPU idle time. if then i think trying to check the idle time using Idle-Loop is useless task.

Well...using idle loop means that the calculations are being done while the application is idle...

Originally posted by simpleman
main dialog job
- update the progress using SetTimer()/OnTimer()
- process the button event

That would not be the best way of doing it...how would you know inside the timer handler how much of the work has been processed? The workerthread should be responsible for updating the progress control since the thread is the only part which knows the actual percentage/time of the calculations...

Furthermore...what button event?

GeorgeCochran
April 16th, 2004, 03:44 PM
>>>...how many calculations are running at the same time? If there is a sequential processing....

The structure of the math calculations is iterative and strictly sequential. Typically there are several nested loops, each loop is iterated many thousands of times, and the body of the innermost loop is perhaps 100-150 lines of C-code. The results from each pass through the inner loop are used in the next pass. The inner loop code could not realistically be vectorized.

>>>...how would you know inside the timer handler how much of the work has been processed? The workerthread should be responsible for updating the progress control since the thread is the only part which knows the actual percentage/time of the calculations...

Well the way I've been doing it (which is probably a bad design) is: The app class has two member data structures, one of which holds the initial parameters selected by the user in the opening modal dialog, the other holds information about the current state of the long math calculation. When the worker thread is started, I pass to the worker function a pointer to the global app object; it uses the pointer to read the initial starting parameters, and then after each pass through one of the loops it will write some progress data to the app object. The modeless dialog class, meanwhile, starts a timer in its constructor that rings every 3 seconds, and it has an OnTimer handler which sneaks a peek at the app's data before calling UpdateData to repaint the dialog. The modeless dlg is maintained by the main thread. I don't bother with trying to synchronize the two threads' access to the app's data.

>>>Furthermore...what button event?

The modeless dialog has two buttons to allow the user to stop the run. The button handlers set a boolean variable called "run" in the app class, which the worker thread will peek at when it is sending progress data to the app object. Since this occurs only at the end of one of the loops, it is easy for the thread to exit there if it finds the variable set to FALSE.
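The polling design described above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: SharedState, RunCalculation, and the toy inner loop are hypothetical names that do not come from the posts, and std::atomic needs a C++11 compiler (VC6 predates it; there, the Win32 Interlocked functions or a critical section would play the same role). The point is that making the stop flag and the progress counter atomic gives the unsynchronized reads and writes well-defined behaviour at very little cost, because they are touched only once per outer pass.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical shared state; in the design above it lives in the app class.
struct SharedState {
    std::atomic<bool> run{true};               // cleared by a stop/cancel button handler
    std::atomic<std::uint64_t> passesDone{0};  // written by the worker, read by the timer handler
};

// Worker body. It receives a pointer to the shared state, checks the stop
// flag once per outer pass, and publishes its progress after each pass.
// Returns the number of outer passes actually completed.
std::uint64_t RunCalculation(SharedState* state, std::uint64_t outerPasses, double* result)
{
    assert(state != nullptr && result != nullptr);
    double x = 0.0;
    std::uint64_t done = 0;
    while (done < outerPasses && state->run.load()) {
        for (int i = 0; i < 1000; ++i)   // toy stand-in for the real inner loop
            x = 0.5 * x + 1.0;
        ++done;
        state->passesDone.store(done);   // the UI thread may read this at any time
    }
    *result = x;
    return done;
}
```

In a real program the function would run on the worker thread, while a button handler calls state.run.store(false) and the 3-second timer handler reads state.passesDone.load() to refresh the dialog.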

My main question is whether the worker thread model would run faster than the idle-loop or console model.

Andreas Masur
April 16th, 2004, 04:00 PM
Originally posted by GeorgeCochran
The structure of the math calculations is iterative and strictly sequential. Typically there are several nested loops, each loop is iterated many thousands of times, and the body of the innermost loop is perhaps 100-150 lines of C-code. The results from each pass through the inner loop are used in the next pass. The inner loop code could not realistically be vectorized.

Well...this makes a workerthread perfectly suitable...

Originally posted by GeorgeCochran
Well the way I've been doing it (which is probably a bad design) is: ...

Well...okay...looking at it from a more object-oriented point of view, you should pass the initial parameter set to the thread (by using a dynamically allocated structure). Inside the thread you then dynamically allocate the current state structure, fill it and send a user-defined message back to the main thread (GUI) - passing the current state information (using a pointer). The message handler inside the main thread then updates the progress control accordingly and deletes the passed pointer to the current state structure. At the end of the workerthread (or as soon as it will stop) you delete the passed initial parameter set.
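The ownership hand-off described in the paragraph above might look roughly like this. It is a sketch, not the MFC implementation: ProgressQueue is a hypothetical stand-in for the GUI thread's message queue (a real MFC app would post a user-defined message carrying the pointer), Params, Progress, and Worker are made-up names, and std::unique_ptr replaces the explicit new/delete pairs, which assumes a C++11 compiler rather than VC6.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>

// Hypothetical types; none of these names come from the posts.
struct Params   { int outerPasses; };                  // initial parameter set
struct Progress { int passesDone; int passesTotal; };  // one progress snapshot

// Stand-in for the GUI thread's message queue.
class ProgressQueue {
public:
    // Called by the worker: hands ownership of the snapshot to the queue.
    void Post(std::unique_ptr<Progress> p) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push_back(std::move(p));
    }
    // Called by the GUI side: returns nullptr when nothing is waiting.
    std::unique_ptr<Progress> TryGet() {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return nullptr;
        std::unique_ptr<Progress> p = std::move(q_.front());
        q_.pop_front();
        return p;
    }
private:
    std::mutex m_;
    std::deque<std::unique_ptr<Progress>> q_;
};

// Worker body: takes ownership of the parameters, posts one freshly
// allocated snapshot per outer pass, and frees the parameters on return.
void Worker(std::unique_ptr<Params> params, ProgressQueue& gui) {
    assert(params != nullptr);
    for (int pass = 1; pass <= params->outerPasses; ++pass) {
        // ... the real calculation for this pass would go here ...
        gui.Post(std::unique_ptr<Progress>(new Progress{pass, params->outerPasses}));
    }
}   // params is destroyed here, matching "delete the passed initial parameter set"
```

The receiving side calls TryGet(), updates the progress control from the snapshot, and lets the unique_ptr go out of scope, which corresponds to the message handler deleting the passed pointer. Because each snapshot has exactly one owner at a time, the two threads never read and write the same structure concurrently.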

Originally posted by GeorgeCochran
My main question is whether the worker thread model would run faster than the idle-loop or console model.
Well...to provide a definite answer, one would need to profile the code and actually see whether one is faster than the other. However, 'OnIdle()' does some more things (cleaning temporary handles etc.) which are not important to your calculations, thus, I would go with the workerthread...
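As a rough way to do the measurement suggested above without a full profiler, one could time a short, representative run under each model and compare the averages. The helper below is a minimal sketch using only the standard library; AverageSeconds and SampleCalculation are hypothetical names, and the toy loop merely stands in for a shortened version of the real calculation.

```cpp
#include <cassert>
#include <chrono>

// Runs work() `repeats` times and returns the average wall-clock seconds per run.
template <typename Work>
double AverageSeconds(Work work, int repeats)
{
    assert(repeats > 0);
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < repeats; ++i)
        work();
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count() / repeats;
}

// Hypothetical stand-in for one short, representative run of the calculation.
double SampleCalculation(int iterations)
{
    double x = 0.0;
    for (int i = 0; i < iterations; ++i)
        x = 0.5 * x + 1.0;
    return x;
}
```

Wrapping the same shortened run once with the message pump called in the loop and once inside a worker thread, then comparing the two averages, would give a direct answer for a particular machine. Wall-clock time is used deliberately here, since it is the total run time that matters for a job lasting days.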

MikeAThon
April 16th, 2004, 09:20 PM
The worker thread approach is possibly the cleanest. Theoretically, however, you should know that on a single-processor machine, the creation of a worker thread will always result in a slower overall process, because of the overhead involved in thread-switching and the like.

For an example of a single-threaded approach which is similar to your current option (3) but without reliance on OnIdle, please read my article "Lengthy Operations Without Multiple Threads" (http://www.codeproject.com/threads/TemplatedLengthyOperation.asp)

-Mike

CJ1
April 20th, 2004, 04:26 PM
You are giving conflicting requirements: Fast Code; Responsive Machine.

- For fast code, hog the CPU, put the entire app as high priority, sleep the thread for the display when not used and optimize your inner loops.

- For a responsive machine, processing during idle time is kind to the user but your program runs slow. For normal office use PCs, you'll still get 80-90% of the CPU time this way.

Best bet is a dual (or more) CPU machine. They are cheap and you can run your worker thread as time-critical (hogging one CPU) without impacting the overall performance (compared to a single CPU).

In any case, spend time optimizing the inner loops to really get the best performance in all cases!
