Building the Right Environment to Support AI, Machine Learning and Deep Learning
When developing applications, most developers tend to think linearly through the logical steps needed to complete a task. While sequential thinking leads to working applications that are relatively easy to understand, such single-threaded applications are not able to benefit from the multiple cores in today's processors.
Before the multi-core era began, Intel and AMD launched faster processors each year, with ever-increasing clock speeds. Effectively, this meant that the same application code simply ran faster on each processor generation-a real-world case of a free lunch.
However, limitations to current processor technology mean that the fastest clock speeds are limited to around 3 GHz. However, manufacturers still need to come up with faster and faster processors to match the demand. Because raising clock speed is (currently) out of question, the only way to increase performance significantly is to increase the number of cores or execution units in the chips. These multiple-core processor are then able to execute instructions in parallel, thus providing more speed. Today's two- and four-core processors are only the beginning; in the future, 16, 32, and 64 core systems will be commonly available.
But unlike with increasing clock speed, as vendors add multiple cores, your application will not automatically run faster if you just sit on your laurels. The free lunch is over. Because most .NET applications are single-threaded by default (although they may use additional threads for such things as database connection pools), your application code will still run on a single core. For example, if you run a single-threaded .NET application on a PC with a quad-core processor, it will run on one core while the three other cores sit idle.
Surely, a quad-core processor is still able to run multiple applications faster compared to a traditional single-core processor. To some degree, that's true, because the Windows task scheduler can assign different processes to run on different cores. (The same thing would happen if you had multiple processors with a single core each.)
However, to be able to take full use of the multiple cores that are in even mainstream PCs these days, you need to make your application use more than one thread. That way, the operating system can schedule your application's threads into multiple cores for simultaneous execution. You need two separate skills to do this: one is the ability to identify possibilities where threading can help improve performance, the other is implementing that behavior.
Speaking of implementation, introducing multiple threads into an application is often easier said than done. In fact, using threads properly has been one programming's most difficult tasks-until now. Although .NET has provided threading support since .version 1.0, using the Thread class and the low-level locking mechanisms correctly requires skill that not all developers have.
To help more developers gain from the current processors, Microsoft is planning to include support for easier threading in the forthcoming version 4.0 of the .NET framework. For example, the new libraries support running for and foreach loop iterations in parallel with only small alterations to your code. Similarly, you can use a parallel version of LINQ to help boost the performance of your queries.
This article discusses the new parallel programming features available in the future releases of Visual Studio 2010 and .NET 4.0.
|Author's Note: Both the code and information in this article are based on the Beta 1 release of Visual Studio and .NET 4.0, which of course are subject to change in later releases. Still, the concepts discussed here should remain valid even in the final RTM version.|
Understanding the New Features in .NET 4.0
Figure 1. New Parallel Architecture: In.NET 4.0, the Task Parallel Library and Parallel LINQ sit above the new Concurrency Runtime.
When planning the next version of the .NET Framework, one key design consideration was to let developers harness the power of the current processors more easily (see Figure 1). The results of this planning and development work have culminated in a new concurrency runtime with supporting APIs. Both will be available to developers when Visual Studio 2010 and .NET 4.0 are released to manufacturing.
For .NET developers, the new API classes are probably the most interesting new features. The parallel API can further be divided into two parts: the Task Parallel Library (TPL), and Parallel LINQ (PLINQ). Both features help developers use processors more fully. You can think of the Task Parallel Library as a generic set of parallel capabilities, whereas PLINQ focuses on database (or object) manipulation.
Although having additional parallelism support in the .NET framework is great in itself, the story gets better once you bring Visual Studio's IDE into the mix. Although Visual Studio has had windows to help debug threaded applications for a long time, the new features in Visual Studio 2010 are aimed squarely at developers using the new parallel APIs.
For instance, Visual Studio 2010 has a new window called Parallel Tasks, which can show all tasks running at a given point in time (see Figure 2).Figure 3), which can help when debugging applications that perform parallelization through the Task Parallel Library. You will also get access to new performance measurement tools that can help you spot bottlenecks in your code.
Figure 3. Parallel Stacks: Visual Studio 2010's Parallel Stacks window provides a new way to peek at stacks.
When designing Task Parallel Library and PLINQ, Microsoft focused on making the features intuitive to use. For example, to run three simple tasks in parallel, you can use the Task Parallel Library as follows:
Parallel.Invoke( () => MyMethod1(), () => MyMethod2(), () => MyMethod3());
Looks easy! Next, assume you had a traditional LINQ query like this:
int numbers = new int; ... var over100 = from n in numbers where n > 100 select n;
To convert this query to a parallelized PLINQ version, simply add the AsParallel construct to the query:
var over100 = (from n in numbers where n > 100 select n).AsParallel();
Again, that's quite simple. After the change, PLINQ will attempt to parallelize the query, taking into account the number of processors (or processor cores) available. Although the preceding query is for illustration only (it actually wouldn't benefit much from parallelization), you'd make the AsParallel method call the same way for more complex queries that would benefit more. But before going into PLINQ specifics, it's worth exploring the TPL.