Think back about a decade: remember the technique for speeding up your computer? Yes, overclocking, which squeezes better performance out of existing hardware, mainly processors. It is still around because software performance has traditionally benefited from processors' increasing clock speeds.
People in the High Performance Computing (HPC) industry (commonly game developers, financial application developers, and people building real-time data analysis, 3D visualization, or medical imaging applications) are always looking for every bit of performance that can make a significant difference in their products. But simple overclocking no longer works: processor vendors have all changed course and now focus on adding multiple cores to increase performance.
If you are not a veteran of the HPC industry, perhaps the notion of multi-core programming freaks you out and raises a lot of questions and fears. Complexity, low-level interfaces, the need to understand a specific hardware architecture, multi-threading and its issues such as deadlocks, being tied to one hardware platform, and, in a competitive market, the price you pay in cost, effort, and time may all be demotivating factors that lead you to defer multi-core programming for now.
Don't you think it would be better if some SDK could take care of this multi-core complexity (okay, not all of it, but most)? The answer is yes, there is. In fact, there are not one but two solutions: PeakStream and RapidMind.
PeakStream, however, is no longer available as a product, because Google has acquired it, which leaves only RapidMind in the race. The release of the RapidMind Multi-core Software Platform brought new language features, an improved runtime API, and support for the Cell BE along with NVIDIA and ATI/AMD GPUs, with comprehensive support for Windows and Linux-based development. The RapidMind Multi-core Software Platform allows software developers to embrace multi-core processors, including GPUs and the Cell BE, to deliver higher performing software with an order of magnitude less effort.
The RapidMind Multi-core Software Platform is a software development platform that allows developers to use standard C++ programming to create high-performance and massively parallel applications or to extend existing applications to run on high-performance processors, including CPUs, GPUs, or Cell BE. The RapidMind Multi-core Software Platform is not a separate IDE, but instead works with your current IDE to provide immediate ease of use. You are given a package of header files, libraries, samples, and documentation to use in your applications.
The RapidMind Multi-core Software Platform lets you develop the application just like any other single-threaded application, without the challenges of understanding the processor hardware or complex parallel programming techniques. The platform executes and manages platform-specific computations and data across all cores on the hardware, operating systems, and compilers it supports.
The Developer Edition of RapidMind Multi-core Software Platform for 32- and 64-bit systems is available to download free from http://www.rapidmind.net/downloadeval.php.
Traditional Multi-Threading Model
Figure 1 shows a graphical representation of the traditional multi-threading model used to get performance out of multiple cores.
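To make the traditional model concrete, here is a minimal sketch in standard C++ (it is not RapidMind code). It assumes a simple element-wise workload, and the function name `scale_in_parallel` is hypothetical. Notice how much of the code is bookkeeping: the application itself decides how many threads to start, how to split the index range, and when to join.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: multiplies every element of `data` by `factor`,
// splitting the index range by hand across `num_threads` worker threads.
void scale_in_parallel(std::vector<float>& data, float factor, unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<std::thread> workers;
    // Ceiling division so that every element lands in exactly one chunk.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        // Each worker touches a disjoint slice, so no locking is needed here.
        workers.emplace_back([&data, factor, begin, end] {
            for (std::size_t i = begin; i < end; ++i) data[i] *= factor;
        });
    }
    // The caller is responsible for waiting on every thread it started.
    for (auto& w : workers) w.join();
}
```

Calling `scale_in_parallel(v, 2.0f, 4)` doubles every element of `v` using four threads. This case is easy because the slices never overlap; as soon as threads share mutable state, the locking and deadlock concerns mentioned earlier come into play.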
Figure 1: Traditional multi-threading model
RapidMind Multi-core Software Platform (RMDP)
The RapidMind Multi-core Software Platform is presented as an advanced dynamic compiler and runtime management system for parallel processing. It has a sophisticated interface embedded within standard ISO C++. It can be used to express arbitrary parallel computations, but it is not a new language. Instead, it merely adds a new vocabulary to standard ISO C++: a set of nouns (types) and verbs (operations). A user of the RapidMind Multi-core Software Platform writes C++ code in the usual way, but uses specific types for numbers, vectors of numbers, matrices, and arrays. In immediate mode, operations on these values can be executed on the host processor, in the manner of a simple operator-overloaded matrix-vector library. In this mode, the RapidMind Multi-core Software Platform simply reflects standard practice in numerical programming under C++.
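The "simple operator-overloaded matrix-vector library" style that immediate mode is compared to can be sketched in plain C++ as follows. This is an illustration of the general technique only: the `Vec` type and the `dot` function are hypothetical stand-ins, not the platform's actual types or API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-length value type; every operation on it runs
// immediately on the host processor, like ordinary C++ arithmetic.
template <std::size_t N>
struct Vec {
    std::array<float, N> v;
    float& operator[](std::size_t i) { assert(i < N); return v[i]; }
    float operator[](std::size_t i) const { assert(i < N); return v[i]; }
};

// Element-wise addition.
template <std::size_t N>
Vec<N> operator+(const Vec<N>& a, const Vec<N>& b) {
    Vec<N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = a[i] + b[i];
    return r;
}

// Scaling by a scalar.
template <std::size_t N>
Vec<N> operator*(const Vec<N>& a, float s) {
    Vec<N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = a[i] * s;
    return r;
}

// Dot product of two vectors of the same length.
template <std::size_t N>
float dot(const Vec<N>& a, const Vec<N>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i) sum += a[i] * b[i];
    return sum;
}
```

With these definitions, an expression such as `(a + b) * 0.5f` reads like ordinary numerical code and is evaluated on the spot, which is the standard practice in numerical C++ that the paragraph above refers to.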
However, the RapidMind Multi-core Software Platform also supports a unique retained mode. In this mode, operations are recorded and dynamically compiled into a "program object" rather than being immediately executed. These program objects can be used as functions in the host program. Program objects mimic the behavior of native C++ functions, including support for modularity and scope, so standard C++ object-oriented programming techniques can be leveraged. It should be noted that at runtime, program objects only execute the numerical computations they have recorded, and can completely avoid any overhead due to the object-oriented nature of the specification. The platform uses C++ only as scaffolding to define computations, but rips away this scaffolding for more efficient runtime execution.
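The idea of recording operations instead of executing them can be illustrated with a toy recorder in standard C++. This is a sketch of the general technique under simplifying assumptions (scalar values, only addition and multiplication, interpretation instead of dynamic compilation). Every name in it (`Program`, `Sym`, `input`, `constant`, `record_axpb`) is hypothetical and does not reflect the platform's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Input, Const, Add, Mul };

// One recorded operation; its result lives in the slot with the same index.
struct Instr { Op op; std::size_t lhs; std::size_t rhs; float constant; };

// Hypothetical "program object": a flat list of recorded numerical operations.
struct Program {
    std::vector<Instr> code;
    std::size_t num_inputs = 0;

    // Replays only the recorded arithmetic; returns the last instruction's value.
    float operator()(const std::vector<float>& inputs) const {
        assert(inputs.size() == num_inputs && !code.empty());
        std::vector<float> slot(code.size());
        for (std::size_t i = 0; i < code.size(); ++i) {
            const Instr& in = code[i];
            switch (in.op) {
                case Op::Input: slot[i] = inputs[in.lhs]; break;
                case Op::Const: slot[i] = in.constant; break;
                case Op::Add:   slot[i] = slot[in.lhs] + slot[in.rhs]; break;
                case Op::Mul:   slot[i] = slot[in.lhs] * slot[in.rhs]; break;
            }
        }
        return slot.back();
    }
};

// Symbolic value: arithmetic on it appends to the program instead of computing.
struct Sym { Program* prog; std::size_t slot; };

Sym emit(Program& p, Op op, std::size_t lhs, std::size_t rhs, float c) {
    p.code.push_back({op, lhs, rhs, c});
    return {&p, p.code.size() - 1};
}
Sym input(Program& p) { return emit(p, Op::Input, p.num_inputs++, 0, 0.0f); }
Sym constant(Program& p, float c) { return emit(p, Op::Const, 0, 0, c); }
Sym operator+(Sym a, Sym b) { return emit(*a.prog, Op::Add, a.slot, b.slot, 0.0f); }
Sym operator*(Sym a, Sym b) { return emit(*a.prog, Op::Mul, a.slot, b.slot, 0.0f); }

// Ordinary C++ runs once to record y = a * x + b; the result is reusable.
Program record_axpb(float a, float b) {
    Program p;
    Sym x = input(p);
    Sym y = constant(p, a) * x + constant(p, b);
    (void)y;  // the final Add is the last instruction, i.e. the program's result
    return p;
}
```

After `Program f = record_axpb(2.0f, 1.0f);`, the object `f` can be called like a function, for example `f({3.0f})`. The C++ operators and helper functions run only during recording; at call time nothing but the recorded arithmetic is replayed, which mirrors the "scaffolding" remark above. A real system would compile the recorded operations for the target processor; this sketch only interprets them.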
Working with their existing C++ compilers and programming environments (IDEs), application developers using RapidMind are given a small set of types with which to create parallel programs within their existing C++ application:
- Value: Contains fixed-length data, similar to the primitive types such as float and int in C++
- Array: Contains RapidMind values, like C arrays or C++ vectors
- Program: Encapsulates a computation, in the same way that a C++ function does
When using the RapidMind Multi-core Software Platform, developers continue to program in C++. After identifying components of their application to accelerate, the overall process of integration is as follows:
- Replace types: The developer replaces numerical types representing floating point numbers and integers with the equivalent RapidMind Multi-core Software Platform types.
- Capture computations: While the user's application is running, sequences of numerical operations invoked by the user's application can be captured, recorded, and dynamically compiled to a program object by the RapidMind Multi-core Software Platform.
- Stream execution: The RapidMind Multi-core Software Platform runtime is used for managed parallel execution of program objects on the target hardware platform, which can be a GPU, the Cell processor, or a multi-core CPU.
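The last step, where a runtime rather than the application decides how to spread a computation over the available cores, can be sketched in standard C++ as follows. This is not the RapidMind runtime: `Kernel` and `stream_execute` are hypothetical names, a `std::function` stands in for a program object, a `std::vector` stands in for an array, and only a multi-core CPU is targeted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for a program object: one output per input element.
using Kernel = std::function<float(float)>;

// Applies `kernel` to every element of `in`. The split across cores is
// chosen here, inside the "runtime", not by the calling application.
std::vector<float> stream_execute(const Kernel& kernel, const std::vector<float>& in) {
    assert(static_cast<bool>(kernel));
    std::vector<float> out(in.size());
    // hardware_concurrency() may report 0 when unknown, so fall back to 1.
    const std::size_t workers = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t chunk = (in.size() + workers - 1) / workers;
    std::vector<std::future<void>> tasks;
    for (std::size_t begin = 0; begin < in.size(); begin += chunk) {
        const std::size_t end = std::min(in.size(), begin + chunk);
        // Each task writes a disjoint slice of `out`, so no locking is needed.
        tasks.push_back(std::async(std::launch::async, [&, begin, end] {
            for (std::size_t i = begin; i < end; ++i) out[i] = kernel(in[i]);
        }));
    }
    for (auto& t : tasks) t.get();  // wait for all slices to finish
    return out;
}
```

From the application's point of view, the call `stream_execute(kernel, data)` looks single-threaded: the caller supplies a computation and an array and gets an array back, while thread creation, partitioning, and synchronization stay hidden behind the function.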