PortAudio: Portable Audio Processing for All Platforms

To write an audio application that samples, edits, or otherwise manipulates sound, the first decision you have to make is choosing which platform you want to lock yourself into. After all, even the most basic real-time audio playback functions are close to the bare metal of the operating system. If you're going to put time and maybe money into an audio development effort, of course you want the widest swath of platforms for release. PortAudio answers the call by delivering a free, cross-platform, open-source audio I/O library. It lets you write simple audio programs in C that will compile and run on many platforms, including Windows, Mac, and Linux/Unix.

PortAudio, which provides a very simple API for recording and/or playing sound using a simple callback function, is intended to promote the exchange of audio synthesis software between developers on different platforms. It includes example programs that synthesize sine waves and pink noise, perform fuzz distortion on a guitar, list available audio devices, and much more. Carnegie Mellon University's PortMusic project, which includes MIDI and soon will provide sound file support, recently selected PortAudio as its audio component.

Playing Musical Platforms

The PortAudio library supports an array of platforms including Windows, Linux, and Macintosh variants (see Table 1), but if you don't have prior audio development experience you quickly will find yourself adrift in a sea of API standards. After computer audio went mainstream with the Windows 3.1 MultiMedia Extension (MME) and the ubiquitous .WAV file back in 1991, a variety of solutions followed. First came Direct Sound in the Windows 95 era, which unfortunately lacked recording capability. The Windows 2000/XP generation then introduced the fastest solution for Windows users: the Windows Driver Model.

Platform Code    Minimum PortAudio Version    Description
pablio           19.0                         PortAudio Blocking I/O (PABLIO)
pa_asio          18.1                         ASIO for Windows and Macintosh
pa_beos          18.1                         BeOS
pa_jack          19.0                         JACK for Linux and OS X
pa_linux_alsa    19.0                         Advanced Linux Sound Architecture (ALSA)
pa_mac_sm        18.1                         Macintosh Sound Manager for OS 8, 9, and Carbon
pa_mac_core      18.1                         Macintosh Core Audio for OS X
pa_sgi           18.1                         Silicon Graphics AL
pa_unix_oss      18.1                         Open Sound System (OSS) for various Unix variants
pa_win_ds        18.1                         Windows Direct Sound
pa_win_wdmks     19.0                         Windows Driver Model with Kernel Streaming (WDMKS)
pa_win_wmme      18.1                         Windows MultiMedia Extension

Table 1. Platforms Supported by PortAudio

The latency requirements of your application should dictate your choice of API. If your sound program does not require a quick response time (close to a "live" performance), you are certainly free to use the MME or Direct Sound platform. However, if you require very low latency (below 20ms response time), you will need ASIO or WDMKS. The downside of ASIO is that it requires (usually) proprietary drivers that at best require end-user installation and at worst are not even available for cheaper audio systems. (For more details, refer to the SoundCard FAQ.)

Getting Ready to Sound Off

To start programming with PortAudio, the first thing you need to do is go to www.portaudio.com and pick out a relevant distro. Because V18.1, the last official release, is nearing the three-year-old mark, you might as well start with a current V19 code snapshot. (An older precompiled DLL for PortAudio V17 also is available, but that's all as of this writing.) Either way, it's a matter of unpacking a ZIP file or tarball, because PortAudio is pretty much distributed in a source-only format.

As you might expect with any streaming interface, PortAudio supports two different programming models: a blocking API and a non-blocking API. The non-blocking API was developed first. The blocking API came later and is still unofficial. Although simple command-line type tools can use a blocking API with little impact, a modern GUI application would need to invoke a thread to manage blocking I/O calls. Otherwise, the app looks dead to both the OS and the end-user during I/O.

This article examines only the non-blocking API. A typical non-blocking PortAudio application requires the following steps:

  1. Write a callback function that PortAudio (PA) will call when audio processing is needed.
  2. Initialize the PA library and open a stream for audio I/O.
  3. Start the stream: PA now will call your callback function repeatedly in the background.
  4. Inside your callback, you can read audio data from the inputBuffer and/or write data to the outputBuffer.
  5. Stop the stream by returning a 1 from your callback or calling a stop function.
  6. Close the stream and terminate the library.

Hello PortAudio, A Sample Application

Although ASIO, WDMKS, and DirectSound layers are available, the sample application discussed in this section uses the Windows MME, the lowest common denominator. First, you need to build a static library out of the following modules:

  1. "Common" base library
  2. "Win" platform library (You will disable Direct Sound and ASIO for simplicity's sake.)
  3. Layer-specific interface module

You do this from the DOS window by using Visual Studio C++ as follows (you may want to make this a .BAT file):

cd \pa_snapshot_v19\portaudio\pa_common
del *.lib
copy ..\pa_win
cl /c /DPA_NO_DS /DPA_NO_ASIO *.c
lib /out:portaudio.lib *.obj
cd ..\pa_win_wmme
cl /c pa_win_wmme.c /I../pa_common

On this foundation, you can pick out a test program and link the thing together to see how it goes:

cd ..\pa_tests
cl patest_saw.c /I../pa_common /link ..\pa_common\portaudio.lib ..\pa_win_wmme\pa_win_wmme.obj

What you get is about five seconds of pure sawtooth wave pleasure! But, that's not the point. You now have a platform-independent, sound-synthesizing piece of code with which you could implement any number of effects.

PortAudio comes with about four dozen test programs. Look at the guitar fuzz distortion box simulator "pa_fuzz.c" (see below) so you can rock on like Peter Frampton and Joe Walsh. Use essentially the same build command as before:

cd ..\pa_tests
cl pa_fuzz.c /I../pa_common /link ..\pa_common\portaudio.lib ..\pa_win_wmme\pa_win_wmme.obj

                                 pa_fuzz.c:

#include <stdio.h>
#include <math.h>
#include "portaudio.h"
/*
** Note that many of the older ISA sound cards on PCs do NOT
** support full duplex audio (simultaneous record and playback).
** And some support only full duplex at lower sample rates.
*/
#define SAMPLE_RATE          (44100)
#define PA_SAMPLE_TYPE       paFloat32
#define FRAMES_PER_BUFFER    (64)

typedef float SAMPLE;

/* Non-linear amplifier with soft distortion curve. */
float CubicAmplifier( float input )
{
   float output, temp;
   if( input < 0.0 ) {
      temp = input + 1.0f;
      output = (temp * temp * temp) - 1.0f;
   } else {
      temp = input - 1.0f;
      output = (temp * temp * temp) + 1.0f;
   }
   return output;
}

You can represent the signal in many ways with PortAudio. The most common mechanism is to use float values from -1.0 to +1.0 to represent the audio signal (paFloat32). You can also use 16-bit integers if you are more comfortable with that or some other representation. The CubicAmplifier() function simulates the distortion that an analog amplifier would produce, the mathematics of which are beyond the scope of the current discussion.


#define FUZZ(x) \
    CubicAmplifier(CubicAmplifier(CubicAmplifier(CubicAmplifier(x))))

static int gNumNoInputs = 0;
/* This routine will be called by the PortAudio engine
** when audio is needed.
** It may be called at interrupt level on some machines, so
** don't do anything that could mess up the system, like
** calling malloc() or free().
*/
static int fuzzCallback( const void *inputBuffer,
                         void *outputBuffer,
                         unsigned long framesPerBuffer,
                         const PaStreamCallbackTimeInfo* timeInfo,
                         PaStreamCallbackFlags statusFlags,
                         void *userData )
{
   SAMPLE *out = (SAMPLE*)outputBuffer;
   const SAMPLE *in = (const SAMPLE*)inputBuffer;
   unsigned int i;

   if( inputBuffer == NULL ) {
      for( i=0; i<framesPerBuffer; i++ ) {
         *out++ = 0;   /* left  - silent */
         *out++ = 0;   /* right - silent */
      }
      gNumNoInputs += 1;
   } else {
      for( i=0; i<framesPerBuffer; i++ ) {
         *out++ = FUZZ(*in++);   /* left  - distorted */
         *out++ = *in++;         /* right - clean */
      }
   }

   return paContinue;
}

The PortAudio system is designed to work in a near real-time environment, hence the callback functions. The fuzzCallback() function receives an input buffer, an output buffer, a frame count, timing information, buffer status flags, and a pointer to a user-defined storage area. A frame in an input or output buffer contains a complete set of samples for all channels involved (in this case, two for stereo). The callback processes as many frames as the incoming framesPerBuffer specifies; here you've asked for 64 frames per buffer (FRAMES_PER_BUFFER).

Although this example uses two-channel audio, you can set up any number of channels. The fuzzCallback() function generates an empty (silent) buffer in the "no input" case. If you do have input, you fuzz the left channel (channel zero) and copy the input through clean on the right channel (channel one). If your distortion were sensitive in the time domain, you could use the timeInfo struct to retrieve the following times, in seconds:

  1. When the first sample of the input buffer was received at the audio input
  2. When the first sample of the output buffer will begin being played at the audio output
  3. When the stream callback was called
pa_fuzz.c (continued):

/*************************************************************/
int main(void)
{
   PaStreamParameters inputP, outputP;
   PaStream *stream;
   PaError err;

   err = Pa_Initialize();
   if( err != paNoError ) goto error;

   inputP.device = Pa_GetDefaultInputDevice(); /* default input device */
   inputP.channelCount = 2;                    /* stereo input */
   inputP.sampleFormat = PA_SAMPLE_TYPE;
   inputP.suggestedLatency =
      Pa_GetDeviceInfo( inputP.device )->defaultLowInputLatency;
   inputP.hostApiSpecificStreamInfo = NULL;

   outputP.device = Pa_GetDefaultOutputDevice(); /* default output device */
   outputP.channelCount = 2;                     /* stereo output */
   outputP.sampleFormat = PA_SAMPLE_TYPE;
   outputP.suggestedLatency =
      Pa_GetDeviceInfo( outputP.device )->defaultLowOutputLatency;
   outputP.hostApiSpecificStreamInfo = NULL;

   err = Pa_OpenStream(
      &stream,
      &inputP,
      &outputP,
      SAMPLE_RATE,
      FRAMES_PER_BUFFER,
      0,     /* paClipOff, */
             /* we won't output out of range samples so don't
              * bother clipping them */
      fuzzCallback,
      NULL );
   if( err != paNoError ) goto error;
Initializing PA and opening the stream are next. Pa_Initialize() must of course be the first PortAudio call your application uses, just as Pa_Terminate() is the last. After that, you need to set up the parameters of your input streams and output streams. The default input device is usually Microsoft Sound Mapper, which flows from the line-in input of your soundcard (or equivalent). Other possible inputs might be your modem input, CD audio, or other things depending on drivers and hardware. You also could create sophisticated callback algorithms where you mix multiple channels down to one channel or vice-versa.

Finally, you are ready to call Pa_OpenStream() and get the stream ready for immediate use. Because latency is always your enemy, PortAudio separates opening the stream from starting it. The input and output sides must agree on the same sample rate (in this case, CD-quality 44100Hz) and the same number of frames per buffer-load.

pa_fuzz.c (continued):

   err = Pa_StartStream( stream );
   if( err != paNoError ) goto error;

   printf("Hit ENTER to stop program.\n");
   getchar();
   err = Pa_CloseStream( stream );
   if( err != paNoError ) goto error;

   printf("Finished. gNumNoInputs = %d\n", gNumNoInputs );
   Pa_Terminate();
   return 0;

error:
   Pa_Terminate();
   fprintf( stderr,
            "An error occurred while using the portaudio stream\n" );
   fprintf( stderr, "Error number: %d\n", err );
   fprintf( stderr, "Error message: %s\n", Pa_GetErrorText( err ) );
   return -1;
}

At first glance, the remainder of the program may leave you scratching your head. Pa_StartStream() calls a platform-specific function to spin up a thread, which begins callbacks immediately. The Win32 implementations all eventually call CreateThread(), although to me the WDMKS code seems a lot simpler than the Win MME version. The two ways out of the callback loop are returning a non-zero value (paComplete) from the callback or calling Pa_CloseStream().

Get Creative

Your creativity is the limit to what you can do with PortAudio: convert data streams from one format to another in real time, simulate surround sound or other sophisticated multi-channel audio, or even create performance-quality effects. Best of all, you aren't overcommitted to any platform, which makes PortAudio my choice for open source audio projects.

About the Author

Victor Volkman has been writing for C/C++ Users Journal and other programming journals since the late 1980s. He is a graduate of Michigan Tech and a faculty advisor board member for Washtenaw Community College CIS department. Volkman is the editor of numerous books, including C/C++ Treasure Chest and is the owner of Loving Healing Press. He can help you in your quest for open source tools and libraries; just drop an e-mail to sysop@HAL9K.com.



