CWaveFile -- a Class for Working with and Representing Data from WAVEs

Environment: VC6 SP4, VC.NET

Introduction

I would like to start with a short explanation of digital sound and how it is stored on computers. A long time ago, audio signals, like all others, were processed and archived in continuous form. These are so-called analog signals. This type of signal has many advantages; one of them is that an analog signal corresponds directly to the physical process that changes some value. For example, when we speak, our vocal cords generate sound vibrations that travel through space, and an analog device can easily register and save them (on magnetic tape, for example). But there is also a great disadvantage here: because the shape of the signal is itself the representation of the physical value, when that shape changes (under the influence of noise, for example), you can lose the information the signal carries. In terms of information theory, an analog signal has no redundancy.

A digital signal does not have this disadvantage because its representation has redundancy. A digital signal is just a sequence of 1s and 0s (electrical pulses with amplitude 1 or 0), so it can carry any information that can be coded as such a sequence. Because each pulse has only two stable states, ON and OFF, a digital signal can be recovered even when its shape is heavily distorted (furthermore, there are special types of coding, so-called error-correcting codes, that improve the robustness of digital signals). Digital signals have been widely adopted in many fields: communication, navigation, medicine, audio signal processing, and computers.

I know that you are more interested in a practical question: how is digital data stored on a computer, and how can I work with it? I don't want to go deeper into digital signal theory. As a programmer, you have to know just one thing: a digital signal is an array of numbers (and you will get your own array of data if you read this article to the end). For a digital audio signal, these are typically 8- or 16-bit numbers.

There are many formats for storing digital audio (AU, VOC, WAVE, AIFF, AIFF-C, and IFF/8SVX), but because Microsoft uses WAVE files in its Windows operating system, they became the most popular.

WAVE File Format

All WAVE files conform to the RIFF specification. So, these files satisfy the following conditions:

  1. They consist of separate data blocks, so-called chunks, that form a tree structure.
  2. Each block consists of a header followed by the data itself.
  3. The first (and main) block of any RIFF file is a RIFF block; it is the root of the tree.
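
The three conditions above can be sketched in code. The following is a minimal illustration, not part of CWaveFile: the names ChunkInfo, ReadLE32, and ListChunks are hypothetical, and the sketch assumes the whole file is already available as a byte buffer. It relies on two properties of RIFF: every chunk header is a four-character identifier followed by a 32-bit little-endian size, and chunk data is padded to an even number of bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry per chunk found inside the top-level RIFF block.
// (Hypothetical helper type, used only in this sketch.)
struct ChunkInfo {
  std::string id;      // four-character identifier, e.g. "fmt " or "data"
  std::size_t offset;  // offset of the chunk's data from the start of the file
  std::uint32_t size;  // size of the chunk's data in bytes
};

// Reads a 32-bit little-endian value (RIFF stores sizes this way).
inline std::uint32_t ReadLE32(const unsigned char* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the subchunks of a RIFF file held in memory.
// Returns an empty list if the buffer does not start with "RIFF".
inline std::vector<ChunkInfo> ListChunks(const unsigned char* base,
                                         std::size_t length) {
  std::vector<ChunkInfo> chunks;
  if (length < 12 ||
      std::string(reinterpret_cast<const char*>(base), 4) != "RIFF") {
    return chunks;
  }
  std::size_t pos = 12;  // skip "RIFF", the size field, and the form type
  while (pos + 8 <= length) {
    ChunkInfo info;
    info.id.assign(reinterpret_cast<const char*>(base + pos), 4);
    info.size = ReadLE32(base + pos + 4);
    info.offset = pos + 8;
    chunks.push_back(info);
    // Chunk data is padded to an even number of bytes.
    std::size_t advance = static_cast<std::size_t>(info.size) + (info.size & 1u);
    if (advance > length - info.offset) break;  // truncated chunk: stop
    pos = info.offset + advance;
  }
  return chunks;
}
```

For a plain PCM file, this would typically report a "fmt " chunk followed by a "data" chunk; files written by some tools contain additional chunks in between.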

But let's go back to WAVE files. A typical WAVE PCM file is laid out as follows.

The file begins with the RIFF header; after it, two subblocks are defined: FMT and DATA. The RIFF header consists of three elements: RIFF_ID, RIFF_SIZE, and RIFF_FORMAT.

struct RIFF
{
  _TCHAR riffID[4];         //contains identifier "RIFF"
  DWORD riffSIZE;           //File size minus 8 bytes
  _TCHAR riffFORMAT[4];     //contains identifier "WAVE"
};

After the RIFF header, there is a format descriptor block (FMT):

struct FMT
{
  _TCHAR fmtID[4];          //contains identifier: "fmt " (with
                            //space)
  DWORD fmtSIZE;            //contains the size of the format data
                            //that follows (16 for WAVE PCM)
  WAVEFORM fmtFORMAT;       //same as WAVEFORMATEX but without
                            //the cbSize field
};

The WAVEFORM structure is the key to understanding WAVE files. It contains most of the information we need when working with a WAVE file.

struct WAVEFORM
{
  WORD wFormatTag;          //format of digital sound
  WORD nChannels;           //Number of channels (1 for mono and
                            //2 for stereo)
  DWORD nSamplesPerSec;     //Number of samples per second
  DWORD nAvgBytesPerSec;    //Average number bytes of data per
                            //second
  WORD nBlockAlign;         //Block alignment: bytes per sample
                            //frame (one sample for every channel)
  WORD wBitsPerSample;      //Bits per sample (8 or 16)
};
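
For PCM data, the fields of this structure are not independent: nBlockAlign equals nChannels * wBitsPerSample / 8, and nAvgBytesPerSec equals nSamplesPerSec * nBlockAlign. The sketch below shows these relations and how to derive the playing time from the data size. PcmFormat and the three functions are hypothetical names introduced only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper type; it mirrors three fields of WAVEFORM.
struct PcmFormat {
  std::uint16_t channels;       // nChannels
  std::uint32_t samplesPerSec;  // nSamplesPerSec
  std::uint16_t bitsPerSample;  // wBitsPerSample
};

// Bytes occupied by one sample frame (one sample for every channel).
inline std::uint16_t BlockAlign(const PcmFormat& f) {
  return static_cast<std::uint16_t>(f.channels * f.bitsPerSample / 8);
}

// Bytes of audio data consumed per second of playback.
inline std::uint32_t AvgBytesPerSec(const PcmFormat& f) {
  return f.samplesPerSec * BlockAlign(f);
}

// Playing time in seconds of dataSize bytes of PCM data.
// Assumes a valid format (a nonzero byte rate).
inline double DurationSeconds(const PcmFormat& f, std::uint32_t dataSize) {
  return static_cast<double>(dataSize) / AvgBytesPerSec(f);
}
```

For a 16-bit mono file sampled at 22050 Hz, BlockAlign gives 2 and AvgBytesPerSec gives 44100.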

And, finally, the DATA block:

struct DATA
{
  _TCHAR dataID[4];         //contains identifier: "data"
  DWORD dataSIZE;           //data size
};
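
One caveat about the structures above: they are overlaid directly on the bytes of the file, so their sizes must match the on-disk layout exactly. In a Unicode build, _TCHAR is wchar_t (2 bytes on Windows), so a _TCHAR[4] identifier would occupy 8 bytes instead of 4. The following sketch shows one way to guard against this with fixed-width types and compile-time checks. The structure names are hypothetical, and static_assert requires a C++11 compiler, which is newer than the compilers this article targets.

```cpp
#include <cstdint>

#pragma pack(push, 1)  // no padding: the layout must match the file byte for byte

struct RiffHeader {
  char id[4];          // "RIFF"
  std::uint32_t size;  // file size minus 8 bytes
  char format[4];      // "WAVE"
};

struct WaveFormatPcm {
  std::uint16_t formatTag;       // 1 for PCM
  std::uint16_t channels;        // 1 for mono, 2 for stereo
  std::uint32_t samplesPerSec;
  std::uint32_t avgBytesPerSec;
  std::uint16_t blockAlign;
  std::uint16_t bitsPerSample;
};

struct FmtChunk {
  char id[4];          // "fmt " (with trailing space)
  std::uint32_t size;  // 16 for plain PCM
  WaveFormatPcm format;
};

struct DataChunkHeader {
  char id[4];          // "data"
  std::uint32_t size;  // size of the audio data in bytes
};

#pragma pack(pop)

// The canonical PCM header is 12 + 24 + 8 = 44 bytes long.
static_assert(sizeof(RiffHeader) == 12, "RIFF header must be 12 bytes");
static_assert(sizeof(FmtChunk) == 24, "fmt chunk must be 24 bytes");
static_assert(sizeof(DataChunkHeader) == 8, "data header must be 8 bytes");
```

If any of these sizes were wrong, the build would fail instead of the code silently misreading the header at run time.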

That's all you need to know about the WAVE header; after it, data follows. Okay, now let's consider the CWaveFile interface:

class CWaveFile : protected CFileMap, public CObject {
public:
  CWaveFile( LPCTSTR fileName );
  ~CWaveFile() {}
  WAVEFORM* GetWaveFormat() { return &pFMT->fmtFORMAT; }
  DATA* GetWaveData() { return pDATA; }
  LPVOID GetData() { return reinterpret_cast< LPVOID >
                     ( dataAddress ); }
  BOOL DrawData( CDC *pDC, RECT *pRect, CSize *pNewSize );
protected:
  PBYTE dataAddress;
  RIFF* pRIFF;
  FMT* pFMT;
  DATA* pDATA;
private:
  BOOL CheckID(_TCHAR* idPar,_TCHAR A, _TCHAR B, _TCHAR C,
               _TCHAR D);
  void ReadWave();
  void ReadRIFF();
  void ReadFMT();
  void ReadDATA();
  void DrawByte( CDC *pDC );
  void DrawWord( CDC *pDC );
};

As you can see, CWaveFile is inherited from CObject (no explanation needed here) and CFileMap. CFileMap is a very interesting class, and I want you to focus your attention on it.

Memory-Mapped Files

Memory mapping is a very useful feature of Windows. When I first implemented the CWaveFile class, I had no idea about memory mapping, so my code had to use buffers and copy data out of them, and I had a lot of problems (although that version worked). Memory mapping is a technique that lets you work with a file on disk as if it were loaded into memory, through pointers. Working with memory-mapped files is easy and fast; furthermore, you don't need any buffers. Once I learned about this feature, I started looking on the Internet for what the community thought about file mapping and whether there were any ready-made classes for handling mapped files. I found a very useful class written by Vitali Brusentsev (thanks, Vitali!). His class encapsulates all the functionality you need to work with memory-mapped files. I asked him whether I could use his class in my programs and got a positive answer. I chose this class as the base for my CWaveFile (by protected inheritance, as you can see), so I got all the functionality I need.

Using CWaveFile

It is very easy to use my class in your projects. I would like to briefly describe the interface. CWaveFile has only one constructor, which takes the path to the WAVE file as its argument. If something goes wrong, CWaveFile throws a C++ exception; all exception classes are grouped in the following namespace:

namespace WaveErrors {
  class FileOperation {};    //something wrong with the file
                             //(cannot be opened or something)
  class RiffDoesntMatch {};
  class WaveDoesntMatch {};
  class FmtDoesntMatch {};
  class DataDoesntMatch {};
}

The remaining exceptions are thrown if the corresponding identifier doesn't match. By the way, I found a very interesting feature of WAVE files written by Microsoft Sound Recorder: the data in these files is shifted by 6 bytes, so it starts at byte 50. I had to account for this, and I did something like this:

inline void CWaveFile::ReadDATA()
{
  try {

    pDATA = reinterpret_cast< DATA* >( dataAddress );
    if( !CheckID( pDATA->dataID, 'd', 'a', 't', 'a') ) {
      throw WaveErrors::DataDoesntMatch();
    }

  }catch( WaveErrors::DataDoesntMatch & ) {
    //Something strange! In Microsoft WAVE files, the DATA
    //identifier can be offset (maybe because of address alignment).
    //Start looking for DATA_ID "manually" ;)
    PBYTE b = Base();
    BOOL foundData = FALSE;
    while(  (dataAddress - b) !=  dwSize ) {
      if( *dataAddress == 'd' ) {
        //It can be DATA_ID, check it!
        pDATA = reinterpret_cast< DATA * >( dataAddress );
        if( CheckID( pDATA->dataID, 'd','a','t','a' ) ) {
          //DATA_ID was found
          foundData = TRUE;
          break;
        }
      }
      dataAddress++;
    }
    if( !foundData ) {
      //This file may be corrupted
      throw WaveErrors::DataDoesntMatch();
    }
  }
}

So, as I mentioned in the comment, if the data identifier doesn't match, the function throws an exception and catches it itself. It then scans forward for the data identifier; if it finds one, everything is fine. If it doesn't, sorry, the file is probably corrupted.
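
The same fallback scan can also be written so that it never reads past the end of the mapped file. The sketch below uses std::search from the standard library; FindDataTag is a hypothetical helper, not a member of CWaveFile, and it works on a plain byte buffer.

```cpp
#include <algorithm>
#include <cstddef>

// Returns the offset of the first "data" tag at or after `start`,
// or `length` if there is none. Never reads past base + length.
inline std::size_t FindDataTag(const unsigned char* base, std::size_t length,
                               std::size_t start) {
  static const unsigned char tag[4] = {'d', 'a', 't', 'a'};
  if (start >= length) return length;
  const unsigned char* end = base + length;
  // std::search returns `end` when the tag does not occur in the range.
  const unsigned char* hit = std::search(base + start, end, tag, tag + 4);
  return static_cast<std::size_t>(hit - base);
}
```

Keep in mind that any byte scan of this kind can be fooled if the four bytes "data" happen to occur inside the payload of another chunk; stepping from chunk to chunk by the sizes stored in the chunk headers avoids that.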

ReadDATA() is one of three private functions that are in charge of reading the WAVE header and checking all identifiers. They are all called from the ReadWave function:

void CWaveFile::ReadWave()
{
  ReadRIFF();
  //Move to next block
  dataAddress += sizeof( *pRIFF );
  ReadFMT();
  //Move to next block
  dataAddress += sizeof( *pFMT );
  ReadDATA();
  dataAddress += sizeof( *pDATA );
  //Wave has been read!
}

As you can see, we start by reading the RIFF block; if everything is OK (no exceptions were thrown), we move the dataAddress pointer to the next block. Don't forget that we are using a memory-mapped file: in the CWaveFile constructor, the dataAddress pointer is initialized with the base address of the mapped file, so reading the file is like reading a complex data structure in memory. Cool, isn't it? Then, we read the FMT and DATA blocks, so when we leave the ReadWave function, dataAddress points to the audio data. As I said, for programmers, digital audio data is an array of 8- or 16-bit numbers, so I use the following typedefs:

typedef short          AudioWord;
typedef unsigned char  AudioByte;
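
The choice of a signed type for 16-bit samples and an unsigned type for 8-bit samples is not arbitrary: in PCM WAVE files, 16-bit samples are signed values centered on 0, while 8-bit samples are unsigned values centered on 128. The following sketch converts both kinds of samples to a common floating-point range of roughly -1.0 to 1.0. NormalizeByte and NormalizeWord are hypothetical helpers added for illustration.

```cpp
typedef short          AudioWord;
typedef unsigned char  AudioByte;

// 8-bit PCM samples are unsigned; silence is the value 128.
inline float NormalizeByte(AudioByte s) {
  return (static_cast<int>(s) - 128) / 128.0f;
}

// 16-bit PCM samples are signed; silence is the value 0.
inline float NormalizeWord(AudioWord s) {
  return s / 32768.0f;
}
```

After this conversion, code that processes or draws the signal no longer has to care about the bit depth of the source file.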

It is time to check CWaveFile with a simple example. Let's open a WAVE file and read all audio information that is stored in it.

#include <iostream>
using namespace std;

#include "CWaveFile.h"

int main()
{
  try {

    CWaveFile wave("noise.wav");
    WAVEFORM *format = wave.GetWaveFormat();
    cout << "Format: " << format->wFormatTag << endl;
    cout << "Samples per second: " << format->nSamplesPerSec
         << endl;
    cout << "Channels: " << format->nChannels << endl;
    cout << "Bits per sample: " << format->wBitsPerSample << endl;
    if( format->wBitsPerSample == 16 ) {
      AudioWord *buffer = reinterpret_cast< AudioWord * >
                          ( wave.GetData() );
      DATA *data = wave.GetWaveData();
      DWORD samples = data->dataSIZE / sizeof(AudioWord);
      cout << "Samples number: " << samples << endl;
      cin.get();
      for( DWORD p = 0; p < samples; p++ ) {
        cout << p << ": " << buffer[p] << endl;
      }
    }

  }catch(WaveErrors::FileOperation & ) {
    cout << "File operation error!\n";
  }catch(WaveErrors::RiffDoesntMatch & ) {
    cout << "Riff doesn't match!\n";
  }catch(WaveErrors::WaveDoesntMatch & ) {
    cout << "Wave doesn't match!\n";
  }catch(WaveErrors::FmtDoesntMatch & ) {
    cout << "Fmt doesn't match!\n";
  }catch(WaveErrors::DataDoesntMatch & ) {
    cout << "Data doesn't match!\n";
  }
  return 0;
}

This is a simple console application, but it does a lot of work! In my case, I got something like this:

  Format: 1
  Samples per second: 22050
  Channels: 1
  Bits per sample: 16
  Samples number: 174680

First of all, it checks that "noise.wav" is valid. If everything is fine, it continues. (Note: the format tag equals 1, which means PCM, the simplest format of digital sound because it is not compressed. I work with this kind of file; you can use CWaveFile with other formats, but you have to take care of interpreting the data yourself.) Samples per second (the sampling frequency): 22050. Channels: 1, which means mono sound. Bits per sample: 16. I need this information to know how to interpret the data. I use reinterpret_cast to cast the LPVOID returned by the GetData() function to AudioWord*. The most important point here is the size of the data.

DATA *data = wave.GetWaveData();
DWORD samples = data->dataSIZE / sizeof(AudioWord);

You have to use this code to get the number of samples; the dataSIZE field contains the size of the data in bytes, but we know that we are currently working with 16-bit audio,

if( format->wBitsPerSample == 16 ) {

so we have to divide dataSIZE by sizeof(AudioWord) (or just 2, because 16 bits is 2 bytes). Then, you do what you need to do with the audio data; I just output it to the console. Notice how easily all of this was done, and, as I promised, you get your array of data.

Displaying the Data

No doubt, this application works perfectly, but it is just a console application. Now it's time to take advantage of the Windows GUI. The most important question for a person who works with any type of data (audio, radio engineering, statistical) is: what does this data look like? Now, we are going to answer this question.

As you can see, the CWaveFile class is inherited from CObject. I did this because the class will be used in an MFC application. So, if you want to see the waveform of a WAVE file plotted in your application, you should read this part of the article.




CWaveFile has the member function

BOOL DrawData( CDC *pDC, RECT *pRect, CSize *pNewSize )

which is in charge of drawing the data on a device context. To show how to use it, I'll give you an example:

void CAnalyseView::OnDraw(CDC* pDC )
{
  CAnalyseDoc* pDoc = GetDocument();
  ASSERT_VALID(pDoc);

  // TODO: add draw code for native data here
  pDC->SaveDC();
  CRect rect;
  CBrush brush( RGB( 150, 200, 230 ) );
  GetClientRect( &rect );
  FillRect( *pDC, &rect, brush );

  pDC->MoveTo( rect.left, rect.bottom/2 );
  CWaveFile wave(pDoc->m_fileName);
  wave.DrawData( pDC, &rect, &m_szSize );

  pDC->MoveTo( rect.left, rect.bottom/2 );
  pDC->LineTo(rect.right, rect.bottom/2 ); 
  pDC->RestoreDC(-1);
}

This is the usual OnDraw function. The DrawData function takes three arguments: a pointer to a CDC object, a pointer to a RECT structure (a CRect works too), and a pointer to a CSize structure that holds the size of the view. You can obtain the size in the OnSize method, as shown in the following code:

void CAnalyseView::OnSize(UINT nType, int cx, int cy)
{
  CView::OnSize(nType, cx, cy);

  // TODO: Add your message handler code here
  if( cx == 0 || cy == 0 ) return;
  m_szSize = CSize( cx, cy );
}
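
The implementation of DrawData is not listed in this article. As an illustration of what a drawing routine of this kind has to do, the sketch below reduces a long array of 16-bit samples to one (minimum, maximum) pair per pixel column, which can then be drawn as a vertical line in each column. This is a common approach and an assumption on my part, not necessarily the way DrawData works; ColumnEnvelope is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// For each of `columns` pixel columns, returns the (min, max) sample
// value among the samples that fall into that column.
// Assumes count * columns does not overflow std::size_t.
inline std::vector<std::pair<short, short> > ColumnEnvelope(
    const short* samples, std::size_t count, std::size_t columns) {
  std::vector<std::pair<short, short> > out;
  if (count == 0 || columns == 0) return out;
  out.reserve(columns);
  for (std::size_t c = 0; c < columns; ++c) {
    std::size_t first = c * count / columns;
    std::size_t last = (c + 1) * count / columns;
    if (last == first) last = first + 1;  // fewer samples than columns
    std::pair<const short*, const short*> mm =
        std::minmax_element(samples + first, samples + last);
    out.push_back(std::make_pair(*mm.first, *mm.second));
  }
  return out;
}
```

In a view, the number of columns would be the client width stored in OnSize, and each value would be scaled to the client height before drawing.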

That's all! But I forgot to tell you about one limitation of the DrawData method: it correctly displays mono signals only. Stereo signals differ from mono in that the samples of the two channels are interleaved (left channel, right channel, left channel, right channel, and so forth), so extending the method to display stereo should not be a problem either.
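
As a starting point for stereo support, the interleaved samples can be split into two separate arrays, each of which can then be handled like a mono signal. The sketch below assumes 16-bit samples; StereoChannels and SplitStereo are hypothetical names and are not part of CWaveFile.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the two de-interleaved channels.
struct StereoChannels {
  std::vector<short> left;
  std::vector<short> right;
};

// Splits interleaved 16-bit stereo data (L, R, L, R, ...) into two arrays.
// `sampleCount` is the total number of samples over both channels;
// a trailing unpaired sample, if any, is ignored.
inline StereoChannels SplitStereo(const short* interleaved,
                                  std::size_t sampleCount) {
  StereoChannels out;
  std::size_t frames = sampleCount / 2;
  out.left.reserve(frames);
  out.right.reserve(frames);
  for (std::size_t i = 0; i < frames; ++i) {
    out.left.push_back(interleaved[2 * i]);       // even index: left
    out.right.push_back(interleaved[2 * i + 1]);  // odd index: right
  }
  return out;
}
```

Here, sampleCount would be dataSIZE divided by the size of one 16-bit sample, and each resulting array holds half as many values.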

Conclusions

I would like to say thank you to everyone who read this article to the end, because I wrote it for you! I know that CWaveFile is not finished yet; if you add more functionality to it (or find some bugs, which is more likely), please let me know. See you on the Net!

Downloads

Download demo project - 79 Kb
Download source - 4 Kb