Video overlay using Microsoft DirectShow

Table of contents

1. What does this article cover?
2. What is the intended audience and what are the requirements?
3. Simple video stream playback sample
4. The overlay paradigm
5. Making the overlay mix
5.1 Double-buffer technique
5.2 Triple-buffer technique
6. hWnd, DC, and the virtual system coordinates
7. Piece of code


1. What does this article cover?


Though the Microsoft DirectShow SDK provides both high and low level APIs to
show video playback in a window or even fullscreen, it provides no sample showing
how to perform sprite overlay on top of the video. This article is dedicated
to this. I demonstrate how to perform sprite overlay, and more particularly
GDI polygon shape overlay, by adding a few lines of code to a sample provided
in the DirectShow SDK.

Sprite overlay on top of video means value-added content. The overlay is not
meant only to emphasize something appearing in the video; it can also be interactive.
A routine could pump mouse clicks and perform specific actions, so hypermedia
and multimedia scenarios become practical. A routine could also
pump mouse moves from the Windows message queue and move the sprite,
or do something else, in response to user action.


2. What is the intended audience and what are the requirements?


This article is intended for developers. I focus on the
DirectShow SDK, so it should help developers interested in this
API. It may also interest programmers working with another multimedia API,
and developers facing similar content concerns in commercial software.

You need the DirectShow SDK installed on your computer, not just the run-time.
It doesn’t matter whether you have version 5.2 or a later one (version 6 at
the time of writing), since the sample program I reuse hasn’t changed at all.

In order to download the DirectShow 6.0 SDK, check this link

You may also need to download the DirectX runtime in order to be able to run
projects using DirectDraw. Check for it at the same address.


3. Simple video stream playback sample


First, copy/paste the source code from the DirectShow documentation.
Starting at the default page, go to the Application Developer’s guide,
then How to… then Play a Movie in a Window Using DirectDrawEx and
Multimedia Streaming.
This program is an extension of the simple ShowStream
sample provided in the SDK, which is at a higher level than what follows.

Make it a VC++ project: create a new workspace, select the Win32 Application target,
and include all the .h and .cpp files. Then, in the
project settings, add the following libraries on the Link tab: amstrmid.lib
quartz.lib strmbase.lib ddraw.lib. The libraries dedicated to DirectShow
are strmbase.lib and amstrmid.lib (multimedia streaming, debug
version), and quartz.lib (DirectShow run-time). ddraw.lib is for
DirectDraw.

Build the project and run it. Now take a look at the code. The playback
process consists of a few elementary steps:

  1. Create all DirectDraw surfaces aimed at receiving the video samples, a secondary
    backbuffer in Triple-buffer mode, and the primary
    display surface,
  2. Prepare a video media stream and an audio media stream,
  3. Open the video file, and apply a render procedure to create an appropriate
    Filter Graph,
  4. Run the multimedia stream,
  5. Loop and render any new video sample. Watch for the COMPLETE_VIDEO flag to stop
    the multimedia stream when it has ended. Watch the keyboard, mouse, … to
    allow interaction.


4. The overlay paradigm


While the first four steps are merely initialisation steps, the fifth step
is the main loop. Our procedure will be included in this main loop.

Let’s take a look at the RenderToSurface procedure. The Update
call asks the multimedia stream to fetch the next video/audio sample, which
is blitted into a backbuffer. This backbuffer is attached to the stream sample
engine at init-time.

The content of the backbuffer is then simply blitted to the primary surface,
which is what you see when you run the program. Note that the stretch blit
rescales the image so that it fills the whole client area. Rescaling
is needed because the video sample has a fixed (width,height) size
whereas the client area is resizable.
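
To make the scaled copy concrete, here is a minimal, portable sketch of a nearest-neighbour stretch blit. The Surface struct and the stretchBlit function are hypothetical stand-ins and not DirectDraw types; the real Blt call does the scaling itself, possibly in hardware, and may filter pixels differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a DirectDraw surface: a width x height pixel grid.
struct Surface {
    std::size_t width;
    std::size_t height;
    std::vector<unsigned> pixels;  // row-major, width * height entries
};

// Nearest-neighbour scaled copy: every destination pixel reads the source
// pixel located at the proportionally equivalent position.
Surface stretchBlit(const Surface& src, std::size_t dstWidth, std::size_t dstHeight) {
    assert(src.width > 0 && src.height > 0);
    assert(src.pixels.size() == src.width * src.height);
    Surface dst{dstWidth, dstHeight, std::vector<unsigned>(dstWidth * dstHeight)};
    for (std::size_t y = 0; y < dstHeight; ++y) {
        const std::size_t sy = y * src.height / dstHeight;
        for (std::size_t x = 0; x < dstWidth; ++x) {
            const std::size_t sx = x * src.width / dstWidth;
            dst.pixels[y * dstWidth + x] = src.pixels[sy * src.width + sx];
        }
    }
    return dst;
}
```

Stretching a 2x1 source to 4x1 simply repeats each source pixel twice, which is the same kind of enlargement a sprite undergoes if it is drawn before the stretch.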

Drawing a sprite after this blit operation is possible. Obtain a DC
for the surface through IDirectDrawSurface->GetDC, apply
any desired GDI operations (the hWnd is easy to obtain), then
release the DC. The drawback? It flickers, since drawing directly on the
visible surface is not a good idea. You may wonder why the
blit operation doesn’t cause flicker too. The reason is simple: two successive blits
are almost identical, since the content hardly changes between two
consecutive frames.

When dealing with the DC, remember that you can’t step through the program in
a debugger inside a GetDC/ReleaseDC block, because the surface stays locked
until ReleaseDC is called.

We will see solutions for clean sprite overlay in the next section.


5. Making the overlay mix


We show that the double-buffer technique is a good solution for simple video
playback (i.e. always updated, never paused), while a triple-buffer technique
solves the problem of a backbuffer that is not updated.


5.1 Double-buffer technique


Before the blit, why not draw something into the backbuffer? After all,
the entire content of the backbuffer is copied, so it’s a good place for a sprite
overlay. Furthermore, we would no longer draw directly on the primary surface, which
prevents flicker.


5.2 Triple-buffer technique


We know that we have to draw the sprite into a backbuffer. Let us call the
surface attached to the stream sample engine the sample-backbuffer. If you allow
the video to be paused, the sample-backbuffer is not updated during the pause, so you
would draw again and again into the very same surface, leaving an ugly
visible trail as the sprite moves. The solution is to copy the sample-backbuffer
into another backbuffer, the copy-backbuffer, and actually draw in that
copy.

The updated sample is blitted to the copy-backbuffer, we then draw into the
copy-backbuffer, and finally we blit the copy-backbuffer to the primary
surface. It all works fine, except for the CPU time spent on the double blit.
And when the video is paused? The sample-backbuffer is not updated, but the
blit is still performed, so the copy-backbuffer is refreshed with the
same clean frame every time, which is what we wanted.
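
The difference between the two techniques during a pause can be sketched with a toy model. Everything here is hypothetical: a surface is reduced to one row of characters, '.' stands for video content, '#' for the sprite, and drawSprite stands in for the GDI drawing done between GetDC and ReleaseDC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model: a surface is a single row of characters.
using Surface = std::string;

// Stands in for the GDI drawing done on a surface DC.
void drawSprite(Surface& target, std::size_t x) {
    assert(x < target.size());
    target[x] = '#';
}

// Double-buffer: draw straight into the sample buffer, then blit it.
// The returned value models what reaches the primary surface.
Surface renderDoubleBuffered(Surface& sampleBuffer, std::size_t spriteX) {
    drawSprite(sampleBuffer, spriteX);
    return sampleBuffer;
}

// Triple-buffer: copy the sample first, draw into the copy, blit the copy.
// The sample buffer itself is never modified.
Surface renderTripleBuffered(const Surface& sampleBuffer, std::size_t spriteX) {
    Surface copyBuffer = sampleBuffer;  // blit sample -> copy
    drawSprite(copyBuffer, spriteX);
    return copyBuffer;                  // blit copy -> primary surface
}
```

With the video paused, rendering the sprite at position 1 and then at position 3 through renderDoubleBuffered leaves both marks in the row, while renderTripleBuffered shows only the latest one, because the sample buffer stays clean.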

To optimize CPU performance, draw into the sample-backbuffer while
the video is playing, and draw into the copy-backbuffer while the video
is paused. This means a single blit most of the time.


6. hWnd, DC, and the virtual system coordinates


Everything would be fine if all three surfaces had the same size and aspect
ratio. But that is not the case, and maybe you don’t want what you draw to be stretched.
In GDI terms, a blit is a simple copy operation, while a stretch blit
is a scaled copy operation. In DirectDraw terms, there is no separate
stretch blit operation: the blit can be either a copy or a scaled copy (using
the hardware if hardware stretching is available).

I used the word blit in the previous section without saying much about stretching;
it is now time to cover these details.

Generally speaking, the sample-backbuffer is blitted to the copy-backbuffer.
The copy-backbuffer thus has the same width and height as the video itself.
Then we draw in the copy-backbuffer. We finally stretch-blit the copy-backbuffer
to the primary surface, whose (width,height) size changes dynamically. All three
surfaces have the same color depth. All in all, the drawn sprite is stretched.

To avoid a stretched sprite, stretch-blit the sample-backbuffer to the copy-backbuffer,
then draw the sprite, then blit to the primary surface. The DC also has to be
set up, since it holds both the physical and the virtual coordinate systems. To draw
consistently, always draw the sprite using a DC whose virtual coordinate system is
constant, for example (0,0)-(1000,1000), set with the SetWindowExt
function, while the physical coordinate system is the actual copy-backbuffer
size, set with SetViewportExt.
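
The effect of the two extents can be sketched without any GDI call. The helper below is hypothetical; it mirrors the GDI mapping from virtual (window) to physical (viewport) coordinates under the assumption that both origins stay at (0,0), and it uses truncating integer division as a simplification.

```cpp
#include <cassert>

struct Point { long x; long y; };
struct Extent { long cx; long cy; };

// Hypothetical helper: with both origins at (0,0), a virtual coordinate is
// scaled by viewport extent / window extent to get the physical coordinate.
Point logicalToDevice(Point logical, Extent windowExt, Extent viewportExt) {
    assert(windowExt.cx != 0 && windowExt.cy != 0);
    return Point{logical.x * viewportExt.cx / windowExt.cx,
                 logical.y * viewportExt.cy / windowExt.cy};
}
```

With a constant virtual extent of 1000x1000, the point (500,500) lands in the middle of the surface whatever its size: (160,120) on a 320x240 surface and (320,240) on a 640x480 one. The drawing code can therefore keep using the same virtual coordinates when the surface size changes.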


7. Piece of code


Here is a piece of code demonstrating what I have explained; it is
most of the RenderToSurface function. Copy/paste it and build.
It should work.

if (!g_bPaused)
{
 // update each frame
 if (g_pSample->Update(0, NULL, NULL, 0) != S_OK)
  g_bAppactive = FALSE;
}

// blit from the sample-offscreen surface to the copy-offscreen surface
hr = g_pDDSOffscreen->Blt(&rect, g_pDDSOffscreenSample, &rect, DDBLT_WAIT, NULL);
if (FAILED(hr))
 AfxMessageBox("Blt failed");

RECT rect2;
GetClientRect(&rect2);  // client region of the primary surface

// obtain the DC so we can use GDI on the copy-offscreen surface
HDC my_dc;
hr = g_pDDSOffscreen->GetDC(&my_dc);
if (SUCCEEDED(hr))
{
 // virtual system coordinates / physical system coordinates
 // (the extents are only honoured in an isotropic or anisotropic mapping mode)
 SIZE size;
 ::SetMapMode(my_dc, MM_ANISOTROPIC);

 // video_width and video_height are obtained at the Video media stream init-time
 ::SetViewportExtEx(my_dc, video_width, video_height, &size);
 ::SetWindowExtEx(my_dc, rect2.right-rect2.left, rect2.bottom-rect2.top, &size);

 HPEN my_pen=::CreatePen(PS_SOLID,2,RGB(255,0,0));
 HPEN old_pen=(HPEN)::SelectObject(my_dc,my_pen);

 // ...
 // ... place here all GDI operations ...
 // ...

 // restore the previous pen, then release the DC
 ::SelectObject(my_dc,old_pen);
 g_pDDSOffscreen->ReleaseDC(my_dc);

 // free the GDI handle
 ::DeleteObject(my_pen);
}

// finally perform a blit to the primary surface
hr = g_pPrimarySurface->Blt(&rect2, g_pDDSOffscreen, &rect, DDBLT_WAIT, NULL);
if (FAILED(hr))
{
 AfxMessageBox("Blt failed");
 ExitCode();
}

DirectShow is a product from Microsoft corporation.

Contact the author :[email protected],[email protected]

