The application that I am writing needs to stream from multiple cameras to one screen. The feed only has to be live, so there is no need to record the stream. I started doing this with DirectShow .NET.
When you download the DirectShow .NET library you get a Samples folder as part of the download. Within Samples there is a sample called PlayCap. It simply finds the first webcam attached to your computer and streams from it, putting the video out to the whole window. We don’t want that, so to change where the feed goes we are going to have to understand the code. Warning: we are about to take the scenic route to streaming from two webcams. If you don’t care about the code explanation, or already know enough, skip to Part Two (still typing that part up as of now) of this little series.
So let’s pop open the code… Also if you haven’t done so, get the sample to just run. That will give you an idea of what I’m talking about. You should only need to have a camera attached to your computer.
Moving on, the first thing we see in the PlayCap code is this:
// a small enum to record the graph state
enum PlayState
{
    Stopped,
    Paused,
    Running,
    Init
};
This is pretty self-explanatory. It is a small enum for tracking the state of the graph. We’ll get into the “difference” between a “graph” and a video in a later post, and probably another series. Moving on:
// Application-defined message to notify app of filtergraph events
public const int WM_GRAPHNOTIFY = 0x8000 + 1;
IVideoWindow videoWindow = null;
IMediaControl mediaControl = null;
IMediaEventEx mediaEventEx = null;
IGraphBuilder graphBuilder = null;
ICaptureGraphBuilder2 captureGraphBuilder = null;
PlayState currentState = PlayState.Stopped;
This section is fairly simple too. The first line declares a constant for the graph notify message that Windows will send to our WndProc method, which we will be overriding. The value 0x8000 is WM_APP, the start of the range Windows sets aside for application-defined messages. The rest is a series of interface declarations. Their implementation could be anything, although the library provides it.
- The videoWindow member is our video display.
- The mediaControl will be maintaining/controlling the status of the video, although in this example it is playing as long as the app is running.
- The mediaEventEx member is used to capture the graph notify.
- The graphBuilder member will be putting things together for us, and provides the implementation of some of the interfaces.
- The captureGraphBuilder will be used to help with capture. It may help quite a bit if I explained the DirectShow nomenclature; maybe that will be another post.
- Last but not least we declare a PlayState, and by default it is stopped. Don’t worry that will change.
Onward we go:
public Form1()
{
    InitializeComponent();
    CaptureVideo();
}
The form constructor calls the CaptureVideo() method to start the stream. That method is where the magic happens.
protected override void Dispose( bool disposing )
{
    if( disposing )
    {
        // Stop capturing and release interfaces
        CloseInterfaces();
    }
    base.Dispose( disposing );
}
This is the form’s Dispose method. When the form is being disposed of, we need to release the DirectShow interfaces so we don’t get any memory leaks. Now! Onto the fun part! You ready?
public void CaptureVideo()
{
    int hr = 0;
    IBaseFilter sourceFilter = null;
    try
    {
        // Get DirectShow interfaces
        GetInterfaces();

        // Attach the filter graph to the capture graph
        hr = this.captureGraphBuilder.SetFiltergraph(this.graphBuilder);
        DsError.ThrowExceptionForHR(hr);

        // Use the system device enumerator and class enumerator to find
        // a video capture/preview device, such as a desktop USB video camera.
        sourceFilter = FindCaptureDevice();

        // Add Capture filter to our graph.
        hr = this.graphBuilder.AddFilter(sourceFilter, "Video Capture");
        DsError.ThrowExceptionForHR(hr);

        // Render the preview pin on the video capture filter
        // Use this instead of this.graphBuilder.RenderFile
        hr = this.captureGraphBuilder.RenderStream(PinCategory.Preview,
            MediaType.Video, sourceFilter,
            null,
            null);
        DsError.ThrowExceptionForHR(hr);

        // Now that the filter has been added to the graph and we have
        // rendered its stream, we can release this reference to the filter.
        Marshal.ReleaseComObject(sourceFilter);

        // Set video window style and position
        SetupVideoWindow();

        // Add our graph to the running object table, which will allow
        // the GraphEdit application to "spy" on our graph
        // (rot is a DsROTEntry field of the form; its declaration is not shown in this post)
        rot = new DsROTEntry(this.graphBuilder);

        // Start previewing video data
        hr = this.mediaControl.Run();
        DsError.ThrowExceptionForHR(hr);

        // Remember current state
        this.currentState = PlayState.Running;
    }
    catch
    {
        MessageBox.Show("An unrecoverable error has occurred.");
    }
}
Since this is where all the magic happens, I am going to go through it call by call and post the corresponding method where needed. The first call we see is to a method called GetInterfaces(). This method just instantiates the interfaces declared above. It looks like this:
public void GetInterfaces()
{
    int hr = 0;

    // An exception is thrown if cast fail
    this.graphBuilder = (IGraphBuilder) new FilterGraph();
    this.captureGraphBuilder = (ICaptureGraphBuilder2) new CaptureGraphBuilder2();
    this.mediaControl = (IMediaControl) this.graphBuilder;
    this.videoWindow = (IVideoWindow) this.graphBuilder;
    this.mediaEventEx = (IMediaEventEx) this.graphBuilder;

    hr = this.mediaEventEx.SetNotifyWindow(this.Handle, WM_GRAPHNOTIFY, IntPtr.Zero);
    DsError.ThrowExceptionForHR(hr);
}
The first thing declared is the hr variable. It is an integer (an HRESULT) holding the result fed back from a call to a COM component. Toward the end of the method there is a call to DsError.ThrowExceptionForHR: if the value is negative the call failed and hr holds the error code, and DsError throws an exception based on that code. Zero and other non-negative values mean success. Meanwhile… back at the batcave, CaptureVideo() is still running… The next call of interest is this:
// Use the system device enumerator and class enumerator to find
// a video capture/preview device, such as a desktop USB video camera.
sourceFilter = FindCaptureDevice();
This source filter is where the stream originates. The stream originates with the webcam, so we need to tie this source to our graph. FindCaptureDevice() comes in two versions: the active one, and a commented-out one that uses the DsDevice helper class. I uncommented the latter and decided to use it since it is easier in the long run. That method looks like this:
// Uncomment this version of FindCaptureDevice to use the DsDevice helper class
// (and comment the first version of course)
public IBaseFilter FindCaptureDevice()
{
    System.Collections.ArrayList devices;
    object source;

    // Get all video input devices
    devices = DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice);

    // Take the first device
    DsDevice device = (DsDevice)devices[0];

    // Bind Moniker to a filter object
    Guid iid = typeof(IBaseFilter).GUID;
    device.Mon.BindToObject(null, null, ref iid, out source);

    // An exception is thrown if cast fail
    return (IBaseFilter) source;
}
The first thing here is an ArrayList of devices being declared, along with an object named source that will receive the bound filter. The devices ArrayList is then populated by a call to DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice), which returns the devices in the VideoInputDevice category. Next we grab the first device found. Note that devices[0] will throw if no camera is attached. You could modify this method and pass in a device index if you have more than one webcam (hint!). Lastly the device’s moniker is bound to an IBaseFilter object, which is cast and returned.
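Following up on that hint, here is a minimal sketch of the index check on its own, so it can be tried without a camera or DirectShow. DeviceSelection and SelectDevice are names I made up for illustration; they are not part of PlayCap or the library. The helper takes a plain IList, which I am assuming is good enough because both an ArrayList and an array can be passed as one.

```csharp
using System;
using System.Collections;

public static class DeviceSelection
{
    // Hypothetical helper (not part of PlayCap or the DirectShow .NET library).
    // Returns the entry at deviceIndex after checking that it exists.
    public static object SelectDevice(IList devices, int deviceIndex)
    {
        if (devices == null || devices.Count == 0)
        {
            throw new InvalidOperationException("No video capture devices were found.");
        }
        if (deviceIndex < 0 || deviceIndex >= devices.Count)
        {
            throw new ArgumentOutOfRangeException(
                "deviceIndex",
                "Expected an index between 0 and " + (devices.Count - 1) + ".");
        }
        return devices[deviceIndex];
    }
}
```

Inside a FindCaptureDevice(int deviceIndex) variant, the "take the first device" line would then presumably become DsDevice device = (DsDevice)DeviceSelection.SelectDevice(devices, deviceIndex); with the rest of the method unchanged.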
The next part of CaptureVideo() is this:
// Add Capture filter to our graph.
hr = this.graphBuilder.AddFilter(sourceFilter, "Video Capture");
DsError.ThrowExceptionForHR(hr);
// Render the preview pin on the video capture filter
// Use this instead of this.graphBuilder.RenderFile
hr = this.captureGraphBuilder.RenderStream(PinCategory.Preview,
    MediaType.Video, sourceFilter,
    null,
    null);
DsError.ThrowExceptionForHR(hr);
// Now that the filter has been added to the graph and we have
// rendered its stream, we can release this reference to the filter.
Marshal.ReleaseComObject(sourceFilter);
This little blob simply adds the source filter to our graph, checking the returned hr for an error code afterward. Then it asks our captureGraphBuilder to render the stream from the Preview pin of the sourceFilter that we built up top by calling the FindCaptureDevice() method. The two null arguments mean no intermediate filter and the default video renderer. Finally, once the stream is built, the sourceFilter reference is released… whole memory leak thing again.
The last blob of code that is interesting is the call to SetupVideoWindow(), and SetupVideoWindow() itself of course. Here is that code:
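Since the hr check shows up after nearly every call, here is a small sketch of the convention on its own, assuming nothing beyond the .NET base library. HResultCheck, Failed and ThrowIfFailed are my own illustrative names. As far as I can tell, DsError.ThrowExceptionForHR does the same negative-value test and additionally tries to look up a DirectShow-specific error message.

```csharp
using System;
using System.Runtime.InteropServices;

public static class HResultCheck
{
    public const int S_OK = 0;                             // success
    public const int S_FALSE = 1;                          // still a success code
    public const int E_FAIL = unchecked((int)0x80004005);  // generic failure

    // COM failure codes have the high bit set, so they are negative as an int.
    public static bool Failed(int hr)
    {
        return hr < 0;
    }

    // Throws the matching .NET exception for failure codes;
    // does nothing for success codes such as S_OK and S_FALSE.
    public static void ThrowIfFailed(int hr)
    {
        if (Failed(hr))
        {
            Marshal.ThrowExceptionForHR(hr);
        }
    }
}
```

The point to take away is that "not zero" is not the same as "failed": S_FALSE is non-zero but still counts as success, so only negative values should raise an exception.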
public void SetupVideoWindow()
{
    int hr = 0;

    // Set the video window to be a child of the panel
    hr = this.videoWindow.put_Owner(panel1.Handle);
    DsError.ThrowExceptionForHR(hr);

    // Forward mouse and keyboard messages from the video window to the panel
    hr = this.videoWindow.put_MessageDrain(panel1.Handle);
    DsError.ThrowExceptionForHR(hr);

    hr = this.videoWindow.put_WindowStyle(WindowStyle.Child | WindowStyle.ClipChildren);
    DsError.ThrowExceptionForHR(hr);

    // Use helper function to position video window in client rect
    // of main application window
    ResizeVideoWindow();

    // Make the video window visible, now that it is properly positioned
    hr = this.videoWindow.put_Visible(OABool.True);
    DsError.ThrowExceptionForHR(hr);
}
The first call, put_Owner(), is the most interesting here, followed by put_MessageDrain(). The put_Owner() call puts the video inside the window whose handle you pass to the method. The example passed this.Handle by default; I changed it to panel1.Handle, panel1 being my smaller panel inside the window. The call to put_MessageDrain(panel1.Handle) means that mouse and keyboard messages from the video window drain to panel1, so when you click on the video the panel’s events fire. For example, when Windows sends WM_LBUTTONUP (left mouse button up), the panel1.MouseUp event will fire. It is handled as such:
private void panel1_MouseUp(object sender, MouseEventArgs e)
{
    this.textBox1.Text = e.X.ToString() + "," + e.Y.ToString() + "\r\n";
}
The next noteworthy method, since we are after all placing our video inside of a panel, is ResizeVideoWindow(). Now… I have my panel inside of a splitter, so my method looks like this:
public void ResizeVideoWindow()
{
    // Resize the video preview window to match owner window size
    if (this.videoWindow != null)
    {
        this.videoWindow.SetWindowPosition(0, 0,
            this.splitContainer1.Panel1.ClientSize.Width,
            this.splitContainer1.Panel1.ClientSize.Height);
    }
}
So what I am saying here is that I want the video to start at 0,0 (left, top) and span the whole width and height of the panel that it is sitting in.
Next, the ChangePreviewState method is somewhat interesting. It is called when the window is minimized or restored: minimizing stops the feed, and restoring starts it again.
public void ChangePreviewState(bool showVideo)
{
    int hr = 0;

    // If the media control interface isn't ready, don't call it
    if (this.mediaControl == null)
        return;

    if (showVideo)
    {
        if (this.currentState != PlayState.Running)
        {
            // Start previewing video data
            hr = this.mediaControl.Run();
            this.currentState = PlayState.Running;
        }
    }
    else
    {
        // Stop previewing video data
        hr = this.mediaControl.StopWhenReady();
        this.currentState = PlayState.Stopped;
    }
}
This method is called from the Form1_Resize() event handler, which resizes the video if needed and, if the window has been minimized, stops the feed by calling ChangePreviewState(false).
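To poke at the state handling without a camera attached, here is a rough sketch that pulls the same logic out of the form. PreviewController is a hypothetical class of mine, and the two delegates simply stand in for mediaControl.Run() and mediaControl.StopWhenReady(). Unlike the sample, it also checks the returned hr, which is my own addition.

```csharp
using System;

// Mirrors the enum from the top of the PlayCap code.
public enum PlayState { Stopped, Paused, Running, Init }

// Hypothetical wrapper, not part of PlayCap or the DirectShow .NET library.
public sealed class PreviewController
{
    private readonly Func<int> run;   // stands in for mediaControl.Run
    private readonly Func<int> stop;  // stands in for mediaControl.StopWhenReady

    public PlayState CurrentState { get; private set; }

    public PreviewController(Func<int> run, Func<int> stop)
    {
        if (run == null) throw new ArgumentNullException("run");
        if (stop == null) throw new ArgumentNullException("stop");
        this.run = run;
        this.stop = stop;
        this.CurrentState = PlayState.Stopped;
    }

    public void ChangePreviewState(bool showVideo)
    {
        if (showVideo)
        {
            // Only start the graph if it is not already running
            if (this.CurrentState != PlayState.Running)
            {
                ThrowIfFailed(this.run());
                this.CurrentState = PlayState.Running;
            }
        }
        else
        {
            // Stop is requested every time, as in the sample
            ThrowIfFailed(this.stop());
            this.CurrentState = PlayState.Stopped;
        }
    }

    // Negative HRESULT values are failures; zero and positive values are success.
    private static void ThrowIfFailed(int hr)
    {
        if (hr < 0)
        {
            throw new InvalidOperationException(
                "Call failed with HRESULT 0x" + hr.ToString("X8"));
        }
    }
}
```

The thing to notice is that restoring the window twice in a row only starts the graph once, because the second call sees that the state is already Running.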
That’s really all I found interesting / worth explaining. The only caveat here is that I am a DirectShow newbie so … read with caution… contents may be volatile. Save often as memory leaks may happen.
In Part Two I will look at how to make our little application stream from two cameras. Also I am going to encapsulate all this stuff into a class so that we can just instantiate Camera objects and not worry about all the unmanaged code. I will post these as they are done along with the sample projects. Comments, questions, post them below.
Good luck!