
Windows Media Foundation recording audio

I'm using the Windows Media Foundation API to enumerate both my microphones and my available cameras, and both enumerations work.

Here is my enumeration code:

class deviceInput {
public:
    deviceInput( REFGUID source );
    ~deviceInput();

    int listDevices(bool refresh = false);
    IMFActivate *getDevice(unsigned int deviceId);
    const WCHAR *getDeviceName(unsigned int deviceId);

private:
    void Clear();
    HRESULT EnumerateDevices();

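    UINT32      m_count;
    IMFActivate **m_devices;
    GUID        m_source;   // stored by value so it cannot dangle
};

// Initializers are listed in declaration order to match the order in which
// the members are actually constructed.
deviceInput::deviceInput( REFGUID source )
    : m_count( 0 )
    , m_devices( NULL )
    , m_source( source )
{   }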

deviceInput::~deviceInput()
{
    Clear();
}

int deviceInput::listDevices(bool refresh)
{
    if ( refresh || !m_devices ) {
        if ( FAILED(this->EnumerateDevices()) ) return -1;
    }
    return m_count;
}

// Returns an AddRef'd pointer; the caller must Release() it when done.
IMFActivate *deviceInput::getDevice(unsigned int deviceId)
{
    if ( deviceId >= m_count ) return NULL;

    IMFActivate *device = m_devices[deviceId];
    device->AddRef();

    return device;
}

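// Returns the device's friendly name. The string is allocated by
// GetAllocatedString, so the caller must release it with CoTaskMemFree.
const WCHAR *deviceInput::getDeviceName(unsigned int deviceId)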
{
    if ( deviceId >= m_count ) return NULL;

    HRESULT hr = S_OK;
    WCHAR *devName = NULL;
    UINT32 length;

    hr = m_devices[deviceId]->GetAllocatedString( MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME, &devName, &length );
    if ( FAILED(hr) ) return NULL;

    return devName;
}

void deviceInput::Clear()
{
    if ( m_devices ) {
        for (UINT32 i = 0; i < m_count; i++) SafeRelease( &m_devices[i] );
        CoTaskMemFree( m_devices );
    }
    m_devices = NULL;
    m_count = 0;
}

HRESULT deviceInput::EnumerateDevices()
{
    HRESULT hr = S_OK;
    IMFAttributes *pAttributes = NULL;

    Clear();

    hr = MFCreateAttributes(&pAttributes, 1);
    if ( SUCCEEDED(hr) ) hr = pAttributes->SetGUID( MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE, m_source );
    if ( SUCCEEDED(hr) ) hr = MFEnumDeviceSources( pAttributes, &m_devices, &m_count );

    SafeRelease( &pAttributes );

    return hr;
}

To get audio or video capture devices, I pass either MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_GUID or MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID. That works without a problem: I can get the device names as well as the IMFActivate for each device. I already have code that records the webcam to an output video file, but I'm having a tough time figuring out how to record the audio to a file. I'm under the impression that I need to use an IMFSinkWriter, but I can't find any examples that combine an audio capture IMFActivate with an IMFSinkWriter.

I'm not much of a Windows API programmer, so I'm sure there's a fairly straightforward answer, but COM is a bit over my head. As for the audio format, I don't really care as long as it gets into a file: WAV, WMA, or whatever. Although I'm also recording video, I need the video and audio in separate files, so simply adding the audio to my video encoding is not an option.

OzBarry asked Oct 16 '12 14:10



1 Answer

I apologize for the late response, and I hope you can still find this valuable. I recently completed a project similar to yours (recording webcam video along with a selected microphone to a single video file with audio). The key is creating an aggregate media source.

// http://msdn.microsoft.com/en-us/library/windows/desktop/dd388085(v=vs.85).aspx
HRESULT CreateAggregateMediaSource(IMFMediaSource *videoSource,
                                   IMFMediaSource *audioSource,
                                   IMFMediaSource **aggregateSource)
{
    *aggregateSource = nullptr;
    IMFCollection *pCollection = nullptr;

    HRESULT hr = ::MFCreateCollection(&pCollection);

    if (S_OK == hr)
        hr = pCollection->AddElement(videoSource);

    if (S_OK == hr)
        hr = pCollection->AddElement(audioSource);

    if (S_OK == hr)
        hr = ::MFCreateAggregateSource(pCollection, aggregateSource);

    SafeRelease(&pCollection);
    return hr;
}

When configuring the sink writer, you will add two streams (one for audio and one for video). Of course, you will also need to configure the writer correctly for the input stream types.

HRESULT        hr                  = S_OK;
IMFMediaType  *videoInputType      = nullptr;
IMFMediaType  *videoOutputType     = nullptr;
IMFMediaType  *audioOutputType     = nullptr;
DWORD          videoOutStreamIndex = 0u;
DWORD          audioOutStreamIndex = 0u;
IMFSinkWriter *writer              = nullptr;

// [other create and configure writer]

// Add the video stream; the writer returns its stream index.
if (S_OK == hr)
    hr = writer->AddStream(videoOutputType, &videoOutStreamIndex);

// [more configuration code]

// Add the audio stream as a second, separate writer stream.
if (S_OK == hr)
    hr = writer->AddStream(audioOutputType, &audioOutStreamIndex);
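The two indices returned by AddStream are the ones each incoming sample has to be written to later, so it helps to record which reader stream feeds which writer stream. Below is a minimal, portable sketch of such a lookup. StreamRouter and StreamIndex are hypothetical names that are not part of Media Foundation, and StreamIndex is only a stand-in for DWORD so the snippet compiles without the Windows headers.

```cpp
#include <cstdint>
#include <unordered_map>

// Stand-in for the Win32 DWORD type (assumption: 32-bit unsigned).
using StreamIndex = std::uint32_t;

// Hypothetical helper: remembers which writer stream each reader stream feeds.
class StreamRouter {
public:
    // Record that samples from readerStream go to writerStream.
    void map(StreamIndex readerStream, StreamIndex writerStream)
    {
        m_routes[readerStream] = writerStream;
    }

    // Looks up the writer stream for a reader stream.
    // Returns false (and leaves *writerStream untouched) if unmapped.
    bool writerFor(StreamIndex readerStream, StreamIndex *writerStream) const
    {
        auto it = m_routes.find(readerStream);
        if (it == m_routes.end()) return false;
        *writerStream = it->second;
        return true;
    }

private:
    std::unordered_map<StreamIndex, StreamIndex> m_routes;
};
```

In a real read loop you would presumably call map() once per stream after AddStream succeeds, then pass the stream index reported by the reader to writerFor() before handing the sample to the writer. Samples from unmapped streams can simply be skipped.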

Then, when reading the samples, pay close attention to the reader's streamIndex and send each sample to the matching writer stream. You will also need to pay close attention to the format that the codec expects, for instance IEEE float vs. PCM. Good luck, and I hope it is not too late.
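To make the float vs. PCM distinction concrete: IEEE-float audio stores each sample as a 32-bit float nominally in the range -1.0 to 1.0, while 16-bit PCM stores it as a signed integer. If you ever have to convert by hand, for example because the capture device delivers float and the encoder wants 16-bit PCM, the arithmetic looks roughly like the sketch below. floatToPcm16 is a hypothetical helper, not a Media Foundation function, and the scale factor of 32767 is one common convention rather than the only one.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Convert normalized IEEE-float samples (nominally -1.0..1.0) to 16-bit PCM.
// Hypothetical helper for illustration only.
std::vector<std::int16_t> floatToPcm16(const std::vector<float> &in)
{
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (float sample : in) {
        // Clamp first so out-of-range input saturates instead of wrapping.
        float clamped = std::max(-1.0f, std::min(1.0f, sample));
        // Scale to the 16-bit range and round to the nearest integer.
        out.push_back(static_cast<std::int16_t>(std::lround(clamped * 32767.0f)));
    }
    return out;
}
```

This only covers the sample format. Channel count and sample rate also have to match what the codec expects, and a mismatch there needs resampling or remixing rather than a per-sample conversion.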

Jeff answered Oct 12 '22 09:10