IMFTransform SetInputType()/SetOutputType() fails

I'm trying to play back MP3 (and similar audio files) using WASAPI shared mode and a Media Foundation IMFSourceReader on Windows 7. From what I understand, I have to use an IMFTransform between the IMFSourceReader decoding and the WASAPI playback. Everything seems fine except that the calls to SetInputType()/SetOutputType() on the IMFTransform fail.

The relevant snippets of code are:

MFCreateSourceReaderFromURL(...);   //  Various test mp3 files
...

sourceReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &reader.audioType);
//sourceReader->GetNativeMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, &reader.audioType);
...

audioClient->GetMixFormat(&player.mixFormat);
...

MFCreateMediaType(&player.audioType);
MFInitMediaTypeFromWaveFormatEx(player.audioType, player.mixFormat, sizeof(WAVEFORMATEX) + player.mixFormat->cbSize);
...




hr = CoCreateInstance(CLSID_CResamplerMediaObject, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&unknown);
ASSERT(SUCCEEDED(hr));

hr = unknown->QueryInterface(IID_PPV_ARGS(&resampler.transform));
ASSERT(SUCCEEDED(hr));
unknown->Release();

hr = resampler.transform->SetInputType(0, inType, 0);
ASSERT(hr != DMO_E_INVALIDSTREAMINDEX);
ASSERT(hr != DMO_E_TYPE_NOT_ACCEPTED);
ASSERT(SUCCEEDED(hr));          //  Fails here with hr = 0xc00d36b4

hr = resampler.transform->SetOutputType(0, outType, 0);
ASSERT(hr != DMO_E_INVALIDSTREAMINDEX);
ASSERT(hr != DMO_E_TYPE_NOT_ACCEPTED);
ASSERT(SUCCEEDED(hr));          //  Fails here with hr = 0xc00d6d60

I suspect I am misunderstanding how to negotiate the input/output IMFMediaTypes between the components, and how to account for the fact that the IMFTransform needs to operate on uncompressed data.

It seems odd to me that the output type fails, but maybe that is a knock-on effect of the input type failing first. If I try to set the output type first, it also fails.

audio
ms-media-foundation
wasapi
asked on Stack Overflow Oct 29, 2020 by iam • edited Oct 29, 2020 by Roman R.

1 Answer

In recent versions of Windows you would probably prefer to take advantage of stock functionality which is already there for you.

When you configure the Source Reader object, IMFSourceReader::SetCurrentMediaType lets you specify the media type you want your data in. If you set a media type compatible with the WASAPI requirements, the Source Reader automatically adds a transform to convert the data for you.

However...

Audio resampling support was added to the source reader with Windows 8. In versions of Windows prior to Windows 8, the source reader does not support audio resampling. If you need to resample the audio in versions of Windows earlier than Windows 8, you can use the Audio Resampler DSP.

... which means that you might indeed need to manage the MFT yourself. The input media type for the MFT is supposed to come from IMFSourceReader::GetCurrentMediaType. To instruct the Source Reader to deliver uncompressed audio, you need to build the media type that a decoder for this kind of stream would decode the audio to. For example, if your file is MP3, you would read the number of channels and the sampling rate and build a compatible PCM media type (or take the system decoder and ask it separately for its output media type, which is an even cleaner way). You would set this uncompressed audio media type using IMFSourceReader::SetCurrentMediaType. This instructs the Source Reader to add the necessary decoders, so that IMFSourceReader::ReadSample gives you converted data. The same media type is also your input media type for the audio resampler MFT.

The output media type for the resampler MFT would be derived from the audio format you obtained from WASAPI, converted using the API calls you mentioned at the top of your code snippet.
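As a rough illustration of "build a compatible PCM media type", the sketch below computes the values such a type needs from the channel count and sampling rate of the compressed stream. PcmFormat and MakePcmFormat are hypothetical names, not Media Foundation APIs, and the 16-bit default is an assumption about what the decoder produces; the comments name the media type attributes each field would be written to, alongside MF_MT_MAJOR_TYPE = MFMediaType_Audio and MF_MT_SUBTYPE = MFAudioFormat_PCM.

```cpp
#include <cstdint>

// Hypothetical helper type (not part of Media Foundation): the values an
// uncompressed PCM audio media type has to carry.
struct PcmFormat {
    std::uint32_t channels;        // -> MF_MT_AUDIO_NUM_CHANNELS
    std::uint32_t samplesPerSec;   // -> MF_MT_AUDIO_SAMPLES_PER_SECOND
    std::uint32_t bitsPerSample;   // -> MF_MT_AUDIO_BITS_PER_SAMPLE
    std::uint32_t blockAlign;      // -> MF_MT_AUDIO_BLOCK_ALIGNMENT
    std::uint32_t avgBytesPerSec;  // -> MF_MT_AUDIO_AVG_BYTES_PER_SECOND
};

// Derive the dependent fields from the channel count and sampling rate read
// from the compressed stream. Assumption: integer PCM, 16 bits per sample
// unless the caller says otherwise.
PcmFormat MakePcmFormat(std::uint32_t channels,
                        std::uint32_t samplesPerSec,
                        std::uint32_t bitsPerSample = 16) {
    PcmFormat f{};
    f.channels = channels;
    f.samplesPerSec = samplesPerSec;
    f.bitsPerSample = bitsPerSample;
    // One block is one sample for every channel.
    f.blockAlign = channels * (bitsPerSample / 8);
    // Data rate of the uncompressed stream in bytes per second.
    f.avgBytesPerSec = samplesPerSec * f.blockAlign;
    return f;
}
```

For a 44.1 kHz stereo file, MakePcmFormat(2, 44100) yields a block alignment of 4 bytes and 176400 bytes per second. Those values would be set on the type passed to IMFSourceReader::SetCurrentMediaType and, later, to the resampler's SetInputType.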

To look the error codes up you can use this:

Also, you should generally be able to play audio files with less effort using the Media Foundation Media Session API. Media Session uses the same primitives to build a playback pipeline and takes care of format fitting.

Ah so are you saying I need to create an additional object that is the decoder to fit between the IMFSourceReader and IMFTransform/Resampler?

No. By calling SetCurrentMediaType with a proper media type, you have the Source Reader add the decoder internally so that it can give you already decompressed data. Starting with Windows 8 it can also convert between PCM flavors, but in Windows 7 you need to take care of this yourself with the Audio Resampler DSP.

You can manage the decoder yourself, but you don't need to, since the Source Reader's decoder would do the same more reliably.

You might need a separate decoder just to help you work out which PCM media type the decoder would produce, so that you can request it from the Source Reader. MFTEnumEx is the proper API to look a decoder up.

I am not sure how to decide on or create a suitable decoder object? Do I need to enumerate a list of suitable ones somehow rather than assume specific ones?

The mentioned MFTEnum and MFTEnumEx API calls can enumerate decoders, either all available ones or those filtered by given criteria.

Another way is to use a partial media type (see the relevant explanation and code snippet in Tutorial: Decoding Audio). A partial media type signals the desired format and requests that the Media Foundation API supply a primitive that matches it. See comments below for related discussion links.

answered on Stack Overflow Oct 29, 2020 by Roman R. • edited Oct 31, 2020 by Roman R.

User contributions licensed under CC BY-SA 3.0