I have been trying to figure out how to "speak" a text into a memory buffer using Windows SAPI 5.1 but so far no success, even though it seems it should be quite simple.
There is an example of streaming the synthesized speech into a .wav file, but no examples of how to stream it to a memory buffer.
In the end I need to have the synthesized speech in a char* array in 16 kHz 16-bit little-endian PCM format. Currently I create a temp .wav file, redirect speech output there, then read it, but it seems to be a rather stupid solution.
Anyone knows how to do that?
Thanks!
Look at ISpStream::SetBaseStream. Here's a little helper:
inline HRESULT SPCreateStreamOnHGlobal(
HGLOBAL hGlobal, //Memory handle for the stream object
BOOL fDeleteOnRelease, //Whether to free memory when the object is released
const WAVEFORMATEX * pwfex, //WaveFormatEx for stream
ISpStream ** ppStream) //Address of variable to receive ISpStream pointer
{
HRESULT hr;
IStream * pMemStream;
*ppStream = NULL;
hr = ::CreateStreamOnHGlobal(hGlobal, fDeleteOnRelease, &pMemStream);
if (SUCCEEDED(hr))
{
hr = ::CoCreateInstance(CLSID_SpStream, NULL, CLSCTX_ALL, __uuidof(*ppStream), (void **)ppStream);
if (SUCCEEDED(hr))
{
hr = (*ppStream)->SetBaseStream(pMemStream, SPDFID_WaveFormatEx, pwfex);
if (FAILED(hr))
{
(*ppStream)->Release();
*ppStream = NULL;
}
}
pMemStream->Release();
}
return hr;
}
I accomplished it using the ISpStream. Use Setbasestream function of the ispstream to bind it to an istream and then set the output of ispvoice to that ispstream.
Here is my working solution if anybody wants it :
https://github.com/itsyash/MS-SAPI-demo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With