I imported the SAPI type library into Delphi. I can output speech to the PC speakers with this code:
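procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
begin
  Voice := TSpVoice.Create(nil);
  try
    // Flags = 0 speaks synchronously, so freeing the voice afterwards is safe
    Voice.Speak('Hello World!', 0);
  finally
    Voice.Free;
  end;
end;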
I can output speech to a .wav file with this code:
procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
  Stream: TSpFileStream;
begin
  Voice := TSpVoice.Create(nil);
  Stream := TSpFileStream.Create(nil);
  try
    Stream.Open('c:\temp\test.wav', SSFMCreateForWrite, False);
    // Redirect the voice from the default audio device to the file stream
    Voice.AudioOutputStream := Stream.DefaultInterface;
    Voice.Speak('Hello World!', 0);
    Stream.Close;
  finally
    Voice.Free;
    Stream.Free;
  end;
end;
The problem is that when I play back the .wav file it sounds terrible, like it's using a really low bitrate. Audacity tells me the file is mono 16-bit 22.05 kHz, but it sounds much worse than that.
How do I output speech to a mono 16-bit 44.1 kHz .wav file that will sound exactly the same as speech output directly to the PC speakers? I could not figure out how to modify the second code sample to set the bits per sample and the bitrate.
Follow-up: Glenn's answer solves the bitrate issue. Thanks for that. But the quality of the speech output to the .wav file is still inferior to what is output directly to the speakers. I used screen recording software to record the output from the first block of code as helloworldtospeakers.wav. The second block of code, with Glenn's line added, produces helloworldtowav.wav. The second file clearly has some distortion in it. Any ideas?
See the Format property on your file stream object. It's an SpAudioFormat object with a Type property (imported into Delphi as Type_, because type is a reserved word) that you use to set the audio format. That property is an enumerated type with a great many options, so you'll need to study them to get what you want.
This line should get it for you (at least with the version of the type library I used):
Stream.Format.Type_ := SAFT44kHz16BitMono;
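For context, here is a rough sketch of how that line might fit into the second code sample from the question. It assumes the imported type library unit (often named SpeechLib_TLB, but that depends on how you imported it) is in the form's uses clause, and that the format should be set before the stream is opened for writing. The try/finally cleanup is an addition for illustration, not part of the original code.

```delphi
procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
  Stream: TSpFileStream;
begin
  Voice := TSpVoice.Create(nil);
  Stream := TSpFileStream.Create(nil);
  try
    // Choose the output format first (assumption: this must precede Open)
    Stream.Format.Type_ := SAFT44kHz16BitMono;
    Stream.Open('c:\temp\test.wav', SSFMCreateForWrite, False);
    // Send the synthesized speech to the file instead of the speakers
    Voice.AudioOutputStream := Stream.DefaultInterface;
    Voice.Speak('Hello World!', 0);  // 0 = default flags, synchronous
    Stream.Close;
  finally
    Voice.Free;
    Stream.Free;
  end;
end;
```

If the file still comes out at 22.05 kHz, check in the debugger that the format assignment runs before the Open call.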