
text-to-speech-to-wav in Delphi

Tags: delphi, wav, sapi

I imported the SAPI type library into Delphi. I can output speech to the PC speakers with this code:

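procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
begin
  Voice := TSpVoice.Create(nil);
  try
    // Flags = 0 speaks synchronously, so the voice can be freed right after
    Voice.Speak('Hello World!', 0);
  finally
    Voice.Free;
  end;
end;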

I can output speech to a .wav file with this code:

procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
  Stream: TSpFileStream;
begin
  Voice := nil;
  Stream := nil;
  try
    Voice := TSpVoice.Create(nil);
    Stream := TSpFileStream.Create(nil);
    Stream.Open('c:\temp\test.wav', SSFMCreateForWrite, False);
    // Route the voice's output to the file stream instead of the speakers
    Voice.AudioOutputStream := Stream.DefaultInterface;
    Voice.Speak('Hello World!', 0);
    Stream.Close;
  finally
    Stream.Free;
    Voice.Free;
  end;
end;

The problem is that when I play back the .wav file it sounds terrible, like it's using a really low bitrate. Audacity tells me the file is mono 16-bit 22.05kHz but it sounds much worse than that.

How do I output speech to a mono 16-bit 44.1kHz .wav file that will sound exactly the same as speech output directly to the PC speakers? I could not figure out how to modify the second code sample to set the bits per sample and the bitrate.

Follow-up: Glenn's answer solves the bitrate issue. Thanks for that. But the quality of the speech output to the .wav file is still inferior to what is output directly to the speakers. I used screen recording software to record the output from the first block of code as helloworldtospeakers.wav. The second block of code, with Glenn's line added, produces helloworldtowav.wav. The second file clearly has some distortion in it. Any ideas?

Jan Goyvaerts asked Oct 14 '12 04:10


1 Answer

Look at the Format property of your file stream object. It is an SpAudioFormat object whose Type property sets the audio format. Type is an enumerated type (SpeechAudioFormatType) with a great many options, so you'll need to look through them to find the one you want.

This line should do it for you (at least with the version of the type library I used).

Stream.Format.Type_ := SAFT44kHz16BitMono;
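
For context, here is a sketch of how that line might fit into the second procedure from the question. The answer does not say where the line belongs. Placing it before Stream.Open is an assumption, based on the idea that the format has to be known when the file is created. The try/finally cleanup is also an addition, and the unit that declares TSpVoice and TSpFileStream depends on how the type library was imported (SpeechLib_TLB is the usual name).

```pascal
procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
  Stream: TSpFileStream;
begin
  Voice := nil;
  Stream := nil;
  try
    Voice := TSpVoice.Create(nil);
    Stream := TSpFileStream.Create(nil);
    // Assumption: choose the format before opening the file for writing
    Stream.Format.Type_ := SAFT44kHz16BitMono;
    Stream.Open('c:\temp\test.wav', SSFMCreateForWrite, False);
    // Send the synthesized speech to the file instead of the speakers
    Voice.AudioOutputStream := Stream.DefaultInterface;
    Voice.Speak('Hello World!', 0);
    Stream.Close;
  finally
    // Free is safe on nil, so this also covers a failed Create
    Stream.Free;
    Voice.Free;
  end;
end;
```

This only addresses the sample rate and bit depth of the file. It is not claimed to fix the distortion described in the follow-up.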
Glenn1234 answered Oct 14 '22 11:10