I am attempting to find a way to take synthesized speech and record it to an audio file. I am currently using pyttsx as my text-to-speech library, but there isn't a mechanism for saving the output to a file, only playing it directly from the speakers. I've looked into detecting and recording audio as well as PyAudio, but these seem to take input from a microphone rather than redirecting outgoing audio to a file. Is there a known way to do this?
You can call espeak with the -w argument using subprocess.
import subprocess

def textToWav(text, file_name):
    # -w writes the synthesized speech to a WAV file instead of playing it
    subprocess.call(["espeak", "-w", file_name + ".wav", text])

textToWav('hello world', 'hello')
This writes file_name.wav without speaking the text aloud. If your text is in a file (e.g. text.txt), pass the path of that file with the -f option instead of passing the text itself (for example "-f", "text.txt"). I'd recommend reading the espeak man page to see all the available options.
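To make the two input modes concrete, here is a minimal sketch that builds the espeak argument list either from a literal string or from a text file. The helper names build_espeak_command and synthesize are my own (hypothetical, not part of espeak or of any library), and it assumes espeak is installed and on your PATH. It uses only the -w, -f and -s options.

```python
import subprocess


def build_espeak_command(wav_path, text=None, text_file=None, speed=None):
    """Return the espeak argument list that writes speech to wav_path.

    Hypothetical helper: pass exactly one of `text` (a literal string)
    or `text_file` (path of a file whose contents should be spoken).
    """
    if (text is None) == (text_file is None):
        raise ValueError("pass exactly one of text or text_file")

    cmd = ["espeak", "-w", wav_path]      # -w: write output to a WAV file
    if speed is not None:
        cmd += ["-s", str(speed)]         # -s: speed in words per minute
    if text_file is not None:
        cmd += ["-f", text_file]          # -f: read the text from a file
    else:
        cmd.append(text)                  # literal text is the last argument
    return cmd


def synthesize(wav_path, **kwargs):
    """Run espeak (assumed to be on PATH); raises if it exits with an error."""
    subprocess.check_call(build_espeak_command(wav_path, **kwargs))


# Only prints the command, so this runs even where espeak is not installed.
print(build_espeak_command("hello.wav", text="hello world"))
```

Calling synthesize("hello.wav", text_file="text.txt") should then write hello.wav from the contents of text.txt. Keeping the command construction separate from the subprocess call makes it easy to inspect or log the exact command before running it.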
Hope this helps.
You can use a more advanced SAPI wrapper to save the output to a WAV file. For example, you can try:
https://github.com/DeepHorizons/tts
The code should look like this:
import tts.sapi
voice = tts.sapi.Sapi()
voice.set_voice("Joey")
voice.create_recording('hello.wav', "Hello")
Here is an example that uses the NSSpeechSynthesizer API (macOS, via PyObjC) to write the speech to a file:
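#!/usr/bin/env python
from AppKit import NSSpeechSynthesizer
import sys
import Foundation

# Take the text from the command line, or prompt for it.
if len(sys.argv) < 2:
    text = input('type text to speak> ')  # use raw_input() on Python 2
else:
    text = sys.argv[1]

nssp = NSSpeechSynthesizer
ve = nssp.alloc().init()
ve.setRate_(100)  # speaking rate in words per minute
# Write the speech to an AIFF file instead of playing it.
url = Foundation.NSURL.fileURLWithPath_('yourpath/test.aiff')
ve.startSpeakingString_toURL_(text, url)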