I'm trying to find a package that converts the JSON response from the Amazon AWS Transcribe service into captions, with no luck so far.
You can see an example of the JSON in the JavaScript part of the Fiddle.
I wouldn't like to take the naive approach and just "bundle" like 10 words together as that would space the captions in a weird way.
I'd even accept a programmatic way of doing it using the Google Speech service or Speechmatics. They all return a JSON file broken down by word.
Has anyone worked with this before?
Thanks!
Inspired by yash's answer, I took his tool and made some small changes. Feel free to use it.
https://apoorv.blog/aws-transcribe-json-to-srt.html
I personally use this tool for my own purposes, so expect it to stay updated.
You have probably found a way to do this or written a script by now. I also looked for a ready-made solution and didn't find one, so I ended up writing some JavaScript code to generate SRT from the JSON output of Amazon Transcribe.
https://www.yash.info/aws-srt-creator.htm
I am breaking sentences at the period (.). It's a standalone HTML file. Feel free to download and modify it as required.
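For anyone who would rather script this, here is a minimal Python sketch of the same idea (one caption per sentence, split at sentence-ending punctuation). It is not the code behind the tool linked above. It assumes the standard Amazon Transcribe output layout, where results.items holds "pronunciation" entries with start_time and end_time and "punctuation" entries without timestamps. The function names srt_timestamp and transcribe_to_srt are made up for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def transcribe_to_srt(transcript: dict) -> str:
    """Build SRT text from a parsed Amazon Transcribe JSON document.

    Assumes transcript["results"]["items"] is the usual word-level list.
    A caption ends at ".", "?" or "!"; leftover words form a final caption.
    """
    cues = []  # list of (start_seconds, end_seconds, text)
    words, start, end = [], None, None

    for item in transcript["results"]["items"]:
        content = item["alternatives"][0]["content"]

        if item["type"] == "punctuation":
            # Punctuation has no timestamps; attach it to the previous word.
            if words:
                words[-1] += content
                if content in {".", "?", "!"}:
                    cues.append((start, end, " ".join(words)))
                    words, start, end = [], None, None
            continue

        if start is None:
            start = float(item["start_time"])
        end = float(item["end_time"])
        words.append(content)

    if words:  # trailing words with no closing punctuation
        cues.append((start, end, " ".join(words)))

    blocks = [
        f"{i}\n{srt_timestamp(s)} --> {srt_timestamp(e)}\n{text}"
        for i, (s, e, text) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"
```

To use it, load the file Transcribe gives you with json.load and write the returned string to a .srt file. Long sentences will produce long captions, so you may want an extra split on a maximum character count or on long pauses between words.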
I've used this Python script from GitHub and it formats the transcript really nicely as a docx file. The output even includes scatterplots of the word confidence levels, and it changes the color of lower-confidence words.
https://github.com/kibaffo33/aws_transcribe_to_docx
This worked really well for me, and I think you could have it output HTML fairly simply if you wanted to alter the Python script.
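As a rough idea of what the HTML route could look like, here is a small standalone sketch. It is not taken from the linked script. It assumes the usual Transcribe layout, where each item in results.items has an alternatives list with content and confidence strings. The function name transcript_to_html and the 0.8 threshold are arbitrary choices for the example.

```python
import html


def transcript_to_html(transcript: dict, threshold: float = 0.8) -> str:
    """Render a Transcribe result as one HTML paragraph.

    Words whose confidence is below `threshold` are wrapped in a red span.
    """
    parts = []
    for item in transcript["results"]["items"]:
        alt = item["alternatives"][0]
        text = html.escape(alt["content"])

        if item["type"] == "punctuation":
            # Attach punctuation to the previous word; ignore its confidence.
            if parts:
                parts[-1] += text
            continue

        if float(alt["confidence"]) < threshold:
            text = f'<span style="color: red">{text}</span>'
        parts.append(text)

    return "<p>" + " ".join(parts) + "</p>"
```

Pass it the dictionary you get from json.load on the Transcribe output and write the result into an HTML file.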