I want to transcribe longer audio files (at least 5 minutes) using REST APIs from Microsoft. There are a lot of different products and names, e.g. Speech service API or Bing Speech API. None of the REST APIs I have tried so far supports transcribing longer audio files.
The documentation states there is a REST API exactly for this case: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription
What is the endpoint for this service?
There is a sample available on GitHub here: https://github.com/PanosPeriorellis/Speech_Service-BatchTranscriptionAPI
The endpoint is CRIS's endpoint, as in this code:
private const string HostName = "cris.ai";
// ...
var client = CrisClient.CreateApiV2Client(SubscriptionKey, HostName, Port);
Then I found in the documentation that the API is exposed through Swagger (linked from there), which makes it easier to explore the available methods (switch from 2.0beta to 2.0 at the top).
So to create a new transcription, the path is /api/speechtotext/v2.0/transcriptions, called with the POST method on the host used in the sample above (cris.ai).
Please note that the subscription key used for transcription must be on the Standard (S0) pricing tier, not the Free one.
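To make the pieces above concrete, here is a minimal sketch of how the host name from the sample and the transcriptions path could be combined into a POST request without the CrisClient wrapper. It only builds the request and prints its method and URI; nothing is sent. Several things here are my assumptions rather than facts from the sample: that the host is reached over HTTPS on the default port, that the key goes in the usual Cognitive Services `Ocp-Apim-Subscription-Key` header, and the names `TranscriptionEndpointSketch` and `BuildCreateRequest`, which are hypothetical. The JSON body is a placeholder; the real fields are the ones listed in the Swagger definition.

```csharp
using System;
using System.Net.Http;
using System.Text;

public static class TranscriptionEndpointSketch
{
    // Host name as used in the GitHub sample.
    private const string HostName = "cris.ai";

    // Path for creating a transcription, as listed in Swagger (v2.0).
    private const string TranscriptionsPath = "/api/speechtotext/v2.0/transcriptions";

    // Builds (but does not send) the POST request for a new transcription.
    public static HttpRequestMessage BuildCreateRequest(string subscriptionKey, string jsonBody)
    {
        // Assumption: HTTPS on the default port.
        var uri = new UriBuilder("https", HostName) { Path = TranscriptionsPath }.Uri;

        var request = new HttpRequestMessage(HttpMethod.Post, uri)
        {
            Content = new StringContent(jsonBody, Encoding.UTF8, "application/json")
        };

        // Assumption: the key is passed in the standard Cognitive Services header.
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        return request;
    }

    public static void Main()
    {
        // Placeholder key and body; take the real body schema from Swagger.
        using var request = BuildCreateRequest("<your-S0-subscription-key>", "{}");
        Console.WriteLine($"{request.Method} {request.RequestUri}");
    }
}
```

To actually submit the job you would pass the request to `HttpClient.SendAsync`. Remember that a Free-tier key will be rejected, as noted above.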