
Amazon Transcribe Streaming API without SDK

I am trying to use Amazon's new streaming transcribe API from Go 1.11. Amazon currently provides only a Java SDK, so I am trying the low-level way.

The only relevant piece of documentation is here, but it does not show the endpoint. I found in a Java example that it is https://transcribestreaming.<region>.amazonaws.com, and I am trying the Ireland region, i.e. https://transcribestreaming.eu-west-1.amazonaws.com. Here is my code to open an HTTP/2 bi-directional stream:

import (
    "crypto/tls"
    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/aws/external"
    "github.com/aws/aws-sdk-go-v2/aws/signer/v4"
    "golang.org/x/net/http2"
    "io"
    "io/ioutil"
    "log"
    "net/http"
    "os"
    "time"
)

const (
    HeaderKeyLanguageCode   = "x-amzn-transcribe-language-code"  // en-US
    HeaderKeyMediaEncoding  = "x-amzn-transcribe-media-encoding" // pcm only
    HeaderKeySampleRate     = "x-amzn-transcribe-sample-rate"    // 8000, 16000 ... 48000
    HeaderKeySessionId      = "x-amzn-transcribe-session-id"     // For retrying a session. Pattern: [a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}
    HeaderKeyVocabularyName = "x-amzn-transcribe-vocabulary-name"
    HeaderKeyRequestId      = "x-amzn-request-id"
)

...

region := "eu-west-1"

cfg, err := external.LoadDefaultAWSConfig(aws.Config{
    Region: region,
})
if err != nil {
    log.Printf("could not load default AWS config: %v", err)
    return
}

signer := v4.NewSigner(cfg.Credentials)

transport := &http2.Transport{
    TLSClientConfig: &tls.Config{
        // allow insecure just for debugging
        InsecureSkipVerify: true,
    },
}
client := &http.Client{
    Transport: transport,
}

signTime := time.Now()

header := http.Header{}
header.Set(HeaderKeyLanguageCode, "en-US")
header.Set(HeaderKeyMediaEncoding, "pcm")
header.Set(HeaderKeySampleRate, "16000")
header.Set("Content-type", "application/json")

// Bi-directional streaming via a pipe.
pr, pw := io.Pipe()

req, err := http.NewRequest(http.MethodPost, "https://transcribestreaming.eu-west-1.amazonaws.com/stream-transcription", ioutil.NopCloser(pr))
if err != nil {
    log.Printf("err: %+v", err)
    return
}
req.Header = header

_, err = signer.Sign(req, nil, "transcribe", region, signTime)
if err != nil {
    log.Printf("problem signing headers: %+v", err)
    return
}

// This freezes and ends after 5 minutes with "unexpected EOF".
res, err := client.Do(req)
...

The problem is that executing the request (client.Do(req)) freezes for five minutes and then ends with an "unexpected EOF" error.

Any ideas what I am doing wrong? Has anyone successfully used the new streaming transcribe API without the Java SDK?

EDIT (March 11, 2019):

I tested this again, and now it no longer times out but immediately returns a 200 OK response. The response body contains an "exception", though: {"Output":{"__type":"com.amazon.coral.service#SerializationException"},"Version":"1.0"}

I tried opening the HTTP/2 stream with io.Pipe (like the code above) and also with the JSON body described in the documentation:

{
    "AudioStream": { 
        "AudioEvent": { 
            "AudioChunk": ""
        }
    }
}

The result was the same.

EDIT (March 13, 2019):

As mentioned by @gpeng, removing the content-type from the headers fixes the SerializationException. But then there is an IAM exception, and you need to add the transcribe:StartStreamTranscription permission to your IAM user. That permission does not appear anywhere in the AWS IAM console, though, and must be added manually as a custom JSON permission :/

There is also another, newer documentation page here, which shows an incorrect host and a new content-type (do not use that content-type; the request returns 404 with it).

After removing the content-type and adding the new permission, I now get the exception {"Message":"A complete signal was sent without the preceding empty frame."}. Writing to the pipe also blocks forever, so I am stuck again. The messages described in the new documentation differ from those in the old one (they are now finally binary), but I do not understand them. Any ideas how to send such HTTP/2 messages in Go?

EDIT (March 15, 2019):

If you get an HTTP 403 error about a signature mismatch, do not set the transfer-encoding and x-amz-content-sha256 HTTP headers. When I set them and then sign the request with the AWS SDK's V4 signer, I receive HTTP 403 "The request signature we calculated does not match the signature you provided."

shelll asked Dec 12 '18 13:12



2 Answers

I reached out to AWS support and they now recommend using WebSockets instead of HTTP/2 when possible (blog post here).

If this fits your use case, I would highly recommend checking out the new example repo at https://github.com/aws-samples/amazon-transcribe-websocket-static, which shows a browser-based solution in JS.

I've also noticed that the author of the demo has an Express example on his personal GitHub at https://github.com/brandonmwest/amazon-transcribe-websocket-express, but I haven't confirmed whether it works.

I appreciate these examples aren't in Python, but I think you'll have better luck using the WebSocket client as opposed to HTTP/2 (which, let's be honest, is still a bit terrifying :P)

Calvin answered Oct 01 '22 02:10



Try not setting the content type header and see what response you get. I'm trying to do the same thing (but in Ruby) and that 'fixed' the SerializationException. Still can't get it to work but I've now got a new error to think about :)

UPDATE: I have got it working now. My issue was with the signature. If both the host and authority headers are passed, they are joined with a comma and treated as the host on the server side when the signature is checked, so the signatures never match. That doesn't seem like correct behaviour on the AWS side, but it doesn't look like it's going to be an issue for you in Go.

gpeng answered Oct 01 '22 03:10
