
Amazon Textract returns different results between WebApp Demo, AnalyzeDocumentRequest and StartDocumentAnalysisRequest

This is my first question on Stack Overflow. I would like to extract key-value pairs (FORMS) from a scanned PDF document via Amazon Textract. What I have noticed, however, is that some key-value pairs returned by the webapp demo (https://us-east-2.console.aws.amazon.com/textract/home?region=us-east-2#/demo) are missing from the results of the API methods I can call from code.

Furthermore, the two API methods disagree with each other. The synchronous method (AnalyzeDocumentRequest), which does not accept PDF and so forces me to convert the document to an image first, finds key-value pairs (Sync Result Example) that the asynchronous method (StartDocumentAnalysisRequest) does not (Async Result Example).

The problem looks similar to the one described in this question, which discusses the difference in results between the two ways of analyzing a document: AWS Textract - GetDocumentAnalysisRequest only returns correct results for first page of document

My code follows these examples:

  • Synchronous Method: https://docs.aws.amazon.com/textract/latest/dg/examples-extract-kvp.html
  • Asynchronous Method: https://github.com/awsdocs/amazon-textract-developer-guide/blob/master/doc_source/async-analyzing-with-sqs.md

Has anyone ever had the same problem?

the_nibble asked Nov 02 '25 21:11


1 Answer

We had this problem recently. The demo website provided by AWS found 50 fields, while our own code using the provided API yielded only 30.

After some trial and error and a lot of googling, we found that the response returned by GetDocumentAnalysisAsync included a NextToken, which is used to request more results. It turned out we had to call GetDocumentAnalysisAsync again with this token, and keep repeating, until the response no longer included a NextToken.

At that point we knew we had all the data.
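
For anyone who wants to see the loop spelled out, here is a minimal sketch of that NextToken pagination in Python. It is not our actual code (we used the .NET SDK's GetDocumentAnalysisAsync); it assumes the boto3 equivalent, `get_document_analysis`, and assumes the job has already reached the SUCCEEDED status. The helper names `collect_analysis_blocks` and `count_form_keys` are made up for illustration.

```python
from typing import Any, Dict, List


def collect_analysis_blocks(client: Any, job_id: str) -> List[Dict[str, Any]]:
    """Gather every Block from a finished asynchronous analysis job.

    Assumption: `client` is a boto3 Textract client, i.e.
    boto3.client("textract"), or any object exposing a compatible
    get_document_analysis(JobId=..., NextToken=...) method.
    """
    blocks: List[Dict[str, Any]] = []
    next_token = None
    while True:
        kwargs = {"JobId": job_id}
        if next_token:
            # Only send NextToken on follow-up calls.
            kwargs["NextToken"] = next_token
        response = client.get_document_analysis(**kwargs)
        blocks.extend(response.get("Blocks", []))
        next_token = response.get("NextToken")
        if not next_token:
            # No token in the response means this was the last page.
            break
    return blocks


def count_form_keys(blocks: List[Dict[str, Any]]) -> int:
    """Count the KEY side of each key-value pair in the collected blocks."""
    return sum(
        1
        for block in blocks
        if block.get("BlockType") == "KEY_VALUE_SET"
        and "KEY" in block.get("EntityTypes", [])
    )
```

With a real client this would be called as `collect_analysis_blocks(boto3.client("textract"), job_id)`, and comparing `count_form_keys(...)` on the first response alone against the full list should show whether missing pages explain a gap like our 30 versus 50 fields.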

Jonathan Smith answered Nov 05 '25 15:11


