Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Cloud Data flow jobs failing with error 'Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5...'

SDK: Apache Beam SDK for Go 0.5.0

We are running Apache Beam Go SDK jobs in Google Cloud Data Flow. They had been working fine until recently when they intermittently stopped working (no changes made to code or config). The error that occurs is:

Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5 for /var/opt/google/staged/worker: ..., want ; bad MD5 for /var/opt/google/staged/worker: ..., want ;

(Note: It seems as if it's missing a second hash value in the error message message.)

As best I can guess there's something wrong with the worker - It seems to be trying to compare md5 hashes of the worker and missing one of the values? I don't know exactly what it's comparing to though.

Does anybody know what could be causing this issue?

like image 800
Tim Avatar asked Dec 17 '18 22:12

Tim


1 Answers

The fix to this issue seems to have been to rebuild the worker_harness_container_image with the latest changes. I had tried this but I didn't have the latest release when I built it locally. After I pulled the latest from the Beam repo, and rebuilt the image (As per the notes here https://github.com/apache/beam/blob/master/sdks/CONTAINERS.md) and reran it seemed to work again.

like image 73
Tim Avatar answered Oct 13 '22 10:10

Tim