We are currently using Google's Cloud Dataflow SDK (1.6.0) to run dataflow jobs in GCP, however, we are considering moving to the Apache Beam SDK (0.1.0). We will still be running our jobs in GCP using the dataflow service. Has anyone gone through this transition and have advice? Are there any compatibility issues here and is this move encouraged by GCP?
Formally Beam is not yet supported on Dataflow (although that is certainly what we are working towards). We recommend staying with the Dataflow SDK, especially if SLA or support are important to you. that said, our tests show that Beam runs on Dataflow, and although that may break at any time, you are certainly welcome to attempt at your own risk.
Update: The Dataflow SDKs are now based on Beam as of the release of Dataflow SDK 2.0 (https://cloud.google.com/dataflow/release-notes/release-notes-java-2). Both Beam and the Dataflow SDKs are currently supported on Cloud Dataflow.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With