I am saving data to BigQuery through a Google Dataflow streaming job.
I want to insert this data into Elasticsearch for rapid access.
Is it good practice to call Logstash from Dataflow over HTTP?
The Apache Beam Java SDK has a connector (ElasticsearchIO) for reading from and writing to Elasticsearch. Using it keeps the I/O consistent with the Beam model, so you can write to Elasticsearch directly from your Dataflow pipeline rather than calling Logstash over HTTP.
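Here is a minimal sketch of what that write step could look like, assuming the beam-sdks-java-io-elasticsearch dependency is on the classpath. The cluster address, index name, and the use of Create.of(...) as a stand-in source are placeholders; in your streaming job the PCollection of JSON documents would come from your existing source (for example Pub/Sub), upstream of or alongside the BigQuery write.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

public class WriteToElasticsearchExample {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline pipeline = Pipeline.create(options);

    // Stand-in source so the sketch is self-contained; each element is one
    // JSON document as a String. In a real streaming job this PCollection
    // would come from your existing source (e.g. Pub/Sub).
    PCollection<String> jsonDocs =
        pipeline.apply(Create.of(
            "{\"user\":\"alice\",\"event\":\"login\"}",
            "{\"user\":\"bob\",\"event\":\"purchase\"}"));

    // Hypothetical cluster address and index name -- replace with your own.
    ElasticsearchIO.ConnectionConfiguration connection =
        ElasticsearchIO.ConnectionConfiguration.create(
            new String[] {"http://elasticsearch.example.com:9200"},
            "events",   // index
            "_doc");    // type (older Elasticsearch versions; newer Beam
                        // releases also offer a type-less overload)

    jsonDocs.apply(
        "WriteToElasticsearch",
        ElasticsearchIO.write()
            .withConnectionConfiguration(connection)
            .withMaxBatchSize(1000L));  // batch documents into bulk requests

    pipeline.run();
  }
}
```

The connector issues bulk requests against the Elasticsearch REST API and participates in Beam's bundling and autoscaling, which is the main reason to prefer it over an external HTTP hop through Logstash.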