I have a Google bigQuery Table and I want to stream the entire table into pub-sub Topic
what should be the easy/fast way to do it?
Thank you in advance,
The Pub/Sub Subscription to BigQuery template is a streaming pipeline that reads JSON-formatted messages from a Pub/Sub subscription and writes them to a BigQuery table. You can use the template as a quick solution to move Pub/Sub data to BigQuery.
Streaming is not available through the free tier. If you attempt to use streaming without enabling billing, you receive the following error: BigQuery: Streaming insert is not allowed in the free tier.
2019 update:
Now it's really easy with a click-to-bigquery option in Pub/Sub:
Find it on: https://console.cloud.google.com/cloudpubsub/topicList
The easiest way I know of is going through Google Cloud Dataflow, which natively knows how to access BigQuery and Pub/Sub.
In theory it should be as easy as the following Python lines:
p = beam.Pipeline(options=pipeline_options)
tablerows = p | 'read' >> beam.io.Read(
beam.io.BigQuerySource('clouddataflow-readonly:samples.weather_stations'))
tablerows | 'write' >> beam.io.Write(
beam.io.PubSubSink('projects/fh-dataflow/topics/bq2pubsub-topic'))
This combination of Python/Dataflow/BigQuery/PubSub doesn't work today (Python Dataflow is in beta, but keep an eye on the changelog).
We can do the same with Java, and it works well - I just tested it. It runs either locally, and also in the hosted Dataflow runner:
Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
PCollection<TableRow> weatherData = p.apply(
BigQueryIO.Read.named("ReadWeatherStations").from("clouddataflow-readonly:samples.weather_stations"));
weatherData.apply(ParDo.named("tableRow2string").of(new DoFn<TableRow, String>() {
@Override
public void processElement(DoFn<TableRow, String>.ProcessContext c) throws Exception {
c.output(c.element().toString());
}
})).apply(PubsubIO.Write.named("WriteToPubsub").topic("projects/myproject/topics/bq2pubsub-topic"));
p.run();
Test if the messages are there with:
gcloud --project myproject beta pubsub subscriptions pull --auto-ack sub1
Hosted Dataflow screenshot:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With