We have a Node.js API hosted on Google Kubernetes Engine, and we'd like to start logging events into BigQuery.
I can see 3 different ways of doing that:
For this particular use case, we don't need any transforms and will send events that are already in the right format. But we may later have other use cases where we need to sync tables from our main datastore (MySQL) into BQ for analytics, so maybe it's worth starting with Dataflow straight away?
A few questions:
To stream data into BigQuery, you need the following IAM permissions: bigquery.tables.updateData (lets you insert data into the table)
Benefits of Dataflow Shuffle: Faster execution time of batch pipelines for the majority of pipeline job types. A reduction in consumed CPU, memory, and Persistent Disk storage resources on the worker VMs. Better autoscaling, since VMs no longer hold any shuffle data and can therefore be scaled down earlier.
The user sends a streaming insert into BigQuery via the tabledata.insertAll method. This insert is sent to the API in JSON format, along with other details such as authorization headers and details about the intended destination. A single insertAll call may have one or more individual records within it.
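To make the shape of an insertAll call concrete, here is a minimal sketch that only builds the request URL and JSON body; it does not send anything or handle authorization. The project, dataset, table, and event fields are hypothetical placeholders, and the `id`/`payload` layout of the input events is an assumption made for this example.

```javascript
// Sketch: build the URL and JSON body for a tabledata.insertAll request.
// Nothing is sent; authorization headers are out of scope here.

function insertAllUrl(projectId, datasetId, tableId) {
  return (
    "https://bigquery.googleapis.com/bigquery/v2" +
    `/projects/${projectId}/datasets/${datasetId}/tables/${tableId}/insertAll`
  );
}

function buildInsertAllBody(events) {
  return {
    kind: "bigquery#tableDataInsertAllRequest",
    // One entry per record. `json` holds the row itself; `insertId` is an
    // optional ID BigQuery can use for best-effort de-duplication on retries.
    rows: events.map((event) => ({
      insertId: event.id,
      json: event.payload,
    })),
  };
}

// Hypothetical events; the payload keys would need to match the table schema.
const body = buildInsertAllBody([
  { id: "evt-1", payload: { type: "signup", userId: 42 } },
  { id: "evt-2", payload: { type: "login", userId: 42 } },
]);

console.log(insertAllUrl("my-project", "analytics", "events"));
console.log(JSON.stringify(body, null, 2));
```

In a real service you would more likely let the official Node.js client (`@google-cloud/bigquery`, via `table.insert(rows)`) build and send this request for you; the sketch is only meant to show that one call can carry several records.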
Dataflow provides a simplified pipeline development environment that uses the Apache Beam SDK to transform incoming data and then output the transformed data. If you want to write messages to BigQuery directly, without configuring Dataflow to provide data transformation, use a Pub/Sub BigQuery subscription.
For Option 2: yes, there is a preset, called a Google-provided Template, that moves data from Pub/Sub to BigQuery without you having to write any code.
You can learn more about how to use this Google-provided Template, and others, in the Cloud Dataflow documentation.
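On the Node.js side, the only code needed for this route is the publish step. The sketch below assumes the template reads JSON messages whose keys match the destination table's columns, and that `topic` exposes `publishMessage({ data })` the way a `Topic` from `@google-cloud/pubsub` does. The stand-in topic and the event fields are hypothetical, and exist only so the example runs without credentials.

```javascript
// Sketch: encode an event as a JSON Pub/Sub message and publish it.

function toPubSubMessage(event) {
  // Pub/Sub message data is a byte buffer; here it carries the event as JSON.
  return { data: Buffer.from(JSON.stringify(event), "utf8") };
}

// `topic` is any object with publishMessage({ data }), for example a Topic
// obtained from the @google-cloud/pubsub client.
async function publishEvent(topic, event) {
  return topic.publishMessage(toPubSubMessage(event));
}

// Hypothetical stand-in for a real topic: it records messages in memory.
const published = [];
const fakeTopic = {
  async publishMessage(message) {
    published.push(message);
    return String(published.length); // the real client resolves to a message ID
  },
};

publishEvent(fakeTopic, { type: "signup", userId: 42 }).then((messageId) => {
  console.log("published message", messageId);
});
```

Swapping `fakeTopic` for a real topic object should be the only change needed to publish for real, under the assumptions above.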
Another option is to export the logs using a log sink. Right from the Stackdriver Logging UI, you can specify BigQuery (or other destinations) for your logs. Since your Node API is running in Kubernetes, you just need to log messages to stdout and they'll automatically get written to Stackdriver.
Reference: https://cloud.google.com/logging/docs/export/configure_export_v2
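For the stdout approach described in this answer, one option is to write each event as a single line of JSON, which the logging agent on GKE generally parses into a structured payload. This is a minimal sketch: the helper names and event fields are hypothetical, and using `severity` and `message` as keys assumes the agent's usual handling of those fields.

```javascript
// Sketch: emit one JSON object per line on stdout for the logging agent.

function formatEvent(name, fields, severity = "INFO") {
  // `fields` is spread last, so it can add arbitrary event attributes.
  return JSON.stringify({ severity, message: name, ...fields });
}

function logEvent(name, fields) {
  // One event per line; the trailing newline terminates the log entry.
  process.stdout.write(formatEvent(name, fields) + "\n");
}

logEvent("user_signup", { userId: 42, plan: "free" });
```

Keeping each event on one line matters here, because multi-line output would typically be split into separate log entries.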