
How can I write integration tests with Google Cloud BigQuery?

We are in the process of migrating from Apache HBase to BigQuery.

Currently we have end-to-end tests (using Cucumber) that run against a Docker container running HBase.

There don't seem to be any BigQuery Docker containers or emulators (https://cloud.google.com/sdk/gcloud/reference/beta/emulators/).

How can we create end-to-end tests for an application that works with BigQuery?

Richard Deurwaarder asked Mar 17 '18 12:03



2 Answers

Currently there is no BigQuery local emulator or anything similar. As the link you shared about available GCP emulators shows, some other products do have one. BigQuery probably lacks one because its true potential only shows when it runs on its real infrastructure. In addition, the cost of working with BigQuery can be relatively low depending on how you use it, and there is a Free Tier to get started with.

Let me summarize some info about BigQuery pricing that can be useful for you:

  • BigQuery storage and operation costs are summarized in the pricing documentation.
  • BigQuery offers some operations that are free of charge.
  • There's a Storage free tier with 10 GB of free storage. That may not be a lot, given that BigQuery is designed to work with enormous amounts of data, but it can be a good starting point for running some tests.
  • There's also an Operations free tier, where the first terabyte of processed data (per month) is free of charge.
  • You can set up alerts in order to monitor usage with Stackdriver, using the available metrics.

In any case, if you still think that working with BigQuery directly is not the best option for you, you can always forward your request to the Engineering team by creating a Feature Request in the Public Issue Tracker for BigQuery. It will be up to the engineering team to decide whether (and when) to implement such a feature, especially given the complexity of BigQuery and the fact that its performance is optimized for its current architecture.
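If you do end up testing against the real service, one way to keep runs isolated and cheap is to give every test run its own throwaway dataset and delete it afterwards. The following is only a minimal sketch: `temporary_dataset` and the `it_` prefix are hypothetical names, and it assumes the `client` argument behaves like `google.cloud.bigquery.Client`, using only its `create_dataset` and `delete_dataset` methods.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_dataset(client, prefix="it_"):
    """Create a uniquely named dataset, yield its ID, and always clean it up.

    `client` is assumed to expose create_dataset() and delete_dataset()
    like google.cloud.bigquery.Client (hypothetical helper, not a library API).
    """
    # A random hex suffix keeps parallel test runs from colliding.
    dataset_id = f"{prefix}{uuid.uuid4().hex}"
    client.create_dataset(dataset_id)
    try:
        yield dataset_id
    finally:
        # Remove the dataset and any tables the test created inside it.
        client.delete_dataset(dataset_id, delete_contents=True, not_found_ok=True)
```

With the official Python client installed and credentials configured, this would be used as `with temporary_dataset(bigquery.Client()) as dataset_id: ...`, creating the test tables inside `dataset_id`.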

dsesto answered Sep 26 '22 15:09


This is an old post, but if you can use Python and you plan to test your SQL and assert on the results for a given input, I would suggest bq-test-kit. This framework lets you interact with BigQuery from Python and make your tests reliable.

You have 3 ways to inject data into it:

  • Create datasets and tables with isolated names, so that you have your own namespace
  • Rely on temp tables, where data is inserted with data literals
  • Merge data literals directly into your query
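To illustrate the third option, here is a hand-rolled sketch of the general idea of merging data literals into a query. It is not bq-test-kit's API: `sql_literal`, `rows_to_cte` and `with_data_literals` are hypothetical helpers. It assumes the query under test refers to its tables by bare name (so a CTE of the same name can stand in for them), does not already start with `WITH`, and only uses strings, numbers, booleans and NULLs as test values.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (small subset of types only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # must come before int: bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def rows_to_cte(name, rows):
    """Turn a list of dict rows into `name AS (SELECT ... UNION ALL ...)`."""
    selects = []
    for row in rows:
        columns = ", ".join(f"{sql_literal(v)} AS {k}" for k, v in row.items())
        selects.append(f"SELECT {columns}")
    body = "\n  UNION ALL\n  ".join(selects)
    return f"{name} AS (\n  {body}\n)"


def with_data_literals(query, tables):
    """Prefix `query` with CTEs that replace the named tables with test rows."""
    ctes = ",\n".join(rows_to_cte(name, rows) for name, rows in tables.items())
    return f"WITH {ctes}\n{query}"


sql = with_data_literals(
    "SELECT COUNT(*) AS n FROM events",
    {"events": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]},
)
print(sql)
```

The resulting SQL still has to be run by BigQuery, but it reads no stored table, so the test input lives next to the assertion. Note that a column whose first row is `NULL` carries no type information in this sketch, so put a typed row first or add an explicit cast.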

Hope that this helps.

Bounkong KHAMPHOUSONE answered Sep 26 '22 15:09