
Export Google BigQuery data to Python Pandas dataframe

I've been researching how to export BigQuery data into Pandas. There are two methods:

  1. Export the data to a CSV file and load it - https://cloud.google.com/bigquery/exporting-data-from-bigquery

  2. Pull the data directly into a pandas DataFrame. This doesn't seem to work for me, but here is the method: pandas.io.gbq.read_gbq(query, project_id=None, index_col=None, col_order=None, reauth=False). Has gbq been discontinued?

Could someone please suggest the best and most efficient way to go about this?

Thank you.

BlackHat asked Oct 21 '14 19:10





1 Answer

The gbq.read_gbq method definitely works in pandas 0.15.0-1; I just upgraded from 0.14.0-1 to check (Windows 7). If you are using Python, I would definitely recommend it for getting data from Google BigQuery into a dataframe. I use it for almost all my analysis work.
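For reference, here is a minimal sketch of the call, assuming the pandas.io.gbq.read_gbq function quoted in the question. The wrapper name query_to_dataframe and its reader parameter are hypothetical and only there so the call can be tried without a BigQuery connection. The project ID and table name in the usage comment are placeholders too.

```python
def query_to_dataframe(query, project_id, reader=None):
    """Run `query` against BigQuery and return whatever the reader returns.

    `reader` is a hypothetical injection point (not part of pandas): pass a
    stand-in callable to exercise this function offline. When it is omitted,
    the function falls back to pandas.io.gbq.read_gbq, the function named in
    the question, which is assumed to be available in the installed pandas.
    """
    if reader is None:
        # Imported lazily so that defining this function does not require
        # pandas or BigQuery credentials.
        from pandas.io import gbq
        reader = gbq.read_gbq
    # The first real call is expected to start the browser authentication
    # flow described in this answer.
    return reader(query, project_id=project_id)


# Example usage (placeholder project and table, needs credentials):
# df = query_to_dataframe(
#     "SELECT name, total FROM my_dataset.my_table LIMIT 10",
#     project_id="my-project-id",
# )
# print(df.head())
```

If the default reader cannot be imported in your environment, pass your own callable as reader. As far as I know, later pandas releases moved this functionality into the separate pandas-gbq package.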

It is hard to say how to fix your issue without more information, but I would start by checking whether the authentication flow completes in a browser that is logged into your Google account, and then troubleshoot from there. The first authentication flow prints a deprecation warning (oauth2client.tools.run), but everything still works.

Other than that, I would try following the examples here: http://pandas-docs.github.io/pandas-docs-travis/io.html#io-bigquery

FYI, in the current dev branch, an option for Gcloud authentication is being added to make headless authentication more convenient.

AzCollin answered Oct 08 '22 19:10
