Below is my code to convert BigQuery results into a Pandas DataFrame. I'm learning Python and Pandas, and I wonder if I can get suggestions/ideas about any kind of improvements to the code.
    # ...code to run query that returns 3 columns: 'date' DATE, 'currency' STRING, 'rate' FLOAT...
    rows, total_count, token = query.fetch_data()

    dates = []
    currency = []
    rate = []
    for row in rows:
        dates.append(row[0])
        currency.append(row[1])
        rate.append(row[2])

    data = {  # avoid calling this 'dict' -- that shadows the built-in type
        'currency': currency,
        'date': dates,
        'rate': rate
    }
    df2 = pd.DataFrame(data)
    df2['date'] = pd.to_datetime(df2['date'])
    df2 = df2.set_index('date')
The above works, but it looks chunky. Is there any way to do the same thing more concisely or efficiently? I tried libraries such as SQLAlchemy, but they do not support BigQuery. In general, my question is about the code and syntax above.
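One immediate simplification, independent of any BigQuery-specific helper: the three lists and the loop can be dropped, because pd.DataFrame accepts the row sequence directly along with column names. A runnable sketch (the sample tuples below are hypothetical data standing in for the fetch_data() output):

```python
import datetime

import pandas as pd

# Hypothetical rows standing in for the output of query.fetch_data()
rows = [
    (datetime.date(2018, 1, 1), "EUR", 1.20),
    (datetime.date(2018, 1, 1), "GBP", 1.35),
    (datetime.date(2018, 1, 2), "EUR", 1.21),
]

# One constructor call replaces the three lists and the append loop
df2 = pd.DataFrame(rows, columns=["date", "currency", "rate"])
df2["date"] = pd.to_datetime(df2["date"])
df2 = df2.set_index("date")
```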
You should use read_gbq() instead: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_gbq.html
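With the pandas-gbq package installed and Google Cloud credentials configured, the whole fetch-and-build step collapses into a single call. A sketch of the pattern (the SQL string and table name are placeholders, not from the original post):

```python
import pandas as pd

def rates_frame(project_id: str) -> pd.DataFrame:
    """Run the query in BigQuery and return the result as a DataFrame."""
    sql = "SELECT date, currency, rate FROM `my_dataset.rates`"  # hypothetical table
    # read_gbq delegates to the pandas-gbq package and needs valid credentials
    df = pd.read_gbq(sql, project_id=project_id)
    return df.set_index("date")
```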
Per the Using BigQuery with Pandas page in the Google Cloud Client Library for Python:
As of version 0.29.0, you can use the to_dataframe() function to retrieve query results or table rows as a pandas.DataFrame.
Aside: see Migrating from pandas-gbq for the difference between the google-cloud-bigquery BQ Python client library and pandas-gbq.
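With the google-cloud-bigquery client (version 0.29.0 or later), the same result is one chained call on the query job. A sketch, assuming credentials are configured; the SQL and table name are placeholders:

```python
import pandas as pd

def fetch_rates() -> pd.DataFrame:
    # Imported inside the function so this sketch only requires the
    # google-cloud-bigquery package when it is actually called.
    from google.cloud import bigquery

    client = bigquery.Client()
    sql = "SELECT date, currency, rate FROM `my_dataset.rates`"  # hypothetical table
    # to_dataframe() materializes the query result as a pandas.DataFrame
    df = client.query(sql).to_dataframe()
    return df.set_index("date")
```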