Running a small python code to create a pandas dataframe from Bigquery table results . When i run the code I see the below results. The db_dtypes is already installed , not sure what other dependencies i need to add. Any help is appreciated.
Here is the code
import pandas
from google.cloud import bigquery
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(
'/Users/kar/Downloads/data-4045ff698b4f.json')
project_id = 'data-platform'
client = bigquery.Client(credentials=credentials, project=project_id)
sql = """SELECT * FROM `data-platform.airbnb.raw_hosts` LIMIT 1"""
query_job = client.query(sql)
df = query_job.to_dataframe()
Error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/ka/PycharmProjects/pythonProject4/main.py", line 17, in <module>
df = query_job.to_dataframe()
File "/Users/ka/PycharmProjects/pythonProject4/venv/lib/python3.7/site-packages/google/cloud/bigquery/job/query.py", line 1689, in to_dataframe
geography_as_object=geography_as_object,
File "/Users/ka/PycharmProjects/pythonProject4/venv/lib/python3.7/site-packages/google/cloud/bigquery/table.py", line 1965, in to_dataframe
_pandas_helpers.verify_pandas_imports()
File "/Users/ka/PycharmProjects/pythonProject4/venv/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 991, in verify_pandas_imports
raise ValueError(_NO_DB_TYPES_ERROR) from db_dtypes_import_exception
ValueError: Please install the 'db-dtypes' package to use this function.
Process finished with exit code 1
After running pip install db-dtypes it worked for me.
Importing BigQuery tables into pandas appears to be such a pain, that there is a pandas method (and associated library) for it (see pandas.read_gcp).
I recommend using this module instead of the native bigquery module.
import pandas_gbq
... # your code for getting credentials and query-string
df = pandas_gbq.read_gbq(sql, project_id=project_id, credentials=credentials, progress_bar_type=None)
For me using this library, I can convert the response into a data frame but when converting the data frame into pickle format and then reimporting it to an environment without pandas_gbq it produces the same error...
I assume that some meta-data can not be displayed properly by the vanilla pandas module.
Edit: By looking at the comments one can find out that the error can be easily mended by simply doing pip install db-dtypes as @DazWilkin and @Nestor Ceniza Jr suggest (and restarting the kernel if you're using jupyter notebooks as @dss does)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With