I upgraded the apache-beam SDK from 2.5.0 to 2.12.0 and now get an Avro schema error when reading a table from BigQuery in Beam using Python.
The BQ table has one TIMESTAMP field; all the others are STRING.
data = pipe \
    | 'read bigquery' >> beam.io.Read(
        beam.io.BigQuerySource(
            dataset=args.dataset_name,
            table=args.table_name,
            use_standard_sql=True))
Error:
SchemaParseException: Type property "[u'null', {u'logicalType': u'timestamp-micros', u'type': u'long'}]" not a valid Avro schema: Union item must be a valid Avro schema: Currently does not support timestamp-micros logical type
Packages installed:
python=2.7.0, apache-beam=2.12.0, avro=1.9.0
This is a regression in avro 1.9.0. It is tracked here: https://issues.apache.org/jira/browse/AVRO-2429
If you are on Python 2, you should be able to downgrade to 1.8.2 with pip install "avro==1.8.2". If you are on Python 3, I believe Beam tries to use fastavro by default, which should not have the bug you are running into.
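To confirm which avro release your environment actually picked up before and after the downgrade, something like the following may help. It is only a sketch: the helper names (parse_version, is_affected, installed_avro_version) are made up for illustration, and it assumes that only the 1.9.0 release named above is affected, since nothing here says which later releases carry the fix.

```python
import re

# Assumption: only the release named in the answer is treated as affected.
AFFECTED = (1, 9, 0)


def parse_version(text):
    """Turn a string like '1.9.0' into a tuple of ints, e.g. (1, 9, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def is_affected(version_text):
    """True if the given avro version string is exactly 1.9.0."""
    return parse_version(version_text)[:3] == AFFECTED


def installed_avro_version():
    """Return the installed 'avro' version string, or None if not found."""
    try:
        import pkg_resources  # shipped with setuptools
        return pkg_resources.get_distribution("avro").version
    except Exception:
        return None


if __name__ == "__main__":
    found = installed_avro_version()
    if found is None:
        print("avro does not appear to be installed")
    elif is_affected(found):
        print('avro %s is affected; try: pip install "avro==1.8.2"' % found)
    else:
        print("avro %s is not the affected release" % found)
```

Run it in the same virtualenv that launches the pipeline, since that is the environment whose avro package Beam will import.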