ValueError in Dataflow: Invalid GCS location: None

I'm trying to load data from a GCS bucket and publish its content to Pub/Sub and BigQuery. These are my pipeline options:

options = PipelineOptions(
    project=project,
    temp_location="gs://dataflow-example-bucket6721/temp21/",
    region='us-east1',
    job_name="dataflow2-pubsub-09072021",
    machine_type='e2-standard-2',
)

And this is my pipeline:

data = p | 'CreateData' >> beam.Create(sum([fileName()], []))

jsonFile =  data | "filterJson" >> beam.Filter(filterJsonfile)

JsonData = jsonFile | "JsonData" >> beam.Map(readFromJson)

split_data = JsonData | 'Split Data' >> ParDo(CheckForValidData()).with_outputs("ValidData", "InvalidData")

ValidData = split_data.ValidData
InvalidData = split_data.InvalidData
data_ = split_data[None]


publish_data = ValidData | "Publish msg" >> ParDo(publishMsg())

ToBQ = ValidData | "To BQ" >> beam.io.WriteToBigQuery(
            table_spec,
            #schema=table_schema,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)

The data flows fine with InteractiveRunner, but with DataflowRunner the pipeline fails with this error:

ValueError: Invalid GCS location: None. Writing to BigQuery with FILE_LOADS method requires a GCS location to be provided to write files to be loaded into BigQuery. Please provide a GCS bucket through custom_gcs_temp_location in the constructor of WriteToBigQuery or the fallback option --temp_location, or pass method="STREAMING_INSERTS" to WriteToBigQuery. [while running '[15]: To BQ/BigQueryBatchFileLoads/GenerateFilePrefix']

The error complains about the GCS location and suggests setting temp_location, but I have already set temp_location in my pipeline options.

Jigna Chandarana asked Apr 23 '26 21:04



1 Answer

When running your Dataflow pipeline, pass the argument --temp_location gs://bucket/subfolder/ on the command line. Use exactly this format, with a subfolder created inside the bucket, and it should work.
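As a rough illustration of that format, the sketch below checks that a location has the gs://bucket/subfolder/ shape and builds the matching command-line flags. The helpers `is_gcs_subfolder` and `dataflow_args` and the project name "my-project" are hypothetical and not part of Apache Beam; only the flag names (--runner, --project, --region, --temp_location) are standard Beam pipeline options. The bucket-name pattern is a simplification, not the full GCS naming rules.

```python
import re

# Hypothetical check (not a Beam API): gs://<bucket>/<at least one subfolder>/
# The bucket-name pattern is simplified and does not cover every GCS naming rule.
_GCS_DIR = re.compile(r"^gs://[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]/(?:[^/]+/)+$")


def is_gcs_subfolder(location):
    """Return True if location looks like gs://bucket/subfolder/."""
    return isinstance(location, str) and _GCS_DIR.match(location) is not None


def dataflow_args(project, region, temp_location):
    """Build Dataflow command-line flags, rejecting a missing or malformed temp location."""
    if not is_gcs_subfolder(temp_location):
        raise ValueError(f"Expected gs://bucket/subfolder/, got: {temp_location!r}")
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
    ]


# "my-project" is a placeholder; the bucket path is the one from the question.
args = dataflow_args("my-project", "us-east1", "gs://dataflow-example-bucket6721/temp21/")
print(" ".join(args))
```

The resulting list can be appended to the command that launches the pipeline script, or handed to `PipelineOptions` as its list of flags, provided those same options are then the ones passed to the `beam.Pipeline` that runs on Dataflow.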

Vipul Mehra answered Apr 25 '26 11:04



