ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column X with type int32')

Problem

I am trying to save a data frame as a Parquet file on Databricks, but I am getting an ArrowTypeError.

Databricks Runtime Version: 7.6 ML (includes Apache Spark 3.0.1, Scala 2.12)

Log Trace

ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column inv_yr with type int32')
Nagaraju Budigam asked May 12 '21 11:05


1 Answer

The issue you are facing comes from using an old pyarrow wheel together with the newer NumPy 1.20 release. You are running into the bug "PyArray_DescrCheck doesn't work anymore if the consumer library was compiled with an older NumPy version". Either update your pyarrow version or downgrade to numpy<1.20.
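
To confirm that this version mismatch applies to your cluster, you can print the installed versions before changing anything. The following is only a minimal sketch: the helper names `installed_version` and `numpy_is_1_20_or_newer` are made up for illustration, and the check only tells you whether NumPy is at 1.20 or newer, not which pyarrow release contains the fix.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numpy_is_1_20_or_newer(version):
    """Return True if a NumPy version string such as '1.20.1' is >= 1.20."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 20)


if __name__ == "__main__":
    numpy_version = installed_version("numpy")
    pyarrow_version = installed_version("pyarrow")
    print("numpy:", numpy_version)
    print("pyarrow:", pyarrow_version)
    # Hypothetical hint: this only flags the NumPy side of the mismatch.
    if numpy_version and pyarrow_version and numpy_is_1_20_or_newer(numpy_version):
        print("NumPy >= 1.20 is installed; an older pyarrow wheel may hit this error.")
```

If the hint is printed, apply one of the two fixes above, for example `pip install --upgrade pyarrow` or `pip install "numpy<1.20"`, and restart the Python process so the new versions are picked up.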

Uwe L. Korn answered Nov 15 '22 20:11