I am racking my brain over this right now. I am new to <code>parquet</code> files, and I am running into a lot of issues with them. Each time I try to create a <code>df</code> from one, I get an error that reads <code>OSError: Passed non-file path: \datasets\proj\train\train.parquet</code>. I've tried this: <code>pq.read_pandas(r'E:\datasets\proj\train\train.parquet').to_pandas()</code> and this: <code>od = pd.read_parquet(r'E:\datasets\proj\train\train.parquet', engine='pyarrow')</code>. I also changed the drive letter of the drive where the dataset resides, and I get the same error. It's the same with all engines. Please help!


Unable to read a parquet file

1 Answer

This might be a problem with Arrow's file path handling. You could instead pass in an already opened file:

import pandas as pd

# Open the file in binary mode and hand pandas the file object instead of the path string
with open(r'E:\datasets\proj\train\train.parquet', 'rb') as f:
    df = pd.read_parquet(f, engine='pyarrow')
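
Before reading through an open file, it can help to confirm that the path itself is fine, since the error message complains about a "non-file path". The following is a minimal sketch using only the standard library; the helper name `check_parquet_path` is hypothetical and not part of pandas or pyarrow. It relies on the fact that a standard Parquet file begins and ends with the four bytes `PAR1`.

```python
from pathlib import Path

# A standard Parquet file begins and ends with these four bytes.
PARQUET_MAGIC = b"PAR1"


def check_parquet_path(path):
    """Hypothetical helper: report basic facts about `path` without using pandas or Arrow."""
    p = Path(path)
    info = {"exists": p.exists(), "is_file": p.is_file(), "has_magic": False}
    # Only inspect the bytes if this is a regular file large enough to hold both markers.
    if info["is_file"] and p.stat().st_size >= 2 * len(PARQUET_MAGIC):
        with open(p, "rb") as f:
            head = f.read(len(PARQUET_MAGIC))
            f.seek(-len(PARQUET_MAGIC), 2)  # 2 means "relative to the end of the file"
            tail = f.read(len(PARQUET_MAGIC))
        info["has_magic"] = head == PARQUET_MAGIC and tail == PARQUET_MAGIC
    return info
```

Calling `check_parquet_path(r'E:\datasets\proj\train\train.parquet')` returns a small dictionary. If `is_file` is `False`, the problem is likely the path itself (a typo, a directory, or an unmounted drive) rather than Arrow's path handling. If `has_magic` is `False`, the file may be truncated or not a Parquet file at all.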

117

answered Oct 09 '22 01:10

Uwe L. Korn


Tags:

python

pandas

parquet

pyarrow

fastparquet

Anonymous Person
