csv & xlsx files import to pandas data frame: speed issue

Tags:

python

pandas

csv

xlsx

openpyxl

Reading data (just 20000 numbers) from an xlsx file takes forever:

import pandas as pd
xlsxfile = pd.ExcelFile("myfile.xlsx")
data = xlsxfile.parse('Sheet1', index_col = None, header = None)

takes about 9 seconds.

If I save the same file in csv format it takes ~25ms:

import pandas as pd
csvfile = "myfile.csv"
data = pd.read_csv(csvfile, index_col = None, header = None)

Is this an issue of openpyxl or am I missing something? Are there any alternatives?

992

asked Apr 24 '13 03:04

sashkello

1 Answer

xlrd has support for .xlsx files, and this answer suggests that at least the beta version of xlrd with .xlsx support was quicker than openpyxl.

The current stable version of pandas (0.11.0) uses openpyxl for .xlsx files, but this has been changed for the next release. If you want to give it a go, you can download the dev version from GitHub.
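To check whether a different reader actually helps on a particular file, the two load paths can be timed side by side. The following is a minimal sketch using only the standard library; `best_of` and `compare` are hypothetical helper names introduced here for illustration and are not part of pandas, xlrd, or openpyxl.

```python
import time


def best_of(func, repeats=3):
    """Call func() `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(loaders, repeats=3):
    """Time each named zero-argument loader and return {name: best time in seconds}."""
    return {name: best_of(func, repeats) for name, func in loaders.items()}
```

With pandas installed, the loaders from the question could be passed in as zero-argument callables, for example `compare({"csv": lambda: pd.read_csv("myfile.csv", header=None), "xlsx": lambda: pd.ExcelFile("myfile.xlsx").parse("Sheet1", header=None)})`. Taking the minimum over a few repeats reduces the influence of disk caching and other one-off noise on the comparison.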

120

answered Sep 30 '22 14:09

Matti John

