Auto-detect the delimiter in a CSV file using pd.read_csv

Tags: python, pandas, csv, delimiter

Is there a way for read_csv to auto-detect the delimiter? NumPy's genfromtxt does this. My files use a single space, a double space, or a tab as delimiters. genfromtxt() handles them, but it is slower than pandas' read_csv. Any ideas?

236

asked Sep 09 '17 22:09

SEU

1 Answer

Option 1

Use delim_whitespace=True, which treats any run of whitespace (spaces or tabs) as a single delimiter:

df = pd.read_csv('file.csv', delim_whitespace=True)

Option 2

Pass a regular expression to the sep parameter:

df = pd.read_csv('file.csv', sep=r'\s+')

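This is equivalent to the first option. Note that pandas 2.2 and later deprecate delim_whitespace in favor of sep=r'\s+', so Option 2 is the safer choice going forward.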
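To see the whitespace regex handle the mix described in the question, here is a minimal sketch. The sample text and its column names are made up for illustration; it is read from an in-memory string rather than a real file.

```python
import io

import pandas as pd

# Hypothetical sample: fields separated by one space, two spaces, or a tab.
raw = "a b  c\td\n1 2  3\t4\n5  6\t7 8\n"

# r'\s+' treats any run of whitespace as a single delimiter.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

print(df.columns.tolist())
print(df.shape)
```

All three separator styles should collapse into the same four columns, since each run of whitespace counts as one delimiter.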

Documentation for pd.read_csv.
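For files whose delimiter is a single unknown character (comma, semicolon, tab, and so on) rather than mixed whitespace, the read_csv documentation also describes passing sep=None with the Python engine, which delegates detection to Python's csv.Sniffer. This is not part of the answer above. It is a sketch with made-up sample data, and because the sniffer guesses from the start of the file, it may not suit the mixed-whitespace files in the question.

```python
import io

import pandas as pd

# Hypothetical sample using a semicolon as the (unknown) delimiter.
raw = "x;y;z\n1;2;3\n4;5;6\n"

# sep=None asks pandas to sniff the delimiter; this requires engine="python".
df = pd.read_csv(io.StringIO(raw), sep=None, engine="python")

print(df.columns.tolist())
print(df.shape)
```

The Python engine is generally slower than the default C engine, so for whitespace-delimited files sep=r'\s+' is likely the better fit.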

143

answered Oct 05 '22 00:10

cs95
