Is there a way for read_csv to auto-detect the delimiter? numpy's genfromtxt does this. My files use a single space, a double space, or a tab as delimiters. genfromtxt() handles them, but it is slower than pandas' read_csv. Any ideas?
Here are the steps you should follow: Open your CSV in a text editor. Insert a new first line containing sep=; if the CSV uses a semicolon (;) as its separator, or sep=, if it uses a comma (,). Save, then re-open the file.
Indicate the separator directly in the CSV file
Open your file in any text editor, say Notepad, and type one of the strings below before any other data:
To separate values with a comma: sep=,
To separate values with a semicolon: sep=;
To separate values with a pipe: sep=|
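The manual steps above can also be scripted. The following is a minimal sketch, not a definitive implementation: add_sep_hint is a hypothetical helper name, and it assumes the file is UTF-8 text small enough to read into memory. Note that the sep= line is a hint honored by spreadsheet applications such as Excel; pandas treats it as an ordinary first row, so it would need to be skipped (for example with skiprows=1) when reading the file with read_csv.

```python
from pathlib import Path


def add_sep_hint(path, delimiter=";"):
    """Prepend a 'sep=<delimiter>' line to a CSV file.

    Hypothetical helper for illustration. Returns True if the hint was
    added, False if the file already starts with a 'sep=' line.
    """
    p = Path(path)
    text = p.read_text(encoding="utf-8")  # assumes UTF-8 content
    if text.startswith("sep="):
        # A hint is already present; leave the file untouched.
        return False
    p.write_text(f"sep={delimiter}\n{text}", encoding="utf-8")
    return True
```

Calling add_sep_hint('file.csv', ';') once adds the hint; calling it again leaves the file unchanged.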
Option 1
Use delim_whitespace=True (deprecated since pandas 2.2 in favor of sep='\s+'):
df = pd.read_csv('file.csv', delim_whitespace=True)
Option 2
Pass a regular expression to the sep parameter:
df = pd.read_csv('file.csv', sep=r'\s+')
This is equivalent to the first option: any run of spaces or tabs is treated as a single delimiter.
See the documentation for pd.read_csv.
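To see how the whitespace regex handles the mix of delimiters described in the question, here is a small sketch. The sample data is made up for illustration, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Made-up sample: fields separated by one space, two spaces, or a tab.
raw = "a b  c\n1  2\t3\n4\t5 6\n"

# r'\s+' matches any run of whitespace, so all three delimiter styles
# are parsed as a single separator. The raw string avoids Python's
# invalid-escape warning for '\s'.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df)
```

Each row is split into three columns regardless of which whitespace delimiter appears between the fields.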