
Customizing the separator in pandas read_csv


I am reading many different data files into various pandas dataframes. The columns in these data files are separated by spaces. However, the number of spaces differs from file to file (some use a single space, others use two spaces, and so on). Thus, every time I import a file, I have to open it manually, check how many spaces were used, and pass that many spaces in sep:

    import pandas as pd
    df = pd.read_csv('myfile.dat', sep='    ')

Is there any way I can tell pandas to assume "any number of spaces" as the separator? Also, is there any way I can tell pandas to use either tab (\t) or spaces as the separator?

Peaceful asked Dec 20 '16 04:12



1 Answer

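Yes, you can use a simple regular expression like sep=r'\s+', which matches one or more whitespace characters. Because \s covers tabs as well as spaces, the same separator handles files delimited by tabs, spaces, or a mix of both.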
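A minimal sketch of the idea, assuming made-up sample data held in an in-memory string (the column names and values below are hypothetical, not taken from the question's file). It reads the same text twice: once with the whitespace regex, and once with a character class that accepts only spaces and tabs, which needs the Python parsing engine.

```python
import io

import pandas as pd

# Hypothetical sample data: columns separated by an uneven mix of spaces and tabs.
raw = "a  b\tc\n1   2 \t 3\n4 5\t\t6\n"

# r"\s+" treats any run of whitespace (spaces and/or tabs) as one separator.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Variant restricted to spaces and tabs only; regex separators other than
# r"\s+" are handled by the Python engine, so it is requested explicitly.
df_alt = pd.read_csv(io.StringIO(raw), sep=r"[ \t]+", engine="python")

print(df)
```

With a real file, the io.StringIO(raw) argument would simply be replaced by the file path, as in the question's read_csv call.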

Ted Petrou answered Sep 17 '22 02:09
