I would like to read the sample CSV file shown below:
--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------
I tried
pd.read_csv("sample.csv",sep="|")
But it didn't work well.
How can I read this csv?
You can add the parameter comment to read_csv and then remove the columns that contain only NaN with dropna:
import pandas as pd
import io
temp=u"""--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep="|", comment='-').dropna(axis=1, how='all')
print (df)
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
A more general solution:
import pandas as pd
import io
temp=u"""--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""
# after testing, replace io.StringIO(temp) with the filename
# use a separator character that does NOT occur in the file, so each line is read as a single column
df = pd.read_csv(io.StringIO(temp), sep="^", comment='-')
#remove first and last | in data and in column names
df.iloc[:,0] = df.iloc[:,0].str.strip('|')
df.columns = df.columns.str.strip('|')
#split column names
cols = df.columns.str.split('|')[0]
#split data
df = df.iloc[:,0].str.split('|', expand=True)
df.columns = cols
print (df)
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
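One caveat with comment='-': pandas treats everything from the comment character to the end of the line as a comment, so a row containing a negative value such as -5 would be cut off at the minus sign. A minimal sketch of one way around this is to drop the separator lines before parsing. The helper name read_boxed_table is hypothetical, and the sketch assumes every data line starts with |:

```python
import io

import pandas as pd


def read_boxed_table(text):
    """Parse a |-delimited table whose rows are separated by ---- lines."""
    # keep only the lines that hold data, i.e. those starting with |
    rows = [line for line in text.splitlines() if line.startswith("|")]
    # the leading and trailing | yield two all-NaN columns, removed by dropna
    df = pd.read_csv(io.StringIO("\n".join(rows)), sep="|")
    return df.dropna(axis=1, how="all")


sample = """--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|-5|6|
--------------"""

df = read_boxed_table(sample)
print(df)
```

To use it with a real file, read the file contents into a string first and pass that string in. Note that dropna(axis=1, how='all') would also remove a genuine data column that happens to be completely empty.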
Try the csv module rather than using pandas directly.
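import csv
easy_csv = []
# newline='' is what the csv module expects for files opened in text mode
with open('sample.csv', newline='') as csvfile:
    test = csv.reader(csvfile, delimiter='|')
    for row in test:
        # ignore empty lines and the rows that consist of ----
        if not row or row[0].startswith('-'):
            continue
        # drop the empty fields produced by the leading and trailing |
        easy_csv.append(row[1:-1])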
After this preprocessing, you can save the rows to a comma-separated CSV file that pandas can read without any extra options.
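A minimal sketch of that last step, assuming easy_csv ends up holding the header row followed by the data rows, each as a list of strings. The output file name sample_clean.csv and the temporary directory are placeholders chosen for illustration:

```python
import csv
import os
import tempfile

import pandas as pd

# rows as they would look after the preprocessing step (header row first)
easy_csv = [["A", "B", "C"], ["1", "2", "3"], ["4", "5", "6"], ["7", "8", "9"]]

# hypothetical output path; a temporary directory keeps the example self-contained
out_path = os.path.join(tempfile.mkdtemp(), "sample_clean.csv")
with open(out_path, "w", newline="") as f:
    csv.writer(f).writerows(easy_csv)

# the cleaned file is plain comma-separated, so the defaults are enough
df = pd.read_csv(out_path)
print(df)
```

If the intermediate file is not needed, pd.DataFrame(easy_csv[1:], columns=easy_csv[0]) builds the frame directly, although the values then stay strings until they are converted.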