Because the list passed to usecols names a column that is not in the CSV, pandas raises
"ValueError: Usecols do not match names".
How can I pass usecols so that only the columns that actually exist in the CSV are read?
csv sample:
df.csv
AB,CD,EF,GH
foo,20160101,a,1
foo,20160102,a,3
foo,20160103,a,5
reading csv:
import pandas as pd
# Raises ValueError because "IJ" is not a column in df.csv
df = pd.read_csv('df.csv',
                 header=0, usecols=["AB", "CD", "IJ"])
This is what I'd like to get:
df
    AB        CD
0  foo  20160101
1  foo  20160102
2  foo  20160103
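That is, the missing column "IJ" should simply be ignored.
Pass a callable (for example a lambda) to usecols so that requested columns that are not in the CSV are skipped: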
import pandas as pd
from io import StringIO
txt = """AB,CD,EF,GH
foo,20160101,a,1
foo,20160102,a,3
foo,20160103,a,5"""
usecols = ['AB', 'CD', 'IJ']
wanted = set(usecols)  # build the set once, not on every call of the lambda
df = pd.read_csv(StringIO(txt), usecols=lambda c: c in wanted)
print(df)
    AB        CD
0  foo  20160101
1  foo  20160102
2  foo  20160103
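If you also want to know which of the requested columns were skipped, one option is to compare the requested list with the columns that actually came back. The following is only a sketch building on the snippet above; the helper name `read_existing_columns` is hypothetical and not part of pandas.

```python
import pandas as pd
from io import StringIO


def read_existing_columns(source, wanted):
    """Hypothetical helper: read only the columns in `wanted` that exist.

    Returns the DataFrame and the list of requested columns that were absent.
    """
    wanted_set = set(wanted)
    # The callable is evaluated against each column name in the file,
    # so names that are not present are never requested.
    df = pd.read_csv(source, usecols=lambda c: c in wanted_set)
    missing = [c for c in wanted if c not in df.columns]
    return df, missing


txt = """AB,CD,EF,GH
foo,20160101,a,1
foo,20160102,a,3
foo,20160103,a,5"""

df, missing = read_existing_columns(StringIO(txt), ["AB", "CD", "IJ"])
print(list(df.columns))  # ['AB', 'CD']
print(missing)           # ['IJ']
```

Returning the missing names separately keeps the read itself silent while still letting the caller log or handle the absent columns.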
An explanation can be found in the pandas docs:
https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
If callable, the callable function will be evaluated against the column names, returning names where the callable function evaluates to True. An example of a valid callable argument would be lambda x: x.upper() in ['AAA', 'BBB', 'DDD']. Using this parameter results in much faster parsing time and lower memory usage.
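As a small illustration of the callable form quoted from the docs, the same idea can be used to match column names case-insensitively. This is a sketch with made-up sample data (the mixed-case header is an assumption for the example), not something taken from the question:

```python
import pandas as pd
from io import StringIO

# Sample data with inconsistent header casing (invented for this example)
txt = """ab,Cd,EF,GH
foo,20160101,a,1
foo,20160102,a,3"""

wanted = {"AB", "CD", "IJ"}

# Each header name is upper-cased before the membership test, so "ab" and
# "Cd" are kept; "IJ" is not in the file and is never requested.
df = pd.read_csv(StringIO(txt), usecols=lambda name: name.upper() in wanted)
print(list(df.columns))  # ['ab', 'Cd']
```

Note that the columns keep the names found in the file; the callable only decides which ones are read.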