I'm reading .txt files in a directory and want to drop the columns whose names contain certain strings.
for file in glob.iglob(files + '.txt', recursive=True):
    cols = list(pd.read_csv(file, nrows =1))
    df=pd.read_csv(file,header=0, skiprows=0, skipfooter=0, usecols =[i for i in cols if i.str.contains['TRIVIAL|EASY']==False])
When I do this, I get:
df=pd.read_csv(file,header=0, skiprows=0, skipfooter=0, usecols =[i for i in cols if i.str.contains['PASS']==True])
AttributeError: 'str' object has no attribute 'str'
I could not figure out which part I need to fix.
select columns based on columns names containing a specific string in pandas
drop column based on a string condition
AttributeError: 'str' object has no attribute 'str'
Drop multiple columns that end with certain string in Pandas
Without reading the header separately, you can pass a callable to usecols. Check that neither 'EASY' nor 'TRIVIAL' is in the column name.
import pandas as pd

exclu = ['EASY', 'TRIVIAL']  # Any substring in this list excludes a column
usecols = lambda x: not any(substr in x for substr in exclu)
df = pd.read_csv('test.csv', usecols=usecols)
print(df)
   HARD  MEDIUM
0     2       4
1     6       8
2     1       1
test.csv
TRIVIAL,HARD,EASYfoo,MEDIUM
1,2,3,4
5,6,7,8
1,1,1,1
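For anyone who wants to try the callable approach without creating a file, here is a minimal self-contained sketch. It assumes the same data as test.csv, held in memory with io.StringIO; the names CSV_TEXT, EXCLUDED and keep_column are only illustrative and do not come from the answer above.

```python
import io

import pandas as pd

# Same contents as test.csv, kept in memory so the sketch needs no file on disk.
CSV_TEXT = "TRIVIAL,HARD,EASYfoo,MEDIUM\n1,2,3,4\n5,6,7,8\n1,1,1,1\n"

EXCLUDED = ["EASY", "TRIVIAL"]  # substrings that disqualify a column


def keep_column(name):
    """Return True when the column name contains none of the excluded substrings."""
    return not any(substr in name for substr in EXCLUDED)


# read_csv calls keep_column once per column name and loads only the
# columns for which it returns True.
df = pd.read_csv(io.StringIO(CSV_TEXT), usecols=keep_column)
kept = list(df.columns)
print(kept)
```

A named function behaves the same as the lambda here; it is just easier to reuse inside a loop over several files.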
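There are a few issues in your code. First, .str.contains is an accessor on a pandas Series or Index, but each i in your list comprehension is a plain Python string (a single column name), which has no .str attribute. Second, contains is a method, so it has to be called with parentheses rather than indexed with square brackets.
Using regex: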
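import re
import pandas as pd

# `file` is the path from the loop in the question; read one row to get the column names
cols = pd.read_csv(file, nrows=1)
# Keep the columns whose names match neither 'TRIVIAL' nor 'EASY'
cols_to_use = [i for i in cols.columns if not re.search('TRIVIAL|EASY', i)]
df = pd.read_csv(file, header=0, skiprows=0, skipfooter=0, usecols=cols_to_use)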
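If you would rather keep str.contains than switch to re.search, it can be applied to the columns Index as a whole rather than to each name. The sketch below is one possible way to do that, not the only one; it assumes the sample data from the other answer, read from memory, and the names CSV_TEXT, header and mask are illustrative.

```python
import io

import pandas as pd

# Sample data held in memory; in the question this would be each .txt file.
CSV_TEXT = "TRIVIAL,HARD,EASYfoo,MEDIUM\n1,2,3,4\n5,6,7,8\n"

# nrows=0 parses only the header row, which is enough to get the column names.
header = pd.read_csv(io.StringIO(CSV_TEXT), nrows=0)

# header.columns is a pandas Index, so the .str accessor exists here,
# unlike on the plain Python strings you get by iterating over it.
mask = ~header.columns.str.contains("TRIVIAL|EASY")
cols_to_use = header.columns[mask].tolist()

df = pd.read_csv(io.StringIO(CSV_TEXT), usecols=cols_to_use)
print(cols_to_use)
```

The pattern is treated as a regular expression by default, so 'TRIVIAL|EASY' matches a name containing either substring.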