I'll try to use a simple example to describe my problem.
I have a CSV file with many columns. One of these columns has the header "names".
In this column I only need to count how many times the name "John" appears.
As an example, my column "names" is as follows:
names
John
John M
Mike John
Audrey
Andrew
For this case I need a Python script using pandas that returns 3, because the word 'John' appears in three rows.
This is the code I am using:
import pandas as pd
from_csv = pd.read_csv(r'csv.csv', usecols=['names'], header=0)
times = from_csv.query('names == "John"').names.count()
But it only returns 1, because there is only one row whose value is exactly "John".
I have tried using:
times = from_csv.query('names == "*John*"').names.count()
but without success.
How can I get 3 in this particular situation? Thanks.
Use str.contains and sum the resulting boolean mask (here the DataFrame is df and the column is Name):
df.Name.str.contains('John').sum()
Out[246]: 3
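Applied to the data from the question, the same idea could look roughly like the sketch below. It is only an illustration: the CSV is emulated with io.StringIO so the snippet runs on its own, and count_name is a hypothetical helper name, not something pandas provides. The regex=False and na=False arguments are my own additions, so that the name is treated as literal text and missing values count as non-matches.

```python
import io
import pandas as pd

# Inline stand-in for the CSV file from the question
# (assumption: only the "names" column matters here).
csv_text = "names\nJohn\nJohn M\nMike John\nAudrey\nAndrew\n"
df = pd.read_csv(io.StringIO(csv_text), usecols=["names"])


def count_name(frame, name, column="names"):
    """Count rows whose `column` contains `name` as a substring (hypothetical helper)."""
    # str.contains returns one boolean per row; summing counts the True values.
    mask = frame[column].str.contains(name, regex=False, na=False)
    return int(mask.sum())


print(count_name(df, "John"))  # 3
```

With the real file, the read_csv call would take the file path instead of the StringIO object.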
Or use list and map with the in operator:
sum(list(map(lambda x : 'John' in x,df.Name)))
Out[248]: 3
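One caveat: both approaches count substrings, so a value such as "Johnson" would also be counted, and the map version raises a TypeError if the column contains missing values. If only the whole word should match, a regular expression with word boundaries is one option. The sketch below uses made-up sample data (the "Johnson" and None entries are not from the question) purely to show the difference.

```python
import pandas as pd

# Hypothetical sample data: the question's names plus "Johnson" and a missing value.
names = pd.Series(["John", "John M", "Mike John", "Johnson", None])

# Plain substring match: "Johnson" is counted as well.
substring_hits = int(names.str.contains("John", regex=False, na=False).sum())

# Whole-word match: \b marks a word boundary, so "Johnson" is excluded.
whole_word_hits = int(names.str.contains(r"\bJohn\b", regex=True, na=False).sum())

print(substring_hits, whole_word_hits)  # 4 3
```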