
Pandas Counting Character Occurrences

Let's say I have a dataframe that looks like this:

import pandas as pd

df2 = pd.DataFrame(['2018/10/02, 10/2', '02/20/18', '10-31/2018', '1111-0-1000000', '2018/10/11/2019/9999', '10-2, 11/2018/01', '10/2'], columns=['A'])

>>> df2

                      A
0      2018/10/02, 10/2
1              02/20/18
2            10-31/2018
3        1111-0-1000000
4  2018/10/11/2019/9999
5      10-2, 11/2018/01
6                  10/2

Is there a way to count the number of occurrences of a specific character or set of characters in each row?

For example, I want to count the occurrences of "-" and "/" and add them together, so my output might look like this:

                      A     specific_character_count
0      2018/10/02, 10/2                            3
1              02/20/18                            2
2            10-31/2018                            2
3        1111-0-1000000                            2
4  2018/10/11/2019/9999                            4
5      10-2, 11/2018/01                            3
6                  10/2                            1
Chicken Sandwich No Pickles asked Aug 24 '18 19:08


1 Answer

Pass a regular expression to str.count (| is used for or):

df2['A'].str.count('/|-')
Out: 
0    3
1    2
2    2
3    2
4    4
5    3
6    1
Name: A, dtype: int64
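
To get the layout shown in the question, the result can be assigned to a new column. The following is a minimal sketch, not part of the original answer: the list name chars_to_count is a hypothetical helper introduced for illustration, and re.escape is used on the assumption that the characters being counted might include ones that are special in a regular expression (such as "." or "+").

```python
import re

import pandas as pd

df2 = pd.DataFrame(
    ['2018/10/02, 10/2', '02/20/18', '10-31/2018', '1111-0-1000000',
     '2018/10/11/2019/9999', '10-2, 11/2018/01', '10/2'],
    columns=['A'],
)

# Hypothetical list of the characters to count (an assumption for this sketch).
chars_to_count = ['/', '-']

# Escape each character so it is matched literally, then join with | ("or").
pattern = '|'.join(re.escape(c) for c in chars_to_count)

# str.count treats the pattern as a regular expression and counts matches per row.
df2['specific_character_count'] = df2['A'].str.count(pattern)

print(df2)
```

For single characters, a character class such as '[/-]' should match the same set; the joined pattern is used here only because it extends to multi-character strings as well.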
ayhan answered Sep 30 '22 20:09