
How to count the number of upper case words more than 1 character long in a pandas Series

I have a series of strings and I'm trying to create a new column that counts the number of upper case words in each string, with the constraint that each counted word is longer than 1 character. For example, the series

import pandas as pd

s = pd.Series(['I AM MAD!', 'Today is a nice day', 'This restaurant SUCKS'])

would return a series with values of 2, 0, 1.

A few other helpful questions on here have shown me one way to do this for a single string:

sum(map(str.isupper, [word for word in s[0].split() if len(word) > 1]))

which correctly returns 2.

How can I apply this to the entire series without looping over each element?

asked Apr 09 '20 15:04 by kcm2174



2 Answers

You can use regex to extract the words, and then count:

(s.str.extractall(r'(\b[A-Z]{2,}\b)')  # extract every all-uppercase word of length at least 2
  .groupby(level=0).size()             # count matches per original row index
  .reindex(s.index, fill_value=0)      # rows with no matches get a count of 0
)

Output:

0    2
1    0
2    1
dtype: int64
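
For readers who want to see what each step of the chain produces, here is a minimal self-contained sketch of the same approach, split into intermediate variables. The names matches, per_row, and counts are illustrative and not part of the original answer.

```python
import pandas as pd

s = pd.Series(['I AM MAD!', 'Today is a nice day', 'This restaurant SUCKS'])

# One row per regex match, indexed by (original row, match number).
matches = s.str.extractall(r'(\b[A-Z]{2,}\b)')

# Number of matches for each original row that had at least one match.
per_row = matches.groupby(level=0).size()

# Rows with no match are absent from per_row, so restore them with a count of 0.
counts = per_row.reindex(s.index, fill_value=0)

print(counts)
```

The reindex step matters because rows without any match (here, index 1) do not appear in the extractall result at all.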
answered Nov 14 '22 20:11 by Quang Hoang


Borrowing Quang's regex, you can count the matches directly with str.count:

s.str.count(r'(\b[A-Z]{2,}\b)')

Output:

0    2
1    0
2    1
dtype: int64
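
One thing worth checking before relying on the regex: it is not quite equivalent to the str.isupper expression from the question. The sketch below compares the two on the question's data plus a fourth string, 'R2D2 is OK', which is an added illustration and not from the original post. The helper name count_upper_words is also hypothetical.

```python
import pandas as pd

# The first three strings come from the question; the fourth is an added example.
s = pd.Series(['I AM MAD!', 'Today is a nice day',
               'This restaurant SUCKS', 'R2D2 is OK'])

# Regex approach: runs of two or more ASCII capitals bounded by word boundaries.
regex_counts = s.str.count(r'\b[A-Z]{2,}\b')

def count_upper_words(text):
    """Count whitespace-separated tokens longer than 1 character that are upper case."""
    return sum(word.isupper() for word in text.split() if len(word) > 1)

# Element-wise approach that mirrors the expression in the question.
apply_counts = s.apply(count_upper_words)

print(pd.DataFrame({'regex': regex_counts, 'isupper': apply_counts}))
```

The two agree on the original three strings, but 'R2D2' counts as upper case for str.isupper while the regex skips it, because the digits break up the run of capitals. Which behavior is right depends on what should count as a word in your data.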
answered Nov 14 '22 18:11 by BENY