I would like to use the collections.Counter class to count emojis in a string. It generally works fine; however, when I introduce emojis with skin-tone colors, the color component is separated from the emoji, like so:
>>> import collections
>>> emoji_string = "👌🏻👌🏼👌🏽👌🏾👌🏿"
>>> emoji_counter = collections.Counter(emoji_string)
>>> emoji_counter.most_common()
[('👌', 5), ('🏻', 1), ('🏼', 1), ('🏽', 1), ('🏾', 1), ('🏿', 1)]
How can I make the most_common() function return something like this instead:
[('👌🏻', 1), ('👌🏼', 1), ('👌🏽', 1), ('👌🏾', 1), ('👌🏿', 1)]
I'm using Python 3.6
You'll have to split your string into separate clusters. Each of your emoji is really two codepoints: the emoji itself and an EMOJI MODIFIER FITZPATRICK TYPE X codepoint:
>>> print(emoji_string[0])
👌
>>> print(emoji_string[1])
🏻
>>> print(emoji_string[:2])
👌🏻
>>> print(ascii(emoji_string[:2]))
'\U0001f44c\U0001f3fb'
>>> import unicodedata
>>> unicodedata.name(emoji_string[1])
'EMOJI MODIFIER FITZPATRICK TYPE-1-2'
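To see which codepoints count as skin-tone modifiers, here is a small sketch that lists the range U+1F3FB through U+1F3FF with their Unicode names. The variable name modifier_names is only illustrative, and the exact names depend on the Unicode database shipped with your Python version:

```python
import unicodedata

# Map each codepoint in the skin-tone modifier range to its Unicode name.
modifier_names = {
    chr(cp): unicodedata.name(chr(cp))
    for cp in range(0x1F3FB, 0x1F3FF + 1)
}

for char, name in modifier_names.items():
    print(f"U+{ord(char):04X} {name}")
```

All five names should start with EMOJI MODIFIER FITZPATRICK, which is why a single character range is enough to recognise them.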
You could use a regular expression to keep each modifier attached to the emoji that precedes it:
import re
char_with_modifier = re.compile(r'(.[\U0001f3fb-\U0001f3ff]?)')
split_emoji = char_with_modifier.findall(emoji_string)
and count the result.
Demo:
>>> import re
>>> from collections import Counter
>>> emoji_string = "👌🏻👌🏼👌🏽👌🏾👌🏿"
>>> char_with_modifier = re.compile(r'(.[\U0001f3fb-\U0001f3ff]?)')
>>> Counter(char_with_modifier.findall(emoji_string))
Counter({'👌🏻': 1, '👌🏼': 1, '👌🏽': 1, '👌🏾': 1, '👌🏿': 1})
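If you want the most_common() list from the question, the same pattern can be wrapped in a small helper. This is only a sketch: count_emoji is a hypothetical name, and the pattern handles nothing beyond the five skin-tone modifiers, so other multi-codepoint sequences (ZWJ sequences, flags) would still be split apart:

```python
import re
from collections import Counter

# Any character, optionally followed by one skin-tone modifier.
CHAR_WITH_MODIFIER = re.compile(r'.[\U0001f3fb-\U0001f3ff]?')


def count_emoji(text):
    """Count characters, keeping a skin-tone modifier with its base emoji."""
    return Counter(CHAR_WITH_MODIFIER.findall(text))


# The OK hand with each of the five modifiers, plus two unmodified OK hands.
ok_hand = "\U0001f44c"
emoji_string = "".join(ok_hand + chr(cp) for cp in range(0x1F3FB, 0x1F3FF + 1))
counts = count_emoji(emoji_string + ok_hand * 2)

for cluster, count in counts.most_common():
    print(cluster, count)
```

Here the unmodified OK hand is counted twice, and each skin-tone variant is counted once as its own key.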