Processing Emojis in SQLite

I am hoping to identify which emojis are used most in a text conversation using SQLite. I am using DB Browser, and the emojis show up just as they do in iMessage (see the picture below), but I am stumped on how to count them.

I was thinking that if there were a way to check whether a character is not a letter, number, or punctuation mark, I could count the frequency of every character that falls outside those categories. That said, I am unfamiliar with SQLite commands and don't know how to accomplish that.

Is there a better way to go about this? Let me know if you need more context to answer this question.

Emoji Example

Dom Vito asked May 04 '26 11:05


1 Answer

The only way I can see to do this with SQLite directly would be to compile SQLite from the source code so you could add support for regex_replace.

However, if you only plan to do this once, recompiling SQLite is probably overkill.

Instead, you could copy your text column into a plain text file, and run the following command:

sed 's/\(.\)/\1\n/g' temp.txt | sed 's/[[:alnum:].-]//g' | sort -r | uniq -c

This would turn the following:

Hello! Are you stuck? 🤔 I saw 🐻🐻🐻 in the park!!!!! 🎂🎂🎂🎂🎂🎂 - all lies. Easy as 123! 😎😎😎😎😎😎😎😎😎😎😎

into:

  1 🤔
 11 😎
  3 🐻
  6 🎂
  1 ?
  7 !
 17
 50

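That should hopefully be close enough to get you to your goal. The last two entries are not emojis: they count the spaces and the empty lines left behind where the second sed removed letters, digits, periods, and hyphens.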
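If Python is available, the same tally can be sketched without sed. This is only a rough equivalent of the pipeline above, not a drop-in replacement: the function name count_symbols is made up for illustration, and unlike the sed version it also skips whitespace, so the space and blank-line entries do not show up. Emojis built from several code points (skin tones, joined family emojis) would be split into their parts here, just as they would be by sed.

```python
from collections import Counter


def count_symbols(text: str) -> Counter:
    """Count each character that is not a letter, digit, '.', '-', or whitespace."""
    counts = Counter()
    for ch in text:
        # Roughly mirrors the sed filter [[:alnum:].-]; whitespace is dropped
        # as well so that only punctuation and emojis remain.
        if ch.isalnum() or ch in ".-" or ch.isspace():
            continue
        counts[ch] += 1
    return counts


# The sample sentence from the answer above.
sample = (
    "Hello! Are you stuck? 🤔 I saw 🐻🐻🐻 in the park!!!!! "
    "🎂🎂🎂🎂🎂🎂 - all lies. Easy as 123! " + "😎" * 11
)

# Print the most frequent characters first.
for ch, n in count_symbols(sample).most_common():
    print(f"{n:3d} {ch}")
```

To run it on your own data, read temp.txt with open("temp.txt", encoding="utf-8").read() and pass the result to count_symbols instead of sample.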

sed is a Linux command, so if you are running Windows you may want to get a Windows version here: https://github.com/mbuilov/sed-windows
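If installing sed is not convenient, another option is to skip the text file and read the column straight from the database with Python's built-in sqlite3 module. The sketch below makes several assumptions: the table name messages and the column name text are placeholders for whatever your database actually uses, and "emoji" is approximated as any character in the Unicode category So (Symbol, other). That category covers most pictographs but also a few non-emoji symbols such as ©, so treat the result as approximate.

```python
import sqlite3
import unicodedata
from collections import Counter


def emoji_counts(conn, table="messages", column="text"):
    """Tally characters in the Unicode 'So' category across one text column.

    The table and column names are hypothetical defaults; they are inserted
    into the query as identifiers, so only pass names you trust.
    """
    counts = Counter()
    query = f'SELECT "{column}" FROM "{table}" WHERE "{column}" IS NOT NULL'
    for (value,) in conn.execute(query):
        for ch in str(value):
            # 'So' = Symbol, other: used here as a rough stand-in for "emoji".
            if unicodedata.category(ch) == "So":
                counts[ch] += 1
    return counts


# Demo on an in-memory database; use sqlite3.connect("path/to/your.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (text TEXT)")
conn.executemany(
    "INSERT INTO messages (text) VALUES (?)",
    [
        ("Are you stuck? 🤔",),
        ("I saw 🐻🐻🐻 in the park!!!!!",),
        ("🎂🎂 all lies 🐻",),
    ],
)

counts = emoji_counts(conn)
for ch, n in counts.most_common():
    print(f"{n:3d} {ch}")
```

Filtering on the Unicode category rather than on "not a letter or number" keeps punctuation such as ! and ? out of the totals.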

paul answered May 06 '26 17:05