I am trying to figure out how to check if a string contains a specific emoji. For example, look at the following two emoji:
Bicyclist: http://unicode.org/emoji/charts/full-emoji-list.html#1f6b4
US Flag: http://unicode.org/emoji/charts/full-emoji-list.html#1f1fa_1f1f8
Bicyclist is U+1F6B4, and the US flag is U+1F1FA U+1F1F8.
However, the emoji to check for are provided to me in an array like this, with just the numerical value in strings:
var checkFor = new string[] {"1F6B4","1F1FA-1F1F8"};
How can I convert those array values into actual unicode characters and check to see if a string contains them?
I can get something working for the Bicyclist, but for the US flag I'm stumped.
For the Bicyclist, I'm doing the following:
const string comparisonStr = "..."; //some string containing text and emoji
var hexVal = Convert.ToInt32(checkFor[0], 16);
var strVal = Char.ConvertFromUtf32(hexVal);
//now I can successfully do the following check
var exists = comparisonStr.Contains(strVal);
But this will not work with the US Flag because of the multiple code points.
You already got past the hard part. All you were missing is splitting each value in the array on the hyphen and concatenating the resulting code points into a single string before performing the check.
Here is a sample program that should work:
using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // Some string containing text and emoji
        const string comparisonStr = "bicyclist: \U0001F6B4, and US flag: \U0001F1FA\U0001F1F8";
        var checkFor = new string[] { "1F6B4", "1F1FA-1F1F8" };
        foreach (var searchStringInHex in checkFor)
        {
            // Split "1F1FA-1F1F8" into hex code points, convert each to its UTF-16 string, then join them
            string searchString = string.Join(string.Empty, searchStringInHex.Split('-')
                .Select(hex => char.ConvertFromUtf32(Convert.ToInt32(hex, 16))));
            if (comparisonStr.Contains(searchString))
            {
                Console.WriteLine($"Found {searchStringInHex}!");
            }
        }
    }
}
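If you need the check in more than one place, the same idea can be pulled out into a small helper. This is only a sketch: the names EmojiSearch, FromHexSequence and ContainsEmoji are made up for illustration, and it assumes every entry is one or more hex code points separated by hyphens, as in your array.

```csharp
using System;
using System.Linq;

static class EmojiSearch
{
    // Hypothetical helper: turns "1F1FA-1F1F8" into the string made of those code points.
    // Assumes the input is hyphen-separated hex values, as in the question's array.
    public static string FromHexSequence(string hexSequence)
    {
        return string.Concat(
            hexSequence.Split('-')
                       .Select(hex => char.ConvertFromUtf32(Convert.ToInt32(hex, 16))));
    }

    // Hypothetical helper: ordinal substring search, so the comparison is
    // on the raw UTF-16 code units rather than culture-sensitive rules.
    public static bool ContainsEmoji(string text, string hexSequence)
    {
        return text.IndexOf(FromHexSequence(hexSequence), StringComparison.Ordinal) >= 0;
    }
}
```

With that in place, EmojiSearch.ContainsEmoji(comparisonStr, "1F1FA-1F1F8") does the flag check in one call. One caveat: this is a plain substring match, so two adjacent flags such as Australia (U+1F1E6 U+1F1FA) followed by Sweden (U+1F1F8 U+1F1EA) would also report the US flag, because the middle two code points are U+1F1FA U+1F1F8.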