I have this code in Java and it works fine:
String a = "ABC";
System.out.println(a.length());
for (int n = 0; n < a.length(); n++)
System.out.println(a.codePointAt(n));
The output, as expected, is 3 65 66 67. I am a little confused about a.length(), because it is supposed to return the length in chars, but String must store every char below 256 in 16 bits, or whatever a Unicode character would need.
But the question is: how can I do the same in C#? I need to scan a string and act depending on some Unicode characters found.
The real code I need to translate is
String str = this.getString();
int cp;
boolean escaping = false;
for (int n = 0; n < len; n++)
{
//===================================================
cp = str.codePointAt(n); //LOOKING FOR SOME EQUIVALENT IN C#
//===================================================
if (!escaping)
{
....
//Closing all braces below.
Thanks in advance.
How much I love Java :). I just need to deliver a Windows app that is a client of a Java / Linux app server.
The exact translation would be this :
string a = "ABC⤶"; //Let's throw in a rare unicode char
Console.WriteLine(a.Length);
for (int n = 0; n < a.Length; n++)
Console.WriteLine((int)a[n]); //a[n] returns a char, which we can cast in an integer
//final result : 4 65 66 67 10550
In C# you don't need codePointAt at all: you can get the Unicode number directly by casting the character to an int (in an assignment, the conversion is implicit). So you can get your cp simply by doing
cp = (int)str[n];
How much I love C# :)
However, this is valid only for characters in the Basic Multilingual Plane (code points up to U+FFFF). Characters above that range are stored as surrogate pairs, which are handled as two separate chars when you index into the string, so they won't be printed as one value. If you really need to handle full UTF-32 code points, you can refer to this answer, which basically uses
int cp = Char.ConvertToUtf32(a, n);
and advances the loop index by two whenever Char.IsSurrogatePair() reports a pair (because such a character is encoded as two chars).
Your translation would then become
string a = "ABC\U0001F01C";
Console.WriteLine(a.Count(x => !char.IsHighSurrogate(x))); // needs "using System.Linq;"
for (var i = 0; i < a.Length; i += char.IsSurrogatePair(a, i) ? 2 : 1)
Console.WriteLine(char.ConvertToUtf32(a, i));
Please note the change from a.Length to a little bit of LINQ for the count, because a surrogate pair is counted as two chars. We simply count the chars that are not high surrogates to get the number of actual characters.
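To put the pieces above together, here is a minimal sketch of a reusable helper, assuming you want something close to Java's codePointAt loop. The names CodePointDemo and CodePoints are made up for this illustration and are not part of the .NET API; only the char.* calls are. How to treat an unpaired surrogate is also an assumption here: this sketch returns its raw UTF-16 value instead of letting char.ConvertToUtf32 throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CodePointDemo
{
    // Hypothetical helper (not a framework method): yields each Unicode
    // code point of s, consuming two chars when they form a surrogate pair.
    public static IEnumerable<int> CodePoints(string s)
    {
        for (int i = 0; i < s.Length; i += char.IsSurrogatePair(s, i) ? 2 : 1)
        {
            // Assumption: an unpaired surrogate is passed through as its raw
            // UTF-16 value, because char.ConvertToUtf32 would throw on it.
            bool loneSurrogate = char.IsSurrogate(s, i) && !char.IsSurrogatePair(s, i);
            yield return loneSurrogate ? s[i] : char.ConvertToUtf32(s, i);
        }
    }

    static void Main()
    {
        string a = "ABC\U0001F01C";

        Console.WriteLine(a.Length);               // 5: UTF-16 code units
        Console.WriteLine(CodePoints(a).Count());  // 4: code points

        foreach (int cp in CodePoints(a))
            Console.WriteLine(cp);                 // 65, 66, 67, 127004
    }
}
```

In the real scanning loop, the body would then work on the cp values from foreach (int cp in CodePoints(str)) rather than indexing the string directly.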