The string I want to convert into a character array is ষ্টোর. It is a Bengali word in Unicode.
The problem is that when I convert it in Visual Studio it returns 6 characters, but when I convert it in Android Studio it returns 5 characters.
In VS I am using char[] arrayOfChars = someString.ToCharArray(); and in Android Studio char[] arrayOfChars = someString.toCharArray();
N.B.: My Android Studio IDE and project encoding are both UTF-8. I expect the same result in Android Studio as in Visual Studio.
In C#, ToCharArray() is a string method. It copies the characters of the current string, or of a specified substring of it, into a new array of Unicode (UTF-16) characters.
Those two arrays are canonically equivalent in Unicode, but they are represented in different normalization forms. What seems to be happening is that the string on the Java side is stored in one normalization form, while the string on the C# side is stored in another. Neither toCharArray() in Java nor ToCharArray() in C# normalizes anything; each simply copies the UTF-16 code units the string already contains.
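This page contains a chart of different normalization forms for Bengali text, and the fourth row there describes exactly what you're seeing. The vowel sign ো (U+09CB) can be written either as that single code point or as the decomposed pair ে (U+09C7) + া (U+09BE), which accounts for the difference between 5 and 6 characters.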
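To check which form a given string is actually in, one option is to print each UTF-16 code unit in hexadecimal on both platforms and compare. The following is a minimal Java sketch; the class name CodeUnitDump and the method dump are hypothetical names chosen for illustration, and the sample string is the composed spelling of the word written with escapes so that the source file encoding does not matter.

```java
final class CodeUnitDump {
    // Returns the UTF-16 code units of s as space-separated "U+XXXX" tokens.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Composed spelling: the vowel sign O is the single code point U+09CB.
        String composed = "\u09B7\u09CD\u099F\u09CB\u09B0";
        System.out.println(dump(composed));
        // prints "U+09B7 U+09CD U+099F U+09CB U+09B0"
    }
}
```

If the dump on one platform shows U+09C7 U+09BE where the other shows U+09CB, the two strings differ only in normalization form.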
I am only learning about this now, but it seems to me that the motivation is to let Unicode implementations remain compatible with pre-existing encodings wherever possible and practical.
For example, one pre-existing encoding may have used a single character, while another may have used two characters combined. The solution settled on by the Unicode folks is thus to support both, at the cost of not having a single "canonical" representation, as you've encountered here.
If you want your Java array to be normalized to the "D" normalization form (NFD) that your C# array seems to be using, it appears that this page provides such a function (java.text.Normalizer). You may be looking for something like:
someString = Normalizer.normalize(someString, Normalizer.Form.NFD);
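As a slightly fuller sketch of the same idea, the example below normalizes the word in both directions and compares the array lengths. The class name NormalizationDemo and its helper methods are hypothetical; only java.text.Normalizer and its Form constants come from the standard library. It assumes the Java string starts out in composed form, which matches the 5 characters reported in the question.

```java
import java.text.Normalizer;

final class NormalizationDemo {
    // Composed (NFC) spelling of the word; the vowel sign O is U+09CB.
    static final String COMPOSED = "\u09B7\u09CD\u099F\u09CB\u09B0";

    // Decomposes s canonically (NFD) and returns its UTF-16 code units.
    static char[] toNfdChars(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).toCharArray();
    }

    // Composes s canonically (NFC) and returns its UTF-16 code units.
    static char[] toNfcChars(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC).toCharArray();
    }

    public static void main(String[] args) {
        System.out.println(COMPOSED.toCharArray().length);  // 5
        char[] decomposed = toNfdChars(COMPOSED);
        System.out.println(decomposed.length);              // 6: U+09CB became U+09C7 U+09BE
        // Round trip: composing the decomposed form gives 5 code units again.
        System.out.println(toNfcChars(new String(decomposed)).length);  // 5
    }
}
```

Whichever form you choose, normalizing to the same form on both platforms before calling toCharArray() should make the two arrays match.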
Unicode Standard Annex #15 is the official document that describes these normalization forms.