Can I convert directly between a Swift Character and its Unicode numeric value? That is:
var i:Int = ... // A plain integer index.
var myCodeUnit:UInt16 = myString.utf16[i]
// Would like to say myChar = myCodeUnit as Character, or equivalent.
or...
var j:String.Index = ... // NOT an integer!
var myChar:Character = myString[j]
// Would like to say myCodeUnit = myChar as UInt16
I can say:
myCodeUnit = String(myChar).utf16[0]
but this means creating a new String for each character. And I am doing this thousands of times (parsing text) so that is a lot of new Strings that are immediately being discarded.
The type Character represents a "Unicode grapheme cluster", which can consist of multiple Unicode code points. If you want a single Unicode code point, use the type UnicodeScalar instead.
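To make the distinction concrete, here is a minimal sketch, assuming Swift 5 or later (where Character exposes a unicodeScalars view). The variable names are only illustrative. It shows one Character made of two scalars, and a single-scalar Character converted to its numeric value and back without building a temporary String:

```swift
// "e" followed by a combining acute accent: one Character, two scalars.
let accented: Character = "e\u{301}"
let scalarValues = accented.unicodeScalars.map { $0.value }

// A single-scalar Character converts in both directions
// without creating an intermediate String.
let plain: Character = "A"
let codePoint: UInt32 = plain.unicodeScalars.first!.value

// Unicode.Scalar(UInt32) is failable: it returns nil for surrogates
// and for values above 0x10FFFF, so the force unwrap is only safe
// here because the value came from a real scalar.
let roundTrip = Character(Unicode.Scalar(codePoint)!)
```

For a Character that holds several scalars, there is no single numeric value to return, so the caller has to decide whether to take the first scalar or handle the whole sequence.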
As per the Swift book:
String to Code Unit
To get the code point (ordinal) of each Unicode scalar in the String, iterate over its unicodeScalars view. Note that these are 21-bit code points, not UTF-16 code units:
let yourSwiftString = "甲乙丙丁"
for scalar in yourSwiftString.unicodeScalars {
    print("\(scalar.value) ")
}
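If UTF-16 code units are what the parser actually needs, as in the question, the utf16 view can be copied once into an array, which then allows plain Int indexing. This is a sketch rather than a recommendation. The sample string and names are made up for illustration:

```swift
// U+1D11E lies outside the Basic Multilingual Plane, so it takes
// one scalar but two UTF-16 code units (a surrogate pair).
let sample = "a\u{1D11E}"

let scalarCount = sample.unicodeScalars.count
let unitCount = sample.utf16.count

// One allocation for the whole string, then integer indexing.
let units: [UInt16] = Array(sample.utf16)
let firstUnit = units[0]

// Code units can be turned back into a String in one step.
let rebuilt = String(decoding: units, as: UTF16.self)
```

This trades one array allocation per string for the per-character String allocations described in the question.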
Code Unit to String
The approach I found for converting ordinals back to text is to go through NSString. For example, if you have integer ordinals (32-bit values holding 21-bit code points), you can use the following to convert them to Unicode text:
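import Foundation
// Store the code point in exactly 4 bytes; this relies on a little-endian host.
var i: UInt32 = 22247
var unicode_str = NSString(bytes: &i, length: 4, encoding: String.Encoding.utf32LittleEndian.rawValue)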
If you want to convert several ordinals at once, pack them into a contiguous array of 32-bit values first and pass 4 times the element count as the length.
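As an alternative that avoids Foundation, current Swift versions provide a failable Unicode.Scalar(UInt32) initializer, so an array of code points can be assembled into a String directly. The helper below is hypothetical (its name is not from any library) and silently skips invalid values, which may or may not be what you want:

```swift
// Hypothetical helper: builds a String from 32-bit code points.
// Invalid values (surrogates, anything above 0x10FFFF) are skipped.
func makeString(fromCodePoints codePoints: [UInt32]) -> String {
    var scalars = String.UnicodeScalarView()
    for codePoint in codePoints {
        if let scalar = Unicode.Scalar(codePoint) {
            scalars.append(scalar)
        }
    }
    return String(scalars)
}

let converted = makeString(fromCodePoints: [22247, 0x1D11E])
let withInvalid = makeString(fromCodePoints: [65, 0xD800, 66])
```

Here 0xD800 is a lone surrogate, so it is dropped and withInvalid ends up as "AB".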