How can I extract the Unicode code point(s) of a given Character without first converting it to a String? I know that I can use the following:
let ch: Character = "A"
let s = String(ch).unicodeScalars
s[s.startIndex].value // returns 65
but it seems like there should be a more direct way to accomplish this using just Swift's standard library. The Language Guide sections "Working with Characters" and "Unicode" only discuss iterating through the characters in a String, not working directly with Characters.
From what I can gather from the documentation, you are meant to get Character values from a String because the String gives context: is this Character encoded as UTF-8, UTF-16, or 21-bit code points (scalars)?
If you look at how Character is defined in the Swift standard library, it is actually an enum. This is probably because of the different representations exposed by String.utf8, String.utf16, and String.unicodeScalars.
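To make the difference between those three views concrete, here is a minimal sketch. It assumes a reasonably recent Swift toolchain (Swift 5 or later), and the variable names are only illustrative. The same Character produces a different number of units in each view:

```swift
// A single Character made of one scalar outside the Basic Multilingual Plane (U+1F600).
let ch: Character = "\u{1F600}"
let s = String(ch)

let utf8Units = Array(s.utf8)                          // 4 UTF-8 code units
let utf16Units = Array(s.utf16)                        // 2 UTF-16 code units (a surrogate pair)
let scalarValues = s.unicodeScalars.map { $0.value }   // 1 Unicode scalar value

print(utf8Units.count, utf16Units.count, scalarValues.count)
```

Which of these counts as "the" code point depends on the view you pick, which is presumably why the String is treated as the place to make that choice.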
It seems you are not expected to work with Character values directly but rather with Strings, and you as the programmer decide how to get the code units or scalars from the String itself, which allows the encoding to be preserved.
That said, if you need to get the code point in a concise manner, I would recommend an extension like this:
extension Character {
    // Returns the value of the first Unicode scalar in this Character.
    func unicodeScalarCodePoint() -> UInt32 {
        let characterString = String(self)
        let scalars = characterString.unicodeScalars
        return scalars[scalars.startIndex].value
    }
}
Then you can use it like so:
let char: Character = "A"
char.unicodeScalarCodePoint()
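Note that the extension above returns only the first scalar, while a Character can be composed of several scalars (for example a base letter followed by a combining accent). Since the question asks about code point(s), a variant that returns all of them may be useful. This is only a sketch: the name unicodeScalarCodePoints() is made up for illustration, and it assumes Swift 5 or later, where Character also has a unicodeScalars property of its own that could be used in place of the String round trip.

```swift
extension Character {
    // Hypothetical helper (not part of the standard library):
    // returns the value of every Unicode scalar that makes up this Character.
    func unicodeScalarCodePoints() -> [UInt32] {
        return String(self).unicodeScalars.map { $0.value }
    }
}

let plain: Character = "A"
let combined: Character = "e\u{301}"   // "e" + COMBINING ACUTE ACCENT, still one Character

let plainPoints = plain.unicodeScalarCodePoints()        // [65]
let combinedPoints = combined.unicodeScalarCodePoints()  // [101, 769]
print(plainPoints, combinedPoints)
```

Returning an array keeps the single-scalar case simple (take the first element) without silently dropping scalars in the multi-scalar case.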
In summary, string and character encoding is tricky once you factor in all the possibilities. Swift went with this scheme so that each representation can be expressed.
Also remember this is a 1.0 release; I'm sure they will expand Swift's syntactic sugar soon.