I'm trying to filter non-alphabetical characters out of a String, but I'm running into the issue that CharacterSet uses Unicode.Scalar and String consists of Character.
Xcode gives the error:
Cannot convert value of type 'String.Element' (aka 'Character') to specified type 'Unicode.Scalar?'
let name = "name"
let allowedCharacters = CharacterSet.alphanumerics
let filteredName = name.filter { (c) -> Bool in
    if let s: Unicode.Scalar = c { // cannot convert
        return !allowedCharacters.contains(s)
    }
    return true
}
The Unicode.Scalar type, representing a single Unicode scalar value, is the element type of a string's unicodeScalars collection. You can create a Unicode.Scalar instance by using a string literal that contains a single character representing exactly one Unicode scalar value.
Unicode scalar values are the 21-bit codes that are the basic unit of Unicode. Each scalar value is represented by a Unicode.Scalar instance and is equivalent to a UTF-32 code unit. Some characters that are visible in a string are made up of more than one Unicode scalar value.
In Swift, Character is a data type that represents a single-character string ("a", "@", "5", etc.). A variable of type Character can only store single-character data.
CharacterSet has an unfortunate name inherited from Objective-C. In reality, it is a set of Unicode.Scalars, not of Characters (“extended grapheme clusters” in Unicode parlance). This is necessary, because while there is a finite set of Unicode scalars, there is an infinite number of possible grapheme clusters. For example, e + ◌̄ + ◌̄ + ◌̄ ... ad infinitum is still just one cluster. As such, it is impossible to exhaustively list all possible clusters, and it is often impossible to list the subset of them that has a particular property. Set operations such as those in the question must use scalars instead (or at least use definitions derived from the component scalars).
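To make the cluster-versus-scalar distinction concrete, here is a small sketch (the constant name cluster is only illustrative) that builds the e-plus-macrons example and counts it both ways:

```swift
// "e" followed by three combining macrons (U+0304).
let cluster = "e\u{0304}\u{0304}\u{0304}"

// String.count counts Characters (extended grapheme clusters).
print(cluster.count)                 // 1

// The unicodeScalars view counts the underlying Unicode scalars.
print(cluster.unicodeScalars.count)  // 4
```

Adding more macrons would raise the scalar count while the Character count stays at one.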
In Swift, Strings have a unicodeScalars property for operating on the string at the scalar level, and the property is directly mutable. That enables you to do things like this:
// Assuming...
var name: String = "..."
// ...then...
name.unicodeScalars.removeAll(where: { !CharacterSet.alphanumerics.contains($0) })
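As a runnable sketch of the same idea, the removal can be wrapped in a small function. The name strippingNonAlphanumericScalars is hypothetical and only serves this example; the input string is the test input used further down.

```swift
import Foundation

// Hypothetical helper: returns a copy of `input` with every Unicode scalar
// that is not in CharacterSet.alphanumerics removed.
func strippingNonAlphanumericScalars(_ input: String) -> String {
    var result = input
    // The unicodeScalars view is mutable, so scalars can be removed in place.
    result.unicodeScalars.removeAll(where: { !CharacterSet.alphanumerics.contains($0) })
    return result
}

print(strippingNonAlphanumericScalars("asd😊1")) // asd1
```

Note that this works scalar by scalar, so in principle it can remove only part of a multi-scalar Character.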
A single Character can consist of several UnicodeScalars, so you need to iterate through all of them and check if they are contained in CharacterSet.alphanumerics.
let allowedCharacters = CharacterSet.alphanumerics
let filteredName = name.filter { (c) -> Bool in
    return !c.unicodeScalars.contains(where: { !allowedCharacters.contains($0) })
}
Test input: let name = "asd😊1"
Test output: "asd1"
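The double negation in the filter above can also be expressed with allSatisfy, which some may find easier to read. This is a sketch of an equivalent formulation, not the answer's original code, and the function name keepingAlphanumericCharacters is hypothetical:

```swift
import Foundation

// Hypothetical helper: keeps a Character only if every one of its
// Unicode scalars is contained in CharacterSet.alphanumerics.
func keepingAlphanumericCharacters(_ input: String) -> String {
    let allowedCharacters = CharacterSet.alphanumerics
    return input.filter { character in
        character.unicodeScalars.allSatisfy { allowedCharacters.contains($0) }
    }
}

print(keepingAlphanumericCharacters("asd😊1")) // asd1
```

Unlike removing scalars directly from the unicodeScalars view, this keeps or drops each Character as a whole.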