This is working well for English:
public static func posOf(needle: String, haystack: String) -> Int {
    return haystack.distance(from: haystack.startIndex, to: (haystack.range(of: needle)?.lowerBound)!)
}
But for non-English text the returned value is always too small. For example, "का" is counted as one unit instead of two.
posOf(needle: "काम", haystack: "वह बीना की खुली कोयला खदान में काम करता था।") // 21
I later use the 21 in NSRange(location:length:), where it needs to be 28 to make NSRange work properly.
A Swift String is a collection of Characters, and each Character represents an "extended Unicode grapheme cluster". NSString is a collection of UTF-16 code units.
Example:
print("का".count) // 1
print(("का" as NSString).length) // 2
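To make the difference between the two counts more concrete, here is a minimal sketch that prints the length of the same string through several views. It assumes Swift 4 or later (where String is itself a collection of Characters); the constant name syllable is only for illustration.

```swift
import Foundation

// क (U+0915) followed by the vowel sign ा (U+093E)
let syllable = "का"

print(syllable.count)                       // 1: Characters (grapheme clusters)
print(syllable.unicodeScalars.count)        // 2: Unicode scalars
print(syllable.utf16.count)                 // 2: UTF-16 code units
print(NSString(string: syllable).length)    // 2: NSString counts UTF-16 code units
```

The last two values always agree, which is why the utf16 view is the one to use when a number has to be handed to an NSRange.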
Swift String ranges are represented as Range<String.Index>, and NSString ranges are represented as NSRange.
Your function counts the number of Characters from the start of the haystack to the start of the needle, and that is different from the number of UTF-16 code units.
If you need an "NSRange compatible" character count, then the easiest method would be to use the range(of:) method of NSString:
let haystack = "वह बीना की खुली कोयला खदान में काम करता था।"
let needle = "काम"
if let range = haystack.range(of: needle) {
    let pos = haystack.distance(from: haystack.startIndex, to: range.lowerBound)
    print(pos) // 21
}
let nsRange = (haystack as NSString).range(of: needle)
if nsRange.location != NSNotFound {
    print(nsRange.location) // 31
}
Alternatively, use the utf16 view of the Swift string to count UTF-16 code units:
if let range = haystack.range(of: needle),
    let lower16 = range.lowerBound.samePosition(in: haystack.utf16) {
    // samePosition(in:) returns an optional in Swift 4 and later, so it is unwrapped here
    let pos = haystack.utf16.distance(from: haystack.utf16.startIndex, to: lower16)
    print(pos) // 31
}
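Putting this together, the function from the question could be rewritten roughly as follows. This is only a sketch: the name utf16Offset(of:in:) is made up for illustration, it assumes Swift 4 or later (where all string views share String.Index), and it returns nil instead of crashing when the needle is not found.

```swift
import Foundation

/// Hypothetical replacement for `posOf`: returns the offset of `needle`
/// in `haystack`, measured in UTF-16 code units, or nil if it is absent.
func utf16Offset(of needle: String, in haystack: String) -> Int? {
    guard let range = haystack.range(of: needle) else { return nil }
    // Measure the distance in the utf16 view rather than in Characters.
    return haystack.utf16.distance(from: haystack.utf16.startIndex, to: range.lowerBound)
}

let haystack = "वह बीना की खुली कोयला खदान में काम करता था।"
let needle = "काम"

if let location = utf16Offset(of: needle, in: haystack) {
    // The length must also be counted in UTF-16 code units.
    let nsRange = NSRange(location: location, length: needle.utf16.count)
    print(nsRange.location, nsRange.length) // 31 3
}
```

Returning an optional avoids the forced unwrap in the original function, which would trap at runtime whenever the needle does not occur in the haystack.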
(See for example NSRange to Range<String.Index> for more methods to convert between Range<String.Index> and NSRange).
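One such conversion, sketched here on the assumption that Swift 4 or later is available, uses the NSRange(_:in:) and Range(_:in:) initializers from Foundation. This is not necessarily the approach described at the link above, just one option that avoids computing offsets by hand.

```swift
import Foundation

let haystack = "वह बीना की खुली कोयला खदान में काम करता था।"
let needle = "काम"

if let range = haystack.range(of: needle) {
    // Range<String.Index> -> NSRange (location and length in UTF-16 code units)
    let nsRange = NSRange(range, in: haystack)
    print(nsRange.location, nsRange.length) // 31 3

    // NSRange -> Range<String.Index>; nil if the NSRange does not fit the string
    if let roundTrip = Range(nsRange, in: haystack) {
        print(String(haystack[roundTrip]) == needle) // true
    }
}
```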