Reliable function to get position of substring in string in Swift

Question

This is working well for English:

public static func posOf(needle: String, haystack: String) -> Int {
    return haystack.distance(from: haystack.startIndex, to: (haystack.range(of: needle)?.lowerBound)!)
}

But for non-Latin text such as Hindi, the returned value is too small. For example, "का" is counted as one unit instead of two.

posOf(needle: "काम", haystack: "वह बीना की खुली कोयला खदान में काम करता था।") // 21

I later use the 21 in NSRange(location:length:), where it needs to be 31 to make NSRange work properly.

Martin R · Accepted Answer

A Swift String is a collection of Characters, and each Character represents an "extended Unicode grapheme cluster".

NSString is a collection of UTF-16 code units.

Example:

print("का".count) // 1
print(("का" as NSString).length) // 2
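To make the difference concrete, here is a small sketch that counts the same two-scalar string through each of the views a Swift String offers. The variable name `text` is only illustrative.

```swift
let text = "का"   // क (U+0915) followed by the vowel sign ा (U+093E)

print(text.count)                 // 1: Characters (extended grapheme clusters)
print(text.unicodeScalars.count)  // 2: Unicode scalar values
print(text.utf16.count)           // 2: UTF-16 code units, the unit NSString and NSRange use
print(text.utf8.count)            // 6: UTF-8 bytes (3 per scalar here)
```

Only the utf16 count lines up with what NSString.length reports.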

Swift String ranges are represented as Range<String.Index>, and NSString ranges are represented as NSRange.

Your function counts the number of Characters from the start of the haystack to the start of the needle, and that is different from the number of UTF-16 code units.

If you need an "NSRange compatible" offset, the easiest method is to use the range(of:) method of NSString:

let haystack = "वह बीना की खुली कोयला खदान में काम करता था।"
let needle = "काम"

if let range = haystack.range(of: needle) {
    let pos = haystack.distance(from: haystack.startIndex, to: range.lowerBound)
    print(pos) // 21
}

let nsRange = (haystack as NSString).range(of: needle)
if nsRange.location != NSNotFound {
    print(nsRange.location) // 31
}

Alternatively, use the utf16 view of the Swift string to count UTF-16 code units:

if let range = haystack.range(of: needle) {
    // String.Index is shared across all views of the string (Swift 4 and later),
    // so the lower bound can be measured directly in the utf16 view.
    let pos = haystack.utf16.distance(from: haystack.utf16.startIndex, to: range.lowerBound)
    print(pos) // 31
}
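Wrapped up as a function, the utf16 approach could look like the sketch below. It assumes Swift 4 or later, where String.Index is shared between the string and its utf16 view. The name `utf16Offset(of:in:)` is hypothetical, and it returns nil rather than force-unwrapping, so a missing needle does not crash the way the posOf function in the question would.

```swift
import Foundation

/// Hypothetical helper: the UTF-16 offset of the first occurrence of `needle`
/// in `haystack`, or nil if `needle` does not occur.
func utf16Offset(of needle: String, in haystack: String) -> Int? {
    guard let range = haystack.range(of: needle) else { return nil }
    // Measure the distance in UTF-16 code units, the unit NSRange expects.
    return haystack.utf16.distance(from: haystack.utf16.startIndex, to: range.lowerBound)
}

let sentence = "वह बीना की खुली कोयला खदान में काम करता था।"

if let offset = utf16Offset(of: "काम", in: sentence) {
    print(offset) // 31
}
print(utf16Offset(of: "xyz", in: sentence) == nil) // true
```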

(See for example NSRange to Range<String.Index> for more methods to convert between Range<String.Index> and NSRange).
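If the goal is an NSRange rather than just the location, one of those conversions can be sketched as follows, assuming Swift 4 or later with Foundation, which provides NSRange(_:in:) and Range(_:in:). The variable name `phrase` is only illustrative.

```swift
import Foundation

let phrase = "वह बीना की खुली कोयला खदान में काम करता था।"

if let range = phrase.range(of: "काम") {
    // Range<String.Index> -> NSRange, measured in UTF-16 code units
    let nsRange = NSRange(range, in: phrase)
    print(nsRange.location, nsRange.length) // 31 3

    // NSRange -> Range<String.Index>; the initializer is failable and
    // returns nil when the NSRange does not map onto the string
    if let back = Range(nsRange, in: phrase) {
        print(phrase[back]) // काम
    }
}
```

This avoids computing the location and length by hand before calling NSRange(location:length:).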

Tags: string, encoding, swift

twharmon
