What I want is something like
"word1 word2 word3".rangeOfWord(2) => 6 to 10
The result could come as a Range or a tuple or whatever.
I'd rather not do the brute force of iterating over the characters and using a state machine. Why reinvent the lexer? Is there a better way?
In your example, your words are unique, and you can use the following method:
let myString = "word1 word2 word3"
let wordNum = 2
let myRange = myString.rangeOfString(myString.componentsSeparatedByString(" ")[wordNum-1])
// 6..<11
As pointed out by Andrew Duncan in the comments below, the above is only valid if your words are unique. If you have non-unique words, you can use this somewhat less neater method:
let myString = "word1 word2 word3 word2 word1 word3 word1"
let wordNum = 7 // 2nd instance (out of 3) of "word1"
let arr = myString.componentsSeparatedByString(" ")
var fromIndex = arr[0..<wordNum-1].map { $0.characters.count }.reduce(0, combine: +) + wordNum - 1
let myRange = Range<String.Index>(start: myString.startIndex.advancedBy(fromIndex), end: myString.startIndex.advancedBy(fromIndex+arr[wordNum-1].characters.count))
let myWord = myString.substringWithRange(myRange)
// string "word1" (from range 36..<41)
Finally, lets use the latter to construct an extension of String
as you have wished for in your question example:
extension String {
private func rangeOfNthWord(wordNum: Int, wordSeparator: String) -> Range<String.Index>? {
let arr = myString.componentsSeparatedByString(wordSeparator)
if arr.count < wordNum {
return nil
}
else {
let fromIndex = arr[0..<wordNum-1].map { $0.characters.count }.reduce(0, combine: +) + (wordNum - 1)*wordSeparator.characters.count
return Range<String.Index>(start: myString.startIndex.advancedBy(fromIndex), end: myString.startIndex.advancedBy(fromIndex+arr[wordNum-1].characters.count))
}
}
}
let myString = "word1 word2 word3 word2 word1 word3 word1"
let wordNum = 7 // 2nd instance (out of 3) of "word1"
if let myRange = myString.rangeOfNthWord(wordNum, wordSeparator: " ") {
// myRange: 36..<41
print(myString.substringWithRange(myRange)) // prints "word1"
}
You can tweak the .rangeOfNthWord(...)
method if word separation is not unique (say some words are separated by two blankspaces " "
).
Also pointed out in the comments below, the use of .rangeOfString(...)
is not, per se, pure Swift. It is, however, by no means bad practice. From Swift Language Guide - Strings and Characters:
Swift’s String type is bridged with Foundation’s NSString class. If you are working with the Foundation framework in Cocoa, the entire NSString API is available to call on any String value you create when type cast to NSString, as described in AnyObject. You can also use a String value with any API that requires an NSString instance.
See also the NSString class reference for rangeOfString method:
// Swift Declaration:
func rangeOfString(_ searchString: String) -> NSRange
I went ahead and wrote the state machine. (Grumble..) FWIW, here it is:
extension String {
private func halfOpenIntervalOfBlock(n:Int, separator sep:Character? = nil) -> (Int, Int)? {
enum State {
case InSeparator
case InPrecedingSeparator
case InWord
case InTarget
case Done
}
guard n > 0 else {
return nil
}
var state:State
if n == 1 {
state = .InPrecedingSeparator
} else {
state = .InSeparator
}
var separatorNum = 0
var startIndex:Int = 0
var endIndex:Int = 0
for (i, c) in self.characters.enumerate() {
let inSeparator:Bool
// A bit inefficient to keep doing this test.
if let s = sep {
inSeparator = c == s
} else {
inSeparator = c == " " || c == "\n"
}
endIndex = i
switch state {
case .InPrecedingSeparator:
if !inSeparator {
state = .InTarget
startIndex = i
}
case .InTarget:
if inSeparator {
state = .Done
}
case .InWord:
if inSeparator {
separatorNum += 1
if separatorNum == n - 1 {
state = .InPrecedingSeparator
} else {
state = .InSeparator
}
}
case .InSeparator:
if !inSeparator {
state = .InWord
}
case .Done:
break
}
if state == .Done {
break
}
}
if state == .Done {
return (startIndex, endIndex)
} else if state == .InTarget {
return (startIndex, endIndex + 1) // We ran off end.
} else {
return nil
}
}
func rangeOfWord(n:Int) -> Range<Index>? {
guard let (s, e) = self.halfOpenIntervalOfBlock(n) else {
return nil
}
let ss = self.startIndex.advancedBy(s)
let ee = self.startIndex.advancedBy(e)
return Range(start:ss, end:ee)
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With