How to match nth occurrence in a string using regular expression

set test {stackoverflowa is a best solution finding site stackoverflowb is a best solution finding site stackoverflowc is a best solution finding sitestackoverflowd is a best solution finding sitestackoverflowe is a best solution finding site}

regexp -all {stackoverflow} $test 

The above gives 5 as output.

regexp {stackoverflow} $test 

The above matches only the first occurrence of stackoverflow (i.e. the one in stackoverflowa).

My requirement is to match the 5th occurrence of stackoverflow (i.e. the one in stackoverflowe) in the string given above.

Could someone please help me with this? Thanks.


velpandian asked Oct 21 '22 16:10


1 Answer

Try

set results [regexp -inline -all {stackoverflow.} $test]
# => stackoverflowa stackoverflowb stackoverflowc stackoverflowd stackoverflowe
puts [lindex $results 4]

I'll be back to explain this further shortly, making pancakes right now.

So.

The command returns a list (-inline) of all (-all) substrings of the string contained in test that match the string "stackoverflow" (without the quotes) followed by one more character, which can be any character. This list is stored in the variable results, and by indexing with 4 (because indexing is zero-based), the fifth element of this list can be retrieved (and, in this case, printed).

The dot at the end of the expression wasn't in your expression: I added it to check that I really did get the right match. You can of course omit the dot to match "stackoverflow" exactly.
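
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a minimal sketch: the proc name nthMatch and the shortened sample string are made up for illustration, n is assumed to be 1-based, and the pattern is assumed to contain no capturing parentheses (with capture groups, -inline -all puts the submatches into the same flat list, so the index would have to be adjusted).

```tcl
# Hypothetical helper (name chosen for illustration): return the n-th
# (1-based) match of pattern in str, or an empty string when there are
# fewer than n matches. Assumes the pattern has no capturing groups.
proc nthMatch {pattern str n} {
    # -inline -all yields a list with one element per match
    set matches [regexp -inline -all -- $pattern $str]
    # lindex is zero-based and returns "" for an out-of-range index
    return [lindex $matches [expr {$n - 1}]]
}

# Shortened sample string, not the one from the question
set sample {stackoverflowa x stackoverflowb x stackoverflowc x stackoverflowd x stackoverflowe}

puts [nthMatch {stackoverflow.} $sample 5]
# => stackoverflowe
```

Because lindex returns an empty string for an index that is out of range, asking for a sixth match here gives an empty result rather than an error.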

ETA (from Donal's comment): in many cases it's convenient to extract not the string itself, but its position and extent within the searched string. The -indices option gives you that (I'm not using the dot in the expression now: the index list makes it obvious which one of the "stackoverflow"s I'm getting anyway):

set indices [regexp -inline -all -indices {stackoverflow} $test]
# => {0 12} {47 59} {94 106} {140 152} {186 198}

You can then use string range to get the string match:

puts [string range $test {*}[lindex $indices 4]]

The lindex $indices 4 gives me the list 186 198; the {*} prefix makes the two elements in that list appear as two separate arguments in the invocation of string range.
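
If you want the start and end positions as named variables instead of expanding them with {*}, lassign can unpack the pair. A small sketch, assuming a shortened sample string (not the one from the question) and the variable names first and last, which are my own choice:

```tcl
# Shortened sample string for illustration
set sample {stackoverflowa and stackoverflowb}

# One {start end} pair per match
set indices [regexp -inline -all -indices {stackoverflow} $sample]

# Unpack the pair for the second match (list index 1) into two variables
lassign [lindex $indices 1] first last
puts "$first $last"
# => 19 31

# The indices are inclusive, so this returns exactly the matched text
puts [string range $sample $first $last]
# => stackoverflow

# Having the position also lets you look past the match itself,
# e.g. include the character that follows it
puts [string range $sample $first [expr {$last + 1}]]
# => stackoverflowb
```

Both lassign and {*} need Tcl 8.5 or later.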

Peter Lewerin answered Oct 23 '22 22:10