
Groovy: is there a way to return all occurrences of a String as a List of Integer offsets?

Given a String, I know Groovy provides convenience methods like:
String.findAll(String, Closure)

Finds all occurrences of a regular expression string within a String. Any matches are passed to the specified closure. The closure is expected to have the full match in the first parameter. If there are any capture groups, they will be placed in subsequent parameters.

However, I am looking for a similar method where the closure receives either the Matcher object or the int offset of the match. Is there such a beast?

Or, if not: is there a common way to return the offsets of all matches for a given String or Pattern as a Collection or Array of Integers / ints? (Commons / Lang or Guava are both OK, but I'd prefer plain Groovy).

Sean Patrick Floyd asked Feb 26 '23 06:02


1 Answer

I don't know of anything built in that does this, but you could add such a method to the metaClass of String if you wanted. Something like:

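String.metaClass.allIndexOf { pat ->
  // ret collects the offsets; idx is the offset of the previous hit (-1 = none yet)
  def (ret, idx) = [ [], -1 ]
  // search again from just past the previous hit until indexOf returns -1
  while( ( idx = delegate.indexOf( pat, idx + 1 ) ) >= 0 ) {
    ret << idx
  }
  ret
}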

Which can be called by:

"Finds all occurrences of a regular expression string".allIndexOf 's'

and returns (in this case)

[4, 20, 40, 41, 46]

Edit

Actually...a version which can work with regular expression parameters would be:

String.metaClass.allIndexOf { pat ->
  def ret = []
  delegate.findAll pat, { s ->
    def idx = -2
    while( ( idx = delegate.indexOf( s, idx + 1 ) ) >= 0 ) {
      ret << idx
    }
  }
  ret
}

Which can then be called like:

"Finds all occurrences of a regular expression string".allIndexOf( /a[lr]/ )

to give:

[6, 32]

Edit 2

And finally, the same code as a Category:

class MyStringUtils {
  static List allIndexOf( String str, pattern ) {
    def ret = []
    str.findAll pattern, { s ->
      def idx = -2
      while( ( idx = str.indexOf( s, idx + 1 ) ) >= 0 ) {
        ret << idx
      }
    }
    ret
  }
}

use( MyStringUtils ) {
  "Finds all occurrences of a regular expression string".allIndexOf( /a[lr]/ )
}
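
One caveat with the regex versions above: they look up each matched substring again with indexOf, so a pattern whose matched text occurs more than once (for example /s/ on the sample sentence) will report the same offsets repeatedly. A minimal sketch that reads the offsets straight from java.util.regex.Matcher instead, assuming a standalone helper (the name matchOffsets is made up for illustration, it is not part of Groovy):

```groovy
// Hypothetical helper: collects the start offset of every regex match.
// Each match is reported exactly once, taken from the Matcher itself.
List<Integer> matchOffsets( String text, String regex ) {
  def offsets = []
  def m = text =~ regex          // =~ yields a java.util.regex.Matcher
  while( m.find() ) {            // advance to the next match, if any
    offsets << m.start()         // offset where the current match begins
  }
  offsets
}

def text = "Finds all occurrences of a regular expression string"
println matchOffsets( text, /a[lr]/ )
println matchOffsets( text, /s/ )
```

With the sample sentence this should print [6, 32] and [4, 20, 40, 41, 46]. The same loop could be dropped into the metaClass or Category variants in place of the findAll / indexOf combination.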
tim_yates answered Feb 28 '23 17:02