Return all the indexes of a particular substring

Is there a Scala library API method (and if not, an idiomatic way) to obtain a list of all the indexes for a substring (target) within a larger string (source)? I have tried to look through the ScalaDoc, but was not able to find anything obvious. There are SO many methods doing so many useful things, I am guessing I am just not submitting the right search terms.

For example, if I have a source string of "name:Yo,name:Jim,name:name,name:bozo" and I use a target string of "name:", I would like to get back a List[Int] of List(0, 8, 17, 27).

Here's my quick hack to resolve the problem:

def indexesOf(source: String, target: String, index: Int = 0, withinOverlaps: Boolean = false): List[Int] = {
    def recursive(index: Int, accumulator: List[Int]): List[Int] = {
      if (!(index < source.size)) accumulator
      else {
        val position = source.indexOf(target, index)
        if (position == -1) accumulator
        else {
          recursive(position + (if (withinOverlaps) 1 else target.size), position :: accumulator)
        }
      }
    }

    if (target.size <= source.size) {
      if (!source.equals(target)) {
        recursive(0, Nil).reverse
      }
      else List(0)
    }
    else Nil
  }

Any guidance on replacing this with a proper standard library entry point would be greatly appreciated.

UPDATE 2019/Jun/16:

Further code tightening:

  def indexesOf(source: String, target: String, index: Int = 0, withinOverlaps: Boolean = false): List[Int] = {
    def recursive(indexTarget: Int = index, accumulator: List[Int] = Nil): List[Int] = {
      val position = source.indexOf(target, indexTarget)
      if (position == -1)
        accumulator
      else
        recursive(position + (if (withinOverlaps) 1 else target.size), position :: accumulator)
    }
    recursive().reverse
  }

UPDATE 2014/Jul/22:

Inspired by Siddhartha Dutta's answer, I tightened up my code. It now looks like this:

  import scala.annotation.tailrec

  def indexesOf(source: String, target: String, index: Int = 0, withinOverlaps: Boolean = false): List[Int] = {
    @tailrec def recursive(indexTarget: Int, accumulator: List[Int]): List[Int] = {
      val position = source.indexOf(target, indexTarget)
      if (position == -1) accumulator
      else
        recursive(position + (if (withinOverlaps) 1 else target.size), position :: accumulator)
    }
    recursive(index, Nil).reverse
  }

Additionally, if I have a source string of "aaaaaaaa" and I use a target string of "aa", I would like by default to get back a List[Int] of List(0, 2, 4, 6), which skips any match that starts inside a previously found substring. The default can be overridden by passing "true" for the withinOverlaps parameter, which in the "aaaaaaaa"/"aa" case would return List(0, 1, 2, 3, 4, 5, 6).
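For illustration, the overlap behavior described above can also be sketched without explicit recursion, by iterating over successive indexOf calls. This is only a sketch: the name indexesOfIter is hypothetical, and rejecting an empty target is an assumption added here (indexOf with an empty target always succeeds, so the iteration would never terminate).

```scala
// Sketch only: `indexesOfIter` is an illustrative name, not a standard library method.
def indexesOfIter(
    source: String,
    target: String,
    index: Int = 0,
    withinOverlaps: Boolean = false
): List[Int] = {
  // Assumption: an empty target is rejected, since indexOf("", n) never returns -1.
  require(target.nonEmpty, "target must not be empty")

  // Advance by one character to allow overlapping matches,
  // or by the whole target length to skip past each match.
  val step = if (withinOverlaps) 1 else target.length

  Iterator
    .iterate(source.indexOf(target, index))(position => source.indexOf(target, position + step))
    .takeWhile(_ != -1) // indexOf returns -1 once no further match exists
    .toList
}
```

With the strings from the question, indexesOfIter("name:Yo,name:Jim,name:name,name:bozo", "name:") should give List(0, 8, 17, 27), and indexesOfIter("aaaaaaaa", "aa", withinOverlaps = true) should give List(0, 1, 2, 3, 4, 5, 6).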

chaotic3quilibrium asked Jul 21 '14 20:07


2 Answers

I am always inclined to reach into the bag of regex tricks with problems like this one. I wouldn't say it is proper, but it's a hell of a lot less code. :)

val r = "\\Qname:\\E".r
val ex = "name:Yo,name:Jim,name:name,name:bozo"

val is = r.findAllMatchIn(ex).map(_.start).toList

The \\Q and \\E quoting isn't necessary in this case, but it will be if the string you're looking for contains any regex special characters.
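As a rough sketch of the quoting point, Regex.quote can wrap an arbitrary target so that it is matched literally. The helper name regexIndexesOf and its withinOverlaps flag are hypothetical, borrowed from the question for illustration. The lookahead variant is one way to report overlapping matches, since findAllMatchIn on its own returns non-overlapping ones.

```scala
import scala.util.matching.Regex

// Sketch only: `regexIndexesOf` is an illustrative helper, not a library method.
def regexIndexesOf(source: String, target: String, withinOverlaps: Boolean = false): List[Int] = {
  // Regex.quote escapes the target so characters like '.' or '*' match literally.
  val literal = Regex.quote(target)

  // A zero-width lookahead matches at every position where the target starts,
  // which yields overlapping occurrences as well.
  val pattern = (if (withinOverlaps) s"(?=$literal)" else literal).r

  pattern.findAllMatchIn(source).map(_.start).toList
}
```

For example, regexIndexesOf("a.b,axb,a.b", "a.b") should give List(0, 8), whereas the unquoted pattern "a.b".r would also match "axb" at index 4.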

joescii answered Nov 15 '22 03:11


Here is a small method to get all the indexes. Call it as getAllIndexes(source, target):

def getAllIndexes(source: String, target: String, index: Int = 0): List[Int] = {
  val targetIndex = source.indexOf(target, index)
  if (targetIndex != -1)
    // Resume one character past the match, so overlapping matches are included.
    // Note: this recursion is not tail-recursive.
    targetIndex :: getAllIndexes(source, target, targetIndex + 1)
  else
    Nil
}

Siddhartha Dutta answered Nov 15 '22 05:11