Extract groups matched by a regex into an array in Scala

I have the following string:

val line:String = "PE018201804527901"

that matches this regex:

(.{2})(.{4})(.{9})(.{2})

I need to extract each group matched by the regex into an Array.

The result should be:

Array("PE", "0182", "018045279", "01")

I tried this:

val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val x= regex.findAllIn(line).toArray

but it doesn't work: x ends up holding only the whole matched string, not the individual groups.

Will, asked May 11 '17 08:05


2 Answers

regex.findAllIn(line).subgroups.toArray
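
The one-liner above assumes that the line actually contains a match. When that is not guaranteed, a minimal sketch along the following lines may be safer. It uses the standard library's findFirstMatchIn and subgroups; the names GroupExtraction and firstGroups are made up for illustration, and the empty-array fallback is an assumption about the desired behavior.

```scala
import scala.util.matching.Regex

object GroupExtraction {
  // Same pattern as in the question: groups of 2, 4, 9 and 2 characters.
  val pattern: Regex = """(.{2})(.{4})(.{9})(.{2})""".r

  // Returns the captured groups of the first match,
  // or an empty array when the line contains no match (assumed fallback).
  def firstGroups(line: String): Array[String] =
    pattern.findFirstMatchIn(line) match {
      case Some(m) => m.subgroups.toArray
      case None    => Array.empty[String]
    }
}
```

For example, GroupExtraction.firstGroups("PE018201804527901") yields the four groups PE, 0182, 018045279 and 01, while a line shorter than 17 characters yields an empty array.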
sheunis, answered Sep 20 '22 07:09


Note that findAllIn does not automatically anchor the regex pattern, so it will also find a match inside a much longer string. If you need to accept only strings that are exactly 17 characters long, you can use a match block like this:

val line = "PE018201804527901"
val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val results = line match {
  case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
  case _ => Array[String]()
}
// Demo printing
results.foreach { m =>
  println(m)
} 
// PE
// 0182
// 018045279
// 01

See a Scala demo.

It also handles the no-match case gracefully by returning an empty string array.
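
To make the anchoring difference concrete, here is a small sketch that runs the same pattern both ways. The object name AnchoringDemo and the method names strict and lenient are hypothetical; only the Regex methods themselves come from the standard library.

```scala
import scala.util.matching.Regex

object AnchoringDemo {
  val regex: Regex = """(.{2})(.{4})(.{9})(.{2})""".r

  // Pattern matching with a Regex extractor requires the WHOLE string to match.
  def strict(line: String): Array[String] = line match {
    case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
    case _                     => Array.empty[String]
  }

  // findFirstMatchIn searches for the pattern anywhere inside the string.
  def lenient(line: String): Array[String] =
    regex.findFirstMatchIn(line) match {
      case Some(m) => m.subgroups.toArray
      case None    => Array.empty[String]
    }
}
```

With the 21-character input "XXPE018201804527901YY", strict returns an empty array because the string as a whole does not match, whereas lenient takes its groups from the first 17 characters (XX, PE01, 820180452, 79).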

If you need all matches and all of their groups, one way is to collect the groups of each match into a list and append that list to a list buffer (scala.collection.mutable.ListBuffer):

import scala.collection.mutable.ListBuffer

val line = "PE018201804527901%E018201804527901"
val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val results = ListBuffer[List[String]]()

val mi = regex.findAllIn(line)
while (mi.hasNext) {
  mi.next() // advance to the next match so that mi.group(n) refers to it
  results += List(mi.group(1), mi.group(2), mi.group(3), mi.group(4))
}
// Demo printing
results.foreach { m =>
  println("------")
  println(m)
  m.foreach { l => println(l) }
}

Results:

------
List(PE, 0182, 018045279, 01)
PE
0182
018045279
01
------
List(%E, 0182, 018045279, 01)
%E
0182
018045279
01

See this Scala demo
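
As an alternative to the mutable buffer and the while loop, the same result can probably be obtained more compactly with findAllMatchIn, which iterates over Match objects that each expose subgroups. This is only a sketch; the names AllMatches and allGroups are invented for illustration.

```scala
import scala.util.matching.Regex

object AllMatches {
  val regex: Regex = """(.{2})(.{4})(.{9})(.{2})""".r

  // One inner list of captured groups per match, in order of appearance.
  def allGroups(line: String): List[List[String]] =
    regex.findAllMatchIn(line).map(m => m.subgroups).toList
}
```

Calling AllMatches.allGroups("PE018201804527901%E018201804527901") gives List(List(PE, 0182, 018045279, 01), List(%E, 0182, 018045279, 01)), the same groups as the loop above, and an input without any match gives an empty list.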

Wiktor Stribiżew, answered Sep 20 '22 07:09