
How to split this string by regex?

Tags: regex, split, scala

I have some strings that look like this:

div#title.title.top
#main.main
a.bold#empty.red

They are similar to Haml, and I want to split them with a regex, but I don't know how to write it.

val r = """???""".r // HELP
val items = "a.bold#empty.red".split(r)
items // -> "a", ".bold", "#empty", ".red"

How to do this?


UPDATE

Sorry, everyone, but I need to make this question harder. I'm very interested in this pattern:

val r = """(?<=\w)\b"""

But it fails on more complex strings, such as:

div#question-title.title-1.h-222_333

I would like it to be split into:

div
#question-title
.title-1
.h-222_333 

How can I improve that regex?

Freewind asked Mar 13 '11 01:03


1 Answer

val r = """(?<=\w)\b(?!-)"""

Note that split takes a String representing a regular expression, not a Regex, so you must not convert r from String to Regex.
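
To illustrate that point, here is a minimal sketch. The names SplitVariants, viaString and viaRegex are made up for this example. It passes the pattern to String.split as a plain String, and also shows that, if you already hold a Regex, scala.util.matching.Regex has its own split method that takes the input string as its argument.

```scala
import scala.util.matching.Regex

object SplitVariants {
  // The pattern stays a plain String: no .r at the end.
  val pattern: String = """(?<=\w)\b(?!-)"""

  // String.split expects the regular expression as a String.
  def viaString(s: String): List[String] = s.split(pattern).toList

  // If a Regex is what you have, call split on the Regex
  // and hand it the input string instead.
  val regex: Regex = pattern.r
  def viaRegex(s: String): List[String] = regex.split(s).toList
}

// SplitVariants.viaString("a.bold#empty.red") // List(a, .bold, #empty, .red)
// SplitVariants.viaRegex("a.bold#empty.red")  // List(a, .bold, #empty, .red)
```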

Brief explanation on the regex:

  • (?<=...) is a positive look-behind. It states that the match must be preceded by the pattern ..., or, in your case, \w, meaning the match has to come right after a letter, digit, or underscore.

  • \b means word boundary. It is a zero-length match that happens between a word character (letter, digit, or underscore) and a non-word character, or vice versa. Because it is zero-length, split won't remove any characters when splitting.

  • (?!...) is a negative look-ahead. Here I use it to say that I'm not interested in word boundaries that are followed by a dash.
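
Putting the pieces together, the sketch below compares the pattern from the question with the one above on the harder input. The names LookaroundDemo, withoutLookahead, withLookahead and split are only for illustration.

```scala
object LookaroundDemo {
  // Pattern from the question: splits at every boundary that follows a word character.
  val withoutLookahead: String = """(?<=\w)\b"""

  // Pattern from this answer: same, but skips boundaries followed by a dash.
  val withLookahead: String = """(?<=\w)\b(?!-)"""

  def split(input: String, pattern: String): List[String] =
    input.split(pattern).toList

  def main(args: Array[String]): Unit = {
    val input = "div#question-title.title-1.h-222_333"

    // Breaks before every dash as well:
    // div | #question | -title | .title | -1 | .h | -222_333
    println(split(input, withoutLookahead).mkString(" | "))

    // Keeps dashed names together:
    // div | #question-title | .title-1 | .h-222_333
    println(split(input, withLookahead).mkString(" | "))
  }
}
```

The match at the very end of the string produces only a trailing empty string, which split drops, so no empty element shows up in the result.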

Daniel C. Sobral answered Sep 25 '22 19:09