
Retaining the pattern characters while splitting via Regex, Ruby

I have the following string

str="HelloWorld How areYou I AmFine"

I want to split this string into the following array:

["Hello","World How are","You I Am", "Fine"]

I have been using the following regex. It splits at the right places, but it also drops the characters matched by the pattern, and I want to keep them. What I get is:

str.split(/[a-z][A-Z]/)
 => ["Hell", "orld How ar", "ou I A", "ine"] 

It omits the characters matched by the pattern.

Can anyone help me retain these characters in the resulting array?

Nadeem Yasin asked Apr 03 '12 11:04


2 Answers

In Ruby 1.9 you can use positive lookahead and positive lookbehind (lookahead and lookbehind regex constructs are also called zero-width assertions). They check the characters around a position without consuming them, so the match itself is empty and you won't lose your border characters:

str.split(/(?<=[a-z])(?=[A-Z])/)
=> ["Hello", "World How are", "You I Am", "Fine"] 
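To make the difference concrete, here is a minimal sketch that runs both patterns on the string from the question. The variable names `consuming` and `zero_width` are only illustrative. It assumes Ruby 1.9 or later, since it uses lookbehind.

```ruby
str = "HelloWorld How areYou I AmFine"

# Consuming pattern: the lowercase/uppercase pair itself is the delimiter,
# so split throws those two characters away.
consuming = str.split(/[a-z][A-Z]/)

# Zero-width pattern: it matches the empty position between a lowercase
# letter and an uppercase letter, so split has nothing to throw away.
zero_width = str.split(/(?<=[a-z])(?=[A-Z])/)

p consuming   # => ["Hell", "orld How ar", "ou I A", "ine"]
p zero_width  # => ["Hello", "World How are", "You I Am", "Fine"]
```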

Ruby 1.8 does not support lookbehind (it supports lookahead only), so the split above will not work there. I recommend using Ruby 1.9 if possible.

If you are forced to use Ruby 1.8.7, I think regex won't help you, and the best solution I can think of is to build a simple state machine: iterate over each character of the original string and build the first string until you encounter the border condition, then start building the second string, and so on.
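The character-by-character approach could look roughly like the sketch below. The method name `split_on_case_border` is hypothetical, and the border condition is assumed to be "a lowercase ASCII letter immediately followed by an uppercase ASCII letter", as in the question. It avoids lookbehind entirely.

```ruby
# Hypothetical helper (the name is illustrative, not from any library).
# Splits a string at every position where a lowercase ASCII letter is
# immediately followed by an uppercase ASCII letter, keeping both letters.
def split_on_case_border(str)
  parts = []
  current = String.new
  prev = nil

  str.each_char do |ch|
    # Border condition: previous char is a-z and current char is A-Z.
    if prev && prev =~ /[a-z]/ && ch =~ /[A-Z]/
      parts << current
      current = String.new
    end
    current << ch
    prev = ch
  end

  # Flush the piece that was still being built when the input ended.
  parts << current unless current.empty?
  parts
end

p split_on_case_border("HelloWorld How areYou I AmFine")
# => ["Hello", "World How are", "You I Am", "Fine"]
```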

Aliaksei Kliuchnikau answered Nov 18 '22 14:11


Three answers so far, each with a limitation: one is Rails-only and breaks with an underscore in the original string, another is Ruby 1.9 only, and the third always has a potential error with its special character. I really liked the split on zero-width assertion answer from @Alex Kliuchnikau, but the OP needs Ruby 1.8, which doesn't support lookbehind. Here is an answer that uses only zero-width lookahead and works fine in 1.8 and 1.9, using String#scan instead of #split:

str.scan(/.*?[a-z](?=[A-Z]|$)/)
=> ["Hello", "World How are", "You I Am", "Fine"]
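One thing worth checking before relying on the scan version: every piece it returns must end in a lowercase letter, so text that cannot end that way is silently dropped rather than returned. A small sketch, where the second input string is a made-up example and not from the question:

```ruby
pattern = /.*?[a-z](?=[A-Z]|$)/

# The string from the question: every piece ends in a lowercase letter.
original = "HelloWorld How areYou I AmFine".scan(pattern)

# Hypothetical input whose last piece ends in a digit: "World 42" never
# satisfies "lowercase letter followed by uppercase or end of line".
with_digits = "HelloWorld 42".scan(pattern)

p original     # => ["Hello", "World How are", "You I Am", "Fine"]
p with_digits  # => ["Hello"]
```

For inputs like the one in the question this makes no difference, but it is a reason to prefer the lookbehind split where Ruby 1.9 is available.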
dbenhur answered Nov 18 '22 14:11