My string delimiter is ;
. Delimiter is escaped in the string as \;
. E.g.,
irb(main):018:0> s = "a;b;;d\\;e"
=> "a;b;;d\\;e"
irb(main):019:0> s.split(';')
=> ["a", "b", "", "d\\", "e"]
Could someone suggest me regex so the output of split would be ["a", "b", "", "d\\;e"]
? I'm using Ruby 1.8.7
1.8.7 doesn't have negative lookbehind without Oniguruma (which may be compiled in).
1.9.3; yay:
> s = "a;b;c\\;d"
=> "a;b;c\\;d"
> s.split /(?<!\\);/
=> ["a", "b", "c\\;d"]
1.8.7 with Oniguruma doesn't offer a trivial split, but you can get match offsets and pull apart the substrings that way. I assume there's a better way to do this I'm not remembering:
> require 'oniguruma'
> re = Oniguruma::ORegexp.new "(?<!\\\\);"
> s = "hello;there\\;nope;yestho"
> re.match_all s
=> [#<MatchData ";">, #<MatchData ";">]
> mds = re.match_all s
=> [#<MatchData ";">, #<MatchData ";">]
> mds.collect {|md| md.offset}
=> [[5, 6], [17, 18]]
Other options include:
;
and post-processing the results looking for trailing \\
, orAs @dave-newton answered, you could use negative lookbehind, but that isn't supported in 1.8. An alternative that will work in both 1.8 and 1.9, is to use String#scan instead of split, with a pattern accepting not (semicolon or backslash) or anychar prefixed by backlash:
$ irb
>> RUBY_VERSION
=> "1.8.7"
>> s = "a;b;c\\;d"
=> "a;b;c\\;d"
s.scan /(?:[^;\\]|\\.)+/
=> ["a", "b", "c\\;d"]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With