
Return an array of replacements in Ruby

I want to take the string foofoofoo, map foo to bar, and return all individual replacements as an array - ['barfoofoo', 'foobarfoo', 'foofoobar']

This is the best I have:

require 'pp'
def replace(string, pattern, replacement)
  results = []
  string.length.times do |idx|
    match_index = (Regexp.new(pattern) =~ string[idx..-1])
    next unless match_index
    match_index = idx + match_index
    prefix = ''
    if match_index > 0
      prefix = string[0..match_index - 1]
    end

    suffix = ''
    if match_index + pattern.length < string.length
      suffix = string[match_index + pattern.length..-1]
    end

    results << prefix + replacement + suffix
  end
  results.uniq
end

pp replace("foofoofoo", 'foo', 'bar')

This works (at least for this test case), but it seems too verbose and hacky. Can I do better, perhaps by using String#gsub with a block or something similar?

Anand asked Feb 19 '26 19:02



2 Answers

It is easy to do with pre_match ($`) and post_match ($'):

    def replace_matches(str, re, repl)
      return enum_for(:replace_matches, str, re, repl) unless block_given?
      str.scan(re) do
        yield "#$`#{repl}#$'"
      end
    end

    str = "foofoofoo"

    # block usage
    replace_matches(str, /foo/, "bar") { |x| puts x }

    # enum usage
    puts replace_matches(str, /foo/, "bar").to_a
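
Since the question asks about String#gsub with a block, here is a minimal sketch of the same pre_match/post_match idea that collects the results into an array. The helper name `single_replacements` is made up for illustration. It relies on gsub setting `Regexp.last_match` for each match it visits, and it ignores gsub's own return value.

```ruby
# Hypothetical helper (not part of the answer above): builds one result per match.
def single_replacements(str, re, repl)
  results = []
  str.gsub(re) do |match|
    m = Regexp.last_match
    # pre_match/post_match refer to the original string, not to gsub's output.
    results << "#{m.pre_match}#{repl}#{m.post_match}"
    match # hand the match back unchanged; gsub's return value is discarded
  end
  results
end

p single_replacements("foofoofoo", /foo/, "bar")
# => ["barfoofoo", "foobarfoo", "foofoobar"]
```

Like the scan version, this only sees non-overlapping matches.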

EDIT: If you have overlapping matches, it becomes harder, as regular expressions aren't really equipped to deal with them. You can do it like this:

def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  re = /(?=(?<pattern>#{re}))/
  str.scan(re) do
    pattern_start = $~.begin(0)
    pattern_end = pattern_start + $~[:pattern].length
    yield str[0 ... pattern_start] + repl + str[pattern_end .. -1]
  end
end

str = "oooo"
replace_matches(str, /oo/, "x") { |x| puts x }

Here we abuse a positive lookahead, which is zero-width, so we can get overlapping matches. However, we also need to know how many characters we matched, which we can't read off as before now that the match itself is zero-width. So we capture the contents of the lookahead in a named group and calculate the match width from that.
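
To see the difference in isolation, the following small sketch compares a plain scan with the lookahead-wrapped one on the same input. The variable names are only illustrative.

```ruby
str = "oooo"

# A plain scan consumes each match, so overlapping occurrences are skipped.
plain = str.scan(/oo/)

# Wrapping the pattern in a lookahead consumes nothing, so every start
# position is tried; the named group records what would have matched.
overlapping = []
str.scan(/(?=(?<pattern>oo))/) do
  m = Regexp.last_match
  overlapping << [m.begin(0), m[:pattern].length]
end

p plain        # => ["oo", "oo"]
p overlapping  # => [[0, 2], [1, 2], [2, 2]]
```

Each pair is a start offset and a match width, which is exactly what the slicing in replace_matches needs.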

(Disclaimer: it will still only match once per character; if you want to consider multiple possibilities at each character, like in your /f|o|fo/ case, it complicates things yet more.)

EDIT: A bit of a tweak and we can even support proper gsub-like behaviour:

def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  new_re = /(?=(?<pattern>#{re}))/
  str.scan(new_re) do
    pattern_start = $~.begin(0)
    pattern_end = pattern_start + $~[:pattern].length
    new_repl = str[pattern_start ... pattern_end].gsub(re, repl)
    yield str[0 ... pattern_start] + new_repl + str[pattern_end .. -1]
  end
end

str = "abcd"
replace_matches(str, /(?<first>\w)(?<second>\w)/, '\k<second>\k<first>').to_a
# => ["bacd", "acbd", "abdc"]

(Disclaimer: the last snippet can't handle cases where the pattern uses lookbehind or lookahead to check outside the match region.)
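
To make that caveat concrete, the sketch below repeats the last definition (only so that it runs on its own) and feeds it a pattern with a lookbehind. The input "ab" and the pattern are made-up test values.

```ruby
# Same definition as the last snippet, repeated so this example is self-contained.
def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  new_re = /(?=(?<pattern>#{re}))/
  str.scan(new_re) do
    pattern_start = $~.begin(0)
    pattern_end = pattern_start + $~[:pattern].length
    new_repl = str[pattern_start ... pattern_end].gsub(re, repl)
    yield str[0 ... pattern_start] + new_repl + str[pattern_end .. -1]
  end
end

# The lookbehind needs the "a" before "b", but gsub only sees the slice "b",
# so no substitution happens and the string comes back unchanged.
p replace_matches("ab", /(?<=a)b/, "x").to_a  # => ["ab"], not ["ax"]
```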

Amadan answered Feb 21 '26 07:02



I don't think Ruby provides such functionality out of the box. However, here's my two cents, which may be more elegant:

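def replace(str, pattern, replacement)
  # pattern must be a String: it doubles as the separator for Array#join below.
  count = str.scan(pattern).count
  # A limit of -1 keeps trailing empty fragments, giving count + 1 pieces.
  fragments = str.split(pattern, -1)

  count.times.map do |occurrence|
    # Rejoin everything before the chosen occurrence, swap it out, rejoin the rest.
    fragments[0..occurrence].join(pattern)
      .concat(replacement)
      .concat(fragments[(occurrence+1)..count].to_a.join(pattern))
  end
end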

Tamer Shlash answered Feb 21 '26 08:02



