
Remove substrings in ruby

Tags: string, regex, ruby

Given an array of strings,

array1 = ["abcdwillbegoneabcccc","cdefwilbegokkkabcdc"]

and another array of strings that are regex patterns, e.g. ["abcd","beg[o|p]n","bcc","cdef","h*gxwy"]

the task is to remove the substrings that match any of the pattern strings. For example, the output for this case should be:

["willbegonea","wilbegokkk"]

because we have removed the substrings (prematch or postmatch as is appropriate depending on the position of occurrence) that matched one of the patterns. Assume that the one or two matches will always occur at the beginning or towards the end of each string in array1.

Any ideas for an elegant solution to the above in Ruby?

eastafri asked Dec 30 '22 03:12


2 Answers

How about building a single Regex?

array1 = ["abcdwillbegoneabcccc","cdefwilbegokkkabcdc"]

to_remove = ["abcd","beg[o|p]n","bcc","cdef","h*gxwy"]

reg = Regexp.new(to_remove.map{ |s| "(#{s})" }.join('|'))
#=> /(abcd)|(beg[o|p]n)|(bcc)|(cdef)|(h*gxwy)/

array1.map{ |s| s.gsub(reg, '') }
#=>  ["willeacc", "wilbegokkkc"]

Note that my result is different from yours:

["willbegonea","wilbegokkk"]

but I think mine's correct: it removes "abcd", "begon" and "bcc" from the first string, which seems to be what's wanted.
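One detail worth checking in the patterns: `[o|p]` is a character class, so it matches `o`, `p`, or a literal `|`. If only `o` or `p` was meant (an assumption, the question does not say), `[op]` is the tighter spelling. Below is a minimal sketch of the same single-regex idea built with `Regexp.union`; the helper name `strip_patterns` is hypothetical, and it assumes the pattern strings are trusted regex source.

```ruby
# Hypothetical helper, not from the original answer: removes every match of
# any pattern from each string, scanning each string once.
def strip_patterns(strings, patterns)
  # Regexp.new compiles each source string; Regexp.union joins them with "|".
  combined = Regexp.union(patterns.map { |src| Regexp.new(src) })
  strings.map { |s| s.gsub(combined, '') }
end

array1    = ["abcdwillbegoneabcccc", "cdefwilbegokkkabcdc"]
# Assumption: "[o|p]" rewritten as "[op]"; the other patterns are unchanged.
to_remove = ["abcd", "beg[op]n", "bcc", "cdef", "h*gxwy"]

result = strip_patterns(array1, to_remove)
p result   # => ["willeacc", "wilbegokkkc"]

# The difference between the two character classes:
p "beg|n".match?(/beg[o|p]n/)   # => true
p "beg|n".match?(/beg[op]n/)    # => false
```

For this input the result is the same as with the original patterns, since neither string contains a `|`.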

Mike Woodhouse answered Jan 16 '23 07:01


I can see some potential gotchas here, in that if you change the order of the pattern strings, you could get a different result; and also, the second pattern might leave the string in a state that would have matched the first one, only it's too late now.
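To make the ordering issue concrete, here is a small sketch. The input `"axb"`, the two patterns, and the helper `remove_in_order` are made up for illustration and are not from the question.

```ruby
# Hypothetical helper: applies each pattern in turn, as a sequential loop would.
def remove_in_order(string, patterns)
  patterns.reduce(string) { |acc, pattern| acc.gsub(pattern, '') }
end

input = "axb"   # made-up input chosen to expose the effect

# /ab/ runs first and finds nothing; removing "x" afterwards leaves "ab" behind.
first_order = remove_in_order(input, [/ab/, /x/])
p first_order    # => "ab"

# With the order swapped, "x" goes first and /ab/ then matches the remainder.
second_order = remove_in_order(input, [/x/, /ab/])
p second_order   # => ""

# A single alternation scans once, left to right, and never revisits text
# exposed by an earlier removal, so it also leaves "ab".
single_pass = input.gsub(Regexp.union(/ab/, /x/), '')
p single_pass    # => "ab"
```

If text exposed by one removal should itself be removed, one option is to repeat the substitution until the string stops changing, at the cost of extra passes.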

Assuming those are givens, I would go with Yoann's answer. The only way I can slightly improve it is to make the patterns regexen rather than strings, like this:

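string_to_test = "abcdwillbegoneabcccc".dup  # sample input from the question
[/abcd/, /beg[o|p]n/, /bcc/, /cdef/, /h*gxwy/].each do |pattern|
  string_to_test.gsub!(pattern, '')  # remove every match of this pattern in place
end
string_to_test #=> "willeacc"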

But of course if the patterns are coming from somewhere else, maybe they have to be strings.
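If the patterns do arrive as strings, they can be compiled at the point of use. The sketch below shows two options, depending on whether the strings are meant as regex source or as literal text; the variable names are illustrative, and it assumes the strings come from a trusted source.

```ruby
pattern_strings = ["abcd", "beg[o|p]n", "bcc", "cdef", "h*gxwy"]

# Option 1: treat each string as regex source (metacharacters keep their meaning).
as_regexes = pattern_strings.map { |src| Regexp.new(src) }

# Option 2: treat each string as literal text (metacharacters are escaped).
as_literals = pattern_strings.map { |src| Regexp.new(Regexp.escape(src)) }

string_to_test = "abcdwillbegoneabcccc"

# Non-destructive variant of the loop: each step returns a new string.
cleaned = as_regexes.reduce(string_to_test) { |acc, re| acc.gsub(re, '') }
p cleaned                              # => "willeacc"

# The escaped version only matches the pattern text itself.
p as_literals[1].match?("beg[o|p]n")   # => true
p as_literals[1].match?("begon")       # => false
```

`Regexp.new` raises `RegexpError` on malformed source, so untrusted input would need a rescue or validation step.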

Shadowfirebird answered Jan 16 '23 06:01