
search and replace with ruby regex

Tags: regex, ruby

I have a text blob column in MySQL that contains HTML. I need to change some of the markup, so I figured I would do it in a Ruby script. The language is not essential here, but it would be nice to see an answer that uses Ruby. The markup looks like the following:

<h5>foo</h5>
  <table>
    <tbody>
    </tbody>
  </table>

<h5>bar</h5>
  <table>
    <tbody>
    </tbody>
  </table>

<h5>meow</h5>
  <table>
    <tbody>
    </tbody>
  </table>

I need to change just the first <h5>foo</h5> block of each text to <h2>something_else</h2> while leaving the rest of the string alone.

I can't seem to get the regex right in Ruby.

randombits asked Jan 16 '11 01:01


1 Answer

# The regex literal syntax using %r{...} allows / in your regex without escaping
new_str = my_str.sub( %r{<h5>[^<]+</h5>}, '<h2>something_else</h2>' )

Using String#sub instead of String#gsub causes only the first replacement to occur. If you need to dynamically choose what 'foo' is, you can use string interpolation in regex literals:

# Regexp.escape keeps any regex metacharacters in searchstr literal
new_str = my_str.sub( %r{<h5>#{Regexp.escape(searchstr)}</h5>}, "<h2>#{replacestr}</h2>" )
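
To make the sub/gsub difference concrete, here is a minimal sketch run against markup shaped like the question's. SAMPLE_HTML and the helper name replace_first_h5 are made up for illustration. The helper passes the search text through Regexp.escape, on the assumption that it is meant as literal text rather than as a pattern.

```ruby
# Sample markup modeled on the question (trimmed for brevity).
SAMPLE_HTML = <<~MARKUP
  <h5>foo</h5>
  <table></table>
  <h5>bar</h5>
  <table></table>
MARKUP

# Hypothetical helper: replace only the first <h5>...</h5> whose text is `search`.
# Regexp.escape keeps characters such as "." or "(" in `search` literal.
def replace_first_h5(str, search, replacement)
  str.sub(%r{<h5>#{Regexp.escape(search)}</h5>}, "<h2>#{replacement}</h2>")
end

first_only  = SAMPLE_HTML.sub(%r{<h5>[^<]+</h5>}, '<h2>something_else</h2>')
all_of_them = SAMPLE_HTML.gsub(%r{<h5>[^<]+</h5>}, '<h2>something_else</h2>')

puts first_only.scan('<h2>').length   # 1
puts all_of_them.scan('<h2>').length  # 2
puts replace_first_h5(SAMPLE_HTML, 'bar', 'baz')
```

Without the escape, a search string such as "a.b" would also match "axb", because "." is a wildcard inside a regex.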

Then again, if you know what 'foo' is, you don't need a regex:

new_str = my_str.sub( "<h5>searchstr</h5>", "<h2>#{replacestr}</h2>" )

or even, changing my_str in place:

my_str[ "<h5>searchstr</h5>" ] = "<h2>#{replacestr}</h2>"
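
The two literal-string forms behave differently when the heading is missing. String#sub returns a copy and leaves the text unchanged if nothing matches, while the []= form modifies the receiver and raises IndexError if the substring is absent. A small sketch of that difference follows. The helper name retitle! and the sample strings are hypothetical.

```ruby
# Hypothetical helper: in-place replacement of the first matching heading,
# with a guard for the case where the heading is not present.
def retitle!(str, old_title, new_title)
  str["<h5>#{old_title}</h5>"] = "<h2>#{new_title}</h2>"
  str
rescue IndexError
  str # heading not found; leave the string untouched
end

page = +"<h5>foo</h5><h5>foo</h5>" # unary plus gives an unfrozen string
retitle!(page, 'foo', 'something_else')
puts page # <h2>something_else</h2><h5>foo</h5>

# sub with a plain string also replaces only the first occurrence,
# and simply returns the original text when there is no match.
puts "<h5>bar</h5>".sub("<h5>foo</h5>", "<h2>x</h2>") # <h5>bar</h5>
```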

If you need to run code to figure out the replacement, you can use the block form of sub:

new_str = my_str.sub %r{<h5>([^<]+)</h5>} do |full_match|
  # The expression returned from this block will be used as the replacement string
  # $1 will be the matched content between the h5 tags.
  "<h2>#{replacestr}</h2>"
end
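
As an illustration of the block form, the sketch below derives the new heading from the captured text instead of using a fixed string. The TITLES lookup table and the method name promote_first_heading are hypothetical, chosen only to show $1 in use.

```ruby
# Hypothetical mapping from old heading text to new heading text.
TITLES = { 'foo' => 'Overview' }.freeze

def promote_first_heading(str)
  str.sub(%r{<h5>([^<]+)</h5>}) do
    # $1 holds the text captured between the h5 tags for this match.
    # Fall back to the original text when no mapping exists.
    "<h2>#{TITLES.fetch($1, $1)}</h2>"
  end
end

puts promote_first_heading("<h5>foo</h5><h5>bar</h5>")
# <h2>Overview</h2><h5>bar</h5>
```

A side benefit of the block form is that the block's return value is inserted as-is, so backslash sequences such as \1 in the replacement text are not interpreted as backreferences.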
Phrogz answered Oct 01 '22 22:10