How do I fix this multiline regular expression in Ruby?

I have a regular expression in Ruby that isn't working properly in multiline mode.

I'm trying to convert Markdown text into the Textile-esque markup used in Redmine. The problem is in my regular expression for converting code blocks. It should find any lines that start with four spaces or a tab, then wrap them in pre tags.

markdownText = '# header

some text that precedes code

    var foo = 9;
    var fn = function() {}

    fn();

some post text'

puts markdownText.gsub!(/(^(?:\s{4}|\t).*?$)+/m,"<pre>\n\\1\n</pre>")

Intended result:

# header

some text that precedes code

<pre>
    var foo = 9;
    var fn = function() {}

    fn();
</pre>

some post text

The problem is that the closing pre tag is printed at the end of the document instead of after "fn();". I tried some variations of the following expression but it doesn't match:

gsub!(/(^(?:\s{4}|\t).*?$)+^(\S)/m, "<pre>\n\\1\n</pre>\\2")

How do I get the regular expression to match just the indented code block? You can test this regular expression on Rubular.

asked Apr 19 '11 16:04

DonovanChan

1 Answer

First, note that the 'm' (multi-line) mode in Ruby is equivalent to the 's' (single-line) mode of other languages. In other words, 'm' mode in Ruby means "dot matches all". In Ruby, ^ and $ match at line boundaries with or without the flag.
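A minimal sketch of that difference, using a made-up sample string and two hypothetical helper methods (neither is from the question): the flag only changes what the dot matches, while ^ works per line either way.

```ruby
# Returns the first thing ".*" matches, with or without Ruby's /m flag.
def greedy_dot_match(text, multiline:)
  text[multiline ? /.*/m : /.*/]
end

# Returns the first character of every line; "^" needs no flag in Ruby.
def line_starts(text)
  text.scan(/^\w/)
end

sample = "one\ntwo\nthree"
p greedy_dot_match(sample, multiline: false)  # "." stops at the newline
p greedy_dot_match(sample, multiline: true)   # "." crosses newlines
p line_starts(sample)
```

This is why adding /m to a pattern built around ".*" tends to make it swallow far more text than intended.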

This regex will do a pretty good job of matching a markdown-like code section:

re = / # Match a Markdown code section.
    (\r?\n)              # $1: End of the line before the blank line.
    (                    # $2: CODE contents
      (?:                # Group for multiple lines of code.
        (?:\r?\n)+       # Each line is preceded by one or more newlines,
        (?:[ ]{4}|\t).*  # and begins with four spaces or a tab.
      )+                 # One or more CODE lines
      \r?\n              # Newline ending the last CODE line.
    )                    # End $2: CODE contents
    (?=\r?\n)            # CODE must be followed by a blank line.
    /x
result = subject.gsub(re, '\1<pre>\2</pre>')

This requires a blank line before and after the code section and allows blank lines within the code section itself. It accepts either \r\n or \n line endings. Note that it does not strip the leading four spaces (or tab) from each line; doing that requires more code. (I am not a Ruby guy, so I can't help out with that.)
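One possible way to handle the indentation stripping is the block form of gsub. This is only a sketch: the constant CODE_BLOCK is the pattern above condensed to one line, and the method name wrap_code_blocks is a hypothetical helper, not part of any library.

```ruby
# The pattern from the answer, condensed to one line (same two groups).
CODE_BLOCK = /(\r?\n)((?:(?:\r?\n)+(?:[ ]{4}|\t).*)+\r?\n)(?=\r?\n)/

# Hypothetical helper: wraps each indented block in <pre> tags and removes
# one level of indentation (four spaces or one tab) from every code line.
def wrap_code_blocks(text)
  text.gsub(CODE_BLOCK) do
    # Save the groups first; the nested gsub below overwrites $1 and $2.
    before, code = $1, $2
    "#{before}<pre>#{code.gsub(/^(?: {4}|\t)/, '')}</pre>"
  end
end

markdown_text = "# header\n\nsome text that precedes code\n\n" \
                "    var foo = 9;\n    var fn = function() {}\n\n    fn();\n\n" \
                "some post text"

puts wrap_code_blocks(markdown_text)
```

With this pattern the blank lines around the block are consumed by the match, so the pre tags end up directly after the preceding text and directly before the following text. Add "\n" around the tags in the replacement string if the blank lines should be kept.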

I would recommend looking at the Markdown source itself to see how it's really being done.

answered Nov 04 '22 01:11

ridgerunner

Tags: regex, multiline, ruby