
How to convert Markdown-style links using regex?

I'm trying to write a regular expression that replaces Markdown-style links, but it doesn't seem to be working. This is what I have so far:

# ruby code:
text = "[link me up](http://www.example.com)"
text.gsub!(%r{\[(\+)\]\((\+)\)}x, %{<a target="_blank" href="\\1">\\2</a>})

What am I doing wrong?

Andrew asked Feb 13 '12 21:02



1 Answer

irb(main):001:0> text = "[link me up](http://www.example.com)"
irb(main):002:0> text.gsub /\[([^\]]+)\]\(([^)]+)\)/, '<a href="\2">\1</a>'
#=> "<a href=\"http://www.example.com\">link me up</a>"

We can use the extended option for Ruby's regex to make it not look like a cat jumped on the keyboard:

def linkup( str )
  str.gsub %r{
    \[         # Literal opening bracket
      (        # Capture what we find in here
        [^\]]+ # One or more characters other than close bracket
      )        # Stop capturing
    \]         # Literal closing bracket
    \(         # Literal opening parenthesis
      (        # Capture what we find in here
        [^)]+  # One or more characters other than close parenthesis
      )        # Stop capturing
    \)         # Literal closing parenthesis
  }x, '<a href="\2">\1</a>'
end

text = "[link me up](http://www.example.com)"
puts linkup(text)
#=> <a href="http://www.example.com">link me up</a>

Note that the above will fail for URLs that have a right parenthesis in them, e.g.

linkup "[O](http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85).aspx)"
#=> <a href="http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85">O</a>.aspx)

If this is important to you, you can replace the [^)]+ with \S+(?=\)), which means "find as many non-whitespace characters as you can, but ensure that there is a ) afterwards".
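As a rough sketch of that substitution (the method name linkup_greedy is made up here for illustration and is not part of the code above), the non-extended one-liner could look like this:

```ruby
# Hypothetical variant of linkup: the URL group is \S+ followed by a
# lookahead for ")", so a ")" inside the URL no longer ends the match early.
def linkup_greedy(str)
  str.gsub(/\[([^\]]+)\]\((\S+)(?=\))\)/, '<a href="\2">\1</a>')
end

puts linkup_greedy("[O](http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85).aspx)")
#=> <a href="http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85).aspx">O</a>
```

The trade-off is that \S+ is greedy: it runs to the last ) in the current run of non-whitespace characters, so a link that is immediately followed by more text containing ) (with no space in between) may swallow too much.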


To answer your question "what am I doing wrong", here's what your regex said:

%r{
  \[      # Literal opening bracket   (good)
    (     # Start capturing           (good)
      \+  # A literal plus character  (OOPS)
    )     # Stop capturing            (good)
  \]      # Literal closing bracket   (good)
  \(      # Literal opening paren     (good)
    (     # Start capturing           (good)
      \+  # A literal plus character  (OOPS)
    )     # Stop capturing            (good)
  \)      # Literal closing paren     (good)
}x
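
To see this in action, here is a small check (the variable names original, fixed and converted are just for this illustration). It assumes Ruby 2.4 or later for match?. The original pattern only matches a link whose text and URL are each a single + character. The fixed version also swaps the backreferences, since group 1 is the link text and group 2 is the URL, while the original replacement appears to have them the other way around:

```ruby
# The pattern from the question: each group holds exactly one literal "+".
original = %r{\[(\+)\]\((\+)\)}x

text = "[link me up](http://www.example.com)"
p original.match?(text)      #=> false
p original.match?("[+](+)")  #=> true

# A minimal fix that keeps the target="_blank" attribute from the question:
# use negated character classes, and put \2 (URL) in href and \1 (text) inside.
fixed = %r{\[([^\]]+)\]\(([^)]+)\)}x
converted = text.gsub(fixed, %{<a target="_blank" href="\\2">\\1</a>})
puts converted
#=> <a target="_blank" href="http://www.example.com">link me up</a>
```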
Phrogz answered Nov 20 '22 10:11
