
Regex to parse image link in Markdown

Tags:

regex

I'm trying to create a regex to parse Markdown image links.

regex:

!\[[^\]]*\]\((.*)\s"(.*[^"])"?\s*\)

Test:

foo

![](image 2.png "hello world")

bar

Group 1 will be image 2.png, and group 2 will be hello world.

The problem appears when I try to parse a link without a title:

foo

![](image 2.png)

bar

How should I modify the regex to make it work in both cases?
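The behaviour described above can be reproduced with a minimal sketch in Python's `re` module. Python is an assumption here, since the question does not name a regex engine; the pattern itself is copied unchanged.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r'!\[[^\]]*\]\((.*)\s"(.*[^"])"?\s*\)')

with_title = '![](image 2.png "hello world")'
without_title = '![](image 2.png)'

m = ORIGINAL.search(with_title)
print(m.group(1))  # image 2.png
print(m.group(2))  # hello world

# The pattern requires whitespace followed by a double quote,
# so an image without a title does not match at all.
print(ORIGINAL.search(without_title))  # None
```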

john c. j. asked May 28 '17 12:05


3 Answers

You have to make the second group optional, since the title is not always there. Named groups also make the pattern a little more readable, for example:

!\[[^\]]*\]\((?<filename>.*?)(?=\"|\))(?<optionalpart>\".*\")?\)

https://regex101.com/r/cSbfvF/3/

Alternatively, your original regex fixed up would be:

!\[[^\]]*\]\((.*?)\s*("(?:.*[^"])")?\s*\)

https://regex101.com/r/u2DwY2/2/
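To check both patterns against the two inputs from the question, here is a minimal sketch in Python (an assumption; the answer does not name a regex engine). Python's `re` spells named groups `(?P<name>...)`, so the first pattern is adapted in that one respect, and `parse_image` is a hypothetical helper introduced only for illustration.

```python
import re

# Second pattern from this answer (the original regex, fixed up), unchanged.
FIXED = re.compile(r'!\[[^\]]*\]\((.*?)\s*("(?:.*[^"])")?\s*\)')

# First pattern from this answer, with (?<name>...) respelled as (?P<name>...)
# because that is the named-group syntax Python's re accepts.
NAMED = re.compile(
    r'!\[[^\]]*\]\((?P<filename>.*?)(?="|\))(?P<optionalpart>".*")?\)'
)


def parse_image(line):
    """Hypothetical helper: (filename, title) of the first image, or None."""
    m = NAMED.search(line)
    if m is None:
        return None
    filename = m.group('filename').strip()  # drop the space before the title
    title = m.group('optionalpart')         # still wrapped in double quotes
    if title is not None:
        title = title.strip('"')
    return filename, title


print(parse_image('![](image 2.png "hello world")'))  # ('image 2.png', 'hello world')
print(parse_image('![](image 2.png)'))                # ('image 2.png', None)

# The fixed-up pattern keeps the quotes inside its second group.
print(FIXED.search('![](image 2.png "hello world")').groups())
# ('image 2.png', '"hello world"')
```

In this sketch the optional group keeps its surrounding quotes, and the lazy `filename` group keeps the space before the title, which is why the helper trims both.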

Scott Weaver answered Nov 04 '22 19:11


Here's a regex that matches both the alt text and the image URL in a Markdown file, using named capture groups:

(?<alt>!\[[^\]]*\])\((?<filename>.*?)(?=\"|\))\)
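
As a rough check of what the two groups capture, here is a small sketch in Python (an assumption, since no engine is named). The only change to the pattern is `(?P<name>...)` in place of `(?<name>...)`, which Python's `re` requires; the `logo` alt text is made up for the example.

```python
import re

# Pattern from this answer, with named groups respelled for Python's re.
PATTERN = re.compile(r'(?P<alt>!\[[^\]]*\])\((?P<filename>.*?)(?="|\))\)')

plain = PATTERN.search('![logo](image 2.png)')
print(plain.group('alt'))       # ![logo]
print(plain.group('filename'))  # image 2.png

# With a title present there is no separate title group, so the lazy
# filename group grows until it reaches the closing parenthesis.
titled = PATTERN.search('![logo](image 2.png "hello world")')
print(titled.group('filename'))  # image 2.png "hello world"
```

Note that the `alt` group includes the `![` and `]` delimiters, and that a title, when present, ends up inside `filename`.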

Divine Hycenth answered Nov 04 '22 18:11


The previously accepted answer only accounts for standard images. However, an image can also be wrapped in a hyperlink, resulting in a nested image reference such as:

[![alt-text](http://example.com/image.png "image title")](http://example.com/some?target)

A more complete regex pattern would look like this:

\[?(!)(?'alt'\[[^\]\[]*\[?[^\]\[]*\]?[^\]\[]*)\]\((?'url'[^\s]+?)(?:\s+(["'])(?'title'.*?)\4)?\)

This pattern also provides named groups for the other information you might want about the image, such as the alt text or the title.
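
To see what the named groups hold in practice, here is a minimal sketch in Python (an assumption; the `(?'name'...)` syntax in the answer suggests a PCRE-style or .NET engine). The groups are respelled as `(?P<name>...)` for Python's `re`, and `describe` is a hypothetical helper used only for this illustration.

```python
import re

# Pattern from this answer with (?'name'...) respelled as (?P<name>...).
# The \4 backreference still points at the quote group, because Python's re
# numbers named and unnamed groups together, left to right.
IMAGE = re.compile(
    r'''\[?(!)(?P<alt>\[[^\]\[]*\[?[^\]\[]*\]?[^\]\[]*)\]\((?P<url>[^\s]+?)(?:\s+(["'])(?P<title>.*?)\4)?\)'''
)


def describe(text):
    """Hypothetical helper: named groups of the first match, or None."""
    m = IMAGE.search(text)
    return m.groupdict() if m else None


plain = describe('![alt-text](http://example.com/image.png "image title")')
print(plain)

nested = describe(
    '[![alt-text](http://example.com/image.png "image title")]'
    '(http://example.com/some?target)'
)
print(nested)
```

Under `re`, the plain image yields `alt` as `[alt-text` (with the opening bracket), `url` as the image URL, and `title` as `image title`. For the nested example the whole construct matches, but `url` ends up holding the outer link target and `alt` absorbs the inner image reference, so it may be worth checking the groups against your own inputs. Because `url` excludes whitespace, the question's `![](image 2.png)` does not match this pattern.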

Doug answered Nov 04 '22 19:11