BBcode regex for **bold text**

Tags: regex

I'm terrible with regex, but I've had a try and a Google (and even looked in reddit's source) and I'm still stuck so here goes:

My aim is to match the following 'codes' and replace them with the HTML tags. It's just the regex I'm stuck with.

**bold text**
_italic text_
~hyperlink~

Here's my attempt at the bold one:

^\*\*([.^\*]+)\*\*$

Why isn't this working? I'm using the preg syntax.

Ross asked Nov 19 '08 22:11


2 Answers

use:

\*\*(.[^*]*)\*\*

explanation:

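\*\*      // match two literal *'s
(         // start the capture group
.         // match any single character
[^*]*     // followed by zero or more characters that are not a *
)         // end the capture group
\*\*      // match two literal *'s

in a character class "[ ]", "^" only means negation when it's the first character, and "." is just a literal dot. so [.^\*] in the original pattern matches a single ".", "^" or "*" and nothing else, whereas [^*] matches any character that is not a *. (.*) matches anything; (.[^*]*) matches one character and then everything up to the next literal *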
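A minimal sketch of the difference, assuming Python's re module as a stand-in for preg (these particular patterns use no syntax that differs between the two, but that equivalence is an assumption here). The sample strings are made up for illustration:

```python
import re

# The pattern from the question: anchored at both ends, and [.^\*] is a
# class containing only the three literal characters '.', '^' and '*'.
original = re.compile(r"^\*\*([.^\*]+)\*\*$")

# The pattern suggested in this answer.
fixed = re.compile(r"\*\*(.[^*]*)\*\*")

text = "some **bold text** here"

# The anchors alone rule out a match in the middle of a string.
print(original.search(text))             # None

# Even without surrounding text, 'b' is not '.', '^' or '*'.
print(original.search("**bold text**"))  # None

# The original only matches content made of dots, carets and asterisks.
print(original.search("**..**").group(1))  # ..

print(fixed.search(text).group(1))       # bold text
```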

edit: in response to the comments, to allow an asterisk inside the text (i.e. **bold *text**), you'd have to use a non-greedy match:

\*\*(.*?)\*\*

character classes are generally more efficient than non-greedy matches, but it's not possible to group within a character class, so a class can't exclude the two-character sequence ** as a unit (see "Parentheses and Backreferences...")
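To see how the three variants behave when the text contains a single asterisk, here is a rough comparison, again assuming Python's re module in place of preg. The input string is invented for the example:

```python
import re

text = "**bold *text** and **more**"

negated = re.compile(r"\*\*(.[^*]*)\*\*")  # negated character class
lazy = re.compile(r"\*\*(.*?)\*\*")        # non-greedy quantifier
greedy = re.compile(r"\*\*(.*)\*\*")       # greedy quantifier, for contrast

# The class stops at the lone '*', so the first bold span never matches;
# the engine then pairs the closing ** with the next opening ** instead.
print(negated.findall(text))  # [' and ']

# The lazy version stops at the first '**' it can, one span at a time.
print(lazy.findall(text))     # ['bold *text', 'more']

# The greedy version runs to the last '**' in the string.
print(greedy.findall(text))   # ['bold *text** and **more']
```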

Owen answered Oct 02 '22 13:10


First of all, get rid of the ^ and the $. Using those will only match a string that starts with ** and ends with **. Second, use a non-greedy (lazy) quantifier to match as little text as possible, instead of making a character class for all characters other than asterisks.

Here's what I suggest:

\*\*(.+?)\*\*
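
As a sketch of how this pattern could be used for the replacement itself, assuming Python's re module rather than preg and assuming <b> as the target tag (the question does not say which HTML tag is wanted). The helper name bold_to_html is hypothetical:

```python
import re

BOLD = re.compile(r"\*\*(.+?)\*\*")
ANCHORED = re.compile(r"^\*\*(.+?)\*\*$")  # same pattern with ^ and $ kept


def bold_to_html(text: str) -> str:
    """Wrap every **...** span in <b> tags (hypothetical helper)."""
    return BOLD.sub(r"<b>\1</b>", text)


print(bold_to_html("a **b** c **d**"))  # a <b>b</b> c <b>d</b>

# .+? needs at least one character, so an empty pair is left alone.
print(bold_to_html("****"))             # ****

# With the anchors, bold text inside a longer string is never found.
print(ANCHORED.search("say **hi** now"))  # None
```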
Paige Ruten answered Oct 02 '22 12:10