
Regex to match all words inside parenthesis

Tags: python, regex

Imagine this is a part of a large text:

stuff (word1/Word2/w0rd3) stuff, stuff (word4/word5) stuff/stuff (word6) stuff (word7/word8/word9) stuff / stuff, (w0rd10/word11) stuff stuff (word12) stuff (Word13/w0rd14/word15) stuff-stuff stuff (word16/word17).

I want to extract the words. The result must match:

word1
Word2
w0rd3
word4
word5
word6
word7
word8
word9
w0rd10
word11
word12
Word13
w0rd14
word15
word16
word17

Also the result should not be like:

(word1) or (word1/Word2/w0rd3) 

Basically, no (, ) or / is allowed in the result.

What I have tried:

\((\w+)\/(\w+)\/(\w+)\)[^(]*\((\w+)\/(\w+)\)[^(]*\((\w+)\) 

regex101

This matches those words, but I have to repeat the group once for every word, which is not clean. I also tried txt2re, but its output is just as repetitive and is not a one-line regex. I need a short, one-line regex so that I can use it in an online regex evaluator when no coding environment is available. My preferred engines are Python and C#.


Update: I have added some / characters to the text. Sorry for changing the accepted answer. All answers are correct in some way, but I had to choose the fastest and most efficient regex.

0_o asked Jun 16 '19 11:06


2 Answers

A common solution is to check whether a closing ) lies ahead without any opening ( in between.

\w+\b(?=[^)(]*\))

See this demo at regex101

  • \w+ matches one or more word characters, followed by a \b word boundary
  • at the boundary, the lookahead (?=[^)(]*\)) checks that a closing ) lies ahead, with only characters other than ( and ) in between

Note that this pattern does not check for an opening ( before the word, but often that is not needed.
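Since the question names Python as a preferred engine, here is a minimal sketch of how the pattern above could be applied with re.findall. The names PATTERN, SAMPLE and words_in_parens are made up for this illustration, and SAMPLE is a shortened version of the text from the question.

```python
import re

# Pattern from the answer: a word, then a lookahead that requires a closing ")"
# ahead with no "(" or ")" in between.
PATTERN = re.compile(r"\w+\b(?=[^)(]*\))")

# Shortened version of the sample text from the question (illustrative only).
SAMPLE = "stuff (word1/Word2/w0rd3) stuff, stuff (word4/word5) stuff/stuff (word6)."


def words_in_parens(text):
    """Return every word that has a closing parenthesis ahead of it."""
    return PATTERN.findall(text)


print(words_in_parens(SAMPLE))
```

Because the pattern never looks back for an opening (, an unbalanced input such as "no opening) here" would also yield "no" and "opening". If the text can contain stray closing parentheses, that case needs separate handling.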

bobble bubble answered Sep 27 '22 20:09


Instead of matching the words, you can write a regex that matches the non-words, and split by the regex:

\)?[^)]+?\(|\).+|/

A non-word is either:

  • an optional close parenthesis followed by a bunch of characters that are not close parentheses, followed by an opening parenthesis.
  • a closing parenthesis followed by some text (this is used to match the last bit of the string)
  • a slash

Regex Demo
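In Python, the split approach could look roughly like the sketch below. It assumes re.split is an acceptable way to apply the pattern; NON_WORD, SAMPLE and split_words are hypothetical names, and SAMPLE is a shortened version of the text from the question.

```python
import re

# Pattern from the answer: it matches the separators (the "non-words"),
# not the words themselves.
NON_WORD = re.compile(r"\)?[^)]+?\(|\).+|/")

# Shortened version of the sample text from the question (illustrative only).
SAMPLE = "stuff (word1/Word2/w0rd3) stuff, stuff (word4/word5) stuff/stuff (word6)."


def split_words(text):
    """Split on the non-word pattern and drop the empty pieces."""
    # re.split returns empty strings where a separator touches the start
    # or the end of the text, so those are filtered out.
    return [piece for piece in NON_WORD.split(text) if piece]


print(split_words(SAMPLE))
```

One caveat with this sketch: the \).+ alternative needs at least one character after the last ), so if the text ends exactly at a closing parenthesis, the final piece keeps that ) attached (for example "a (b)" gives "b)").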

Sweeper answered Sep 27 '22 22:09