
Can regex match all the words outside quotation marks?

I recently typed an essay for my lit class, and my teacher specifically stated a word limit that does not include quotations from the piece. And I thought, why not make a script that calculates that for you? I could, of course, do this the boring way by going through the whole text and ignoring the words inside quotation marks, but I have a feeling that there's a neater way using Regex and Array.count. As I know next to nothing about Regex, can someone help me, or tell me that it's impossible with Regex?

TL;DR: use Regex to match all words (or spaces, it doesn't matter which) that are outside quotation marks in a text, then count the items in the resulting array.

Bluefire asked Oct 22 '25 06:10


1 Answer

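Depending on the requirements, you could use "The Greatest Regex Trick Ever": match the parts you want to skip first, and put the part you want to keep in a capture group.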

"[^"]*"|(\w+)

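Then count only the matches in which the first capture group participated. A match of the quoted alternative leaves that group empty, so it is ignored.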

\w+ matches one or more word characters (letters, digits, and underscore).

See test at regex101.com
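As a rough illustration of the counting step, here is a minimal Python sketch. The question does not name a language, so Python is an assumption, and the function name count_unquoted_words and the sample sentence are made up for the example.

```python
import re

# Quoted spans are matched first and discarded; only words outside
# quotes end up in capture group 1.
PATTERN = re.compile(r'"[^"]*"|(\w+)')

def count_unquoted_words(text: str) -> int:
    """Count words outside double-quoted spans (hypothetical helper)."""
    # group(1) is None when the quoted alternative matched.
    return sum(1 for m in PATTERN.finditer(text) if m.group(1) is not None)

essay = 'The narrator says "call me Ishmael" and then goes to sea.'
print(count_unquoted_words(essay))  # prints 8
```

Note that \w+ stops at apostrophes and hyphens, so a word like "don't" is counted as two matches. Whether that matters depends on how the word limit is defined.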


To also skip single-quoted strings:

"[^"]*"|'[^']*'|(\w+)

See test at regex101.com
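The single-quote variant can be sketched the same way. Again, Python is an assumption, and unquoted_words and the sample text are hypothetical names used only for illustration.

```python
import re

# Same trick with a second "skip" alternative for single-quoted spans.
PATTERN = re.compile(r'''"[^"]*"|'[^']*'|(\w+)''')

def unquoted_words(text: str) -> list[str]:
    """Return words outside double- or single-quoted spans (hypothetical helper)."""
    # With one capture group, findall returns group 1 for every match,
    # which is an empty string whenever a quoted alternative matched.
    return [word for word in PATTERN.findall(text) if word]

sample = '''She said 'no way' and "not now" before leaving.'''
words = unquoted_words(sample)
print(words)       # ['She', 'said', 'and', 'before', 'leaving']
print(len(words))  # 5
```

One caveat with this variant: an apostrophe inside a word, as in "don't", is treated as an opening single quote, so everything up to the next apostrophe is skipped. For essay text with contractions, the double-quote-only pattern is likely the safer choice.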

Jonny 5 answered Oct 24 '25 03:10


