
Notepad++ Regex: Find all 1 and 2 letter words

I’m working with a text file of 200,000+ lines in Notepad++. Each line contains exactly one word. I need to remove all words that consist of only one letter (e.g. I) or only two letters (e.g. as).

I thought I could just pass in a regular regex like [a-zA-Z]{1,2}, but it does not recognize anything (I’m trying to Mark them).

I’ve searched manually and I know that words of that length do exist, so it can only be my regex that’s wrong. Does anyone know how to do this in Notepad++?

Cheers,
- Mestika

Emil Devantie Brockdorff asked Dec 26 '22 16:12



1 Answer

If you want to remove only the words but leave the lines empty, this works:

^[a-zA-Z]{1,2}$

Replace this with an empty string. ^ and $ are anchors for the beginning and the end of a line (because Notepad++'s regexes work in multi-line mode).
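To check the anchored pattern outside the editor, here is a minimal Python sketch. It assumes Python's `re` module with `re.MULTILINE` as a stand-in for Notepad++'s line-by-line matching; the sample text is made up for illustration.

```python
import re

# Hypothetical sample: one word per line, as described in the question.
text = "I\nas\ncat\nto\nhouse\n"

# re.MULTILINE makes ^ and $ match at the start and end of every line,
# which approximates the multi-line behaviour described above.
short_word = re.compile(r"^[a-zA-Z]{1,2}$", re.MULTILINE)

# Which lines consist of exactly one or two letters?
matches = short_word.findall(text)

# Replacing with an empty string blanks the words but keeps the lines.
blanked = short_word.sub("", text)

print(matches)
print(repr(blanked))
```

Because of the anchors, "cat" and "house" are left alone even though they contain one- and two-letter substrings.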

If you want to remove the lines completely, search for this:

^[a-zA-Z]{1,2}\r\n

And replace with an empty string. However, this won't work before Notepad++ 6, so make sure yours is up-to-date.

Note that you will have to replace \r\n with the specific line-endings of your file!

As Tim Pietzker suggested, a platform-independent solution that also removes empty lines would be:

^[a-zA-Z]{1,2}[\r\n]+

A platform-independent solution that does not remove empty lines but only those with one or two letters would be:

^[a-zA-Z]{1,2}(\r\n?|\n)
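The difference between the two platform-independent patterns can be seen in a small Python sketch. This is only an approximation of Notepad++'s engine: Python's `re.MULTILINE` treats only `\n` as a line boundary for `^`, so the made-up sample below sticks to `\r\n` and `\n` endings.

```python
import re

# Hypothetical sample mixing Windows (\r\n) and Unix (\n) line endings,
# with an empty line after "I" and another after "to".
text = "I\r\n\r\ncat\r\nas\nto\n\nhouse\n"

# Variant 1: [\r\n]+ also swallows any empty lines after a short word.
greedy = re.compile(r"^[a-zA-Z]{1,2}[\r\n]+", re.MULTILINE)

# Variant 2: consumes exactly one line ending, so empty lines survive.
single = re.compile(r"^[a-zA-Z]{1,2}(\r\n?|\n)", re.MULTILINE)

without_blanks = greedy.sub("", text)
with_blanks = single.sub("", text)

print(repr(without_blanks))
print(repr(with_blanks))
```

In both cases "cat" and "house" keep their original line endings; only the second variant preserves the two empty lines.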
Martin Ender answered Jan 12 '23 20:01