Is it possible to remove duplicated rows in Notepad++, leaving only a single occurrence of a line?

Removing duplicate rows in Notepad++

2 Answers

Notepad++ with the TextFX plugin can do this, provided you want to sort the lines and remove the duplicates at the same time.

To install TextFX in the latest release of Notepad++, download it from here: https://sourceforge.net/projects/npp-plugins/files/TextFX

The TextFX plugin used to be bundled with older versions of Notepad++, or could be installed from the menu via Plugins -> Plugin Manager -> Show Plugin Manager -> Available tab -> TextFX -> Install. In some cases it may be listed as TextFX Characters, but this is the same plugin.

The required check boxes and buttons then appear in the menu under TextFX -> TextFX Tools.

Make sure "sort outputs only unique..." is checked. Next, select a block of text (Ctrl+A selects the entire document). Finally, click "sort lines case sensitive" or "sort lines case insensitive".

[Image: menu layout in Notepad++]
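For anyone who wants to check the result outside the editor, the sort-and-keep-unique behaviour can be approximated in a few lines of Python. This is only a sketch of the idea, not TextFX's actual implementation: the function name `sort_unique_lines` is made up for illustration, and the exact sort order TextFX uses (and which spelling it keeps in case-insensitive mode) is an assumption here.

```python
def sort_unique_lines(text: str, case_sensitive: bool = True) -> str:
    """Sort the lines of `text` and keep one copy of each (hypothetical helper)."""
    lines = text.splitlines()
    if case_sensitive:
        # A set drops exact duplicates; sorted() orders by code point.
        unique = sorted(set(lines))
    else:
        # Assumption: keep the first spelling seen for each lower-cased key.
        first_seen = {}
        for line in lines:
            first_seen.setdefault(line.lower(), line)
        unique = [first_seen[key] for key in sorted(first_seen)]
    return "\n".join(unique)


if __name__ == "__main__":
    print(sort_unique_lines("b\na\nb\nc\na"))
```

As with the plugin, the original line order is lost; only the sorted, de-duplicated lines remain.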

answered Sep 22 '22 07:09

Colin Pickard

Since Notepad++ Version 6 you can use this regex in the search and replace dialogue:

^(.*?)$\s+?^(?=.*^\1$)

and replace with nothing. Of each set of duplicate rows, this keeps only the last occurrence in the file.

No sorting is needed for that and the duplicate rows can be anywhere in the file!

You need to check the options "Regular expression" and ". matches newline":

[Image: Notepad++ Replace dialogue]

^ matches the start of the line.
(.*?) matches any characters 0 or more times, but as few as possible. It matches exactly one row; the lazy quantifier is needed because of the ". matches newline" option. The matched row is captured by the surrounding parentheses and is accessible using \1.
$ matches the end of the line.
\s+?^ matches the whitespace characters (the newline!) up to the start of the next row. This removes the line break after the matched row, so that no empty row is left after the replacement.
(?=.*^\1$) is a positive lookahead assertion. This is the important part of the regex: a row is matched (and removed) only when exactly the same row occurs again somewhere later in the file.
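The same pattern can be tried out in Python as a quick sanity check. This is a hedged sketch: Notepad++ uses a different regex engine, so edge cases (for example `\r\n` line endings or blank lines) may not behave identically. Here `re.MULTILINE` stands in for line-anchored `^` and `$`, `re.DOTALL` stands in for the ". matches newline" option, and the function name `remove_earlier_duplicates` is invented for this example.

```python
import re

# Same pattern as in the answer; the flags mimic the dialogue options.
DUPLICATE_ROW = re.compile(r"^(.*?)$\s+?^(?=.*^\1$)", re.MULTILINE | re.DOTALL)


def remove_earlier_duplicates(text: str) -> str:
    """Delete every row that occurs again later, keeping the last occurrence."""
    return DUPLICATE_ROW.sub("", text)


if __name__ == "__main__":
    sample = "a\nb\na\nc\nb\n"
    # The first "a" and the first "b" are removed; the order of the rest is kept.
    print(remove_earlier_duplicates(sample))  # prints a, c, b on separate lines
```

Because the lookahead rescans the rest of the text for every row, this approach can become slow on very large inputs.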

answered Sep 20 '22 07:09

stema

Tags:

duplicates

notepad++

Przemysław Michalski
