Is there a way to delete duplicate lines in a file in Unix?

I can do it with the sort -u and uniq commands, but I want to use sed or awk. Is that possible?
If you don't need to preserve the order of the lines in the file, the sort and uniq commands will do what you need in a very straightforward way. By default, uniq discards all but the first of adjacent repeated lines, so that no output line is repeated (optionally, it can instead print only the duplicate lines). Because uniq only collapses adjacent duplicates, the input is normally sorted first: sort puts the lines in alphanumeric order, and uniq then reduces sequential identical lines to one. sort -u does both steps at once.
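For example, given an illustrative file names.txt (a made-up name) containing repeated lines:

$ cat names.txt
alice
bob
alice
bob

$ sort names.txt | uniq    # sort, then collapse adjacent duplicates
alice
bob

$ sort -u names.txt        # same result in a single step
alice
bob

Note that the original line order is lost; the output is sorted. If you want to remove duplicates while keeping the original order, awk can do it in one pass: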
awk '!seen[$0]++' file.txt
seen is an associative array that AWK adds every input line to, keyed by the line itself ($0). If a line hasn't been encountered yet, seen[$0] evaluates to false. The ! is the logical NOT operator and inverts that false to true, and AWK prints each line for which the whole expression evaluates to true.

The ++ increments seen[$0], so seen[$0] == 1 after the first time a line is found, then seen[$0] == 2, and so on. AWK treats every value except 0 and "" (the empty string) as true, so once a duplicate line is already in seen, !seen[$0] evaluates to false and the line is not written to the output.
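A quick usage sketch (file.txt is just a placeholder name), showing that this keeps the first occurrence of each line, preserves the original order, and needs no sorting:

$ cat file.txt
bob
alice
bob
carol
alice

$ awk '!seen[$0]++' file.txt
bob
alice
carol

If you specifically want sed, it has no associative arrays, so it can only comfortably emulate uniq. This classic one-liner from the widely circulated "useful sed one-liners" collection deletes duplicate consecutive lines, keeping the first line of each run:

$ sed '$!N; /^\(.*\)\n\1$/!P; D' file.txt

Like uniq, it only removes adjacent duplicates, so you would still need to sort the input first to remove all of them.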