The uniq command has a "-d" option which lists only the duplicate records. The sort command is used first because uniq only detects duplicates on adjacent lines. Without "-d", uniq instead removes the duplicate records, keeping a single copy of each.
Use the -k option to sort on a certain column. For example, use “-k 2” to sort on the second column.
The uniq command accepts input from a text file and removes repeated lines, but only when they are adjacent to each other. That is why it is used in conjunction with sort when the duplicates are not adjacent. Case differences can be ignored when dropping duplicate adjacent lines by using the -i option.
The uniq command can also count and print the number of occurrences of each line (-c). Just as with duplicate lines, we can filter for unique (non-duplicated) lines with -u, again optionally ignoring case. We can skip fields (-f) or characters (-s) before comparing lines, and limit the comparison to a given number of characters (-w).
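For example, with a made-up file fruits.txt containing:

apple
banana
apple
banana
banana
cherry

sort fruits.txt | uniq -d

gives:

apple
banana

and

sort fruits.txt | uniq -c

gives:

      2 apple
      3 banana
      1 cherry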
sort -u -t, -k1,1 file
-u              for unique
-t,             so comma is the delimiter
-k1,1           for the key field 1

Test result:
[email protected],2009-11-27 00:58:29.793000000,xx3.net,255.255.255.0
[email protected],2009-11-27 01:05:47.893000000,xx2.net,127.0.0.1
awk -F"," '!_[$1]++' file
-F              sets the field separator
$1              is the first field
_[val]          looks up val in the hash _ (a regular variable)
++              increments, and returns the old value
!               returns logical not
To consider multiple columns, sort and give a unique list based on column 1 and column 3:
sort -u -t : -k 1,1 -k 3,3 test.txt
-t :            colon is the separator
-k 1,1 -k 3,3   based on column 1 and column 3

Or if you want to use uniq:
<mycvs.cvs tr -s ',' ' ' | awk '{print $3" "$2" "$1}' | uniq -c -f2
gives:
1 01:05:47.893000000 2009-11-27 [email protected]
2 00:58:29.793000000 2009-11-27 [email protected]
1
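The same idea extends to awk; as a sketch, assuming comma-separated input as in the earlier awk example (use -F":" instead for colon-delimited files):

awk -F"," '!_[$1 FS $3]++' file

Here $1 FS $3 joins columns 1 and 3 into a single hash key, so a line is printed only the first time that combination of values appears.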
If you want to retain the last one of the duplicates, you could use

tac a.csv | sort -u -t, -r -k1,1 | tac

which was my requirement here. tac will reverse the file line by line.
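For example (sample data invented here), with a.csv containing:

a,1
b,2
a,3

tac a.csv | sort -u -t, -r -k1,1 | tac

gives:

a,3
b,2

Note this relies on GNU sort, where -u keeps the first line of each run of equal keys; reversing the file first therefore makes the last duplicate the one that survives.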