I need to get the unique URLs from a web log and then sort them. I was thinking of using the grep, uniq, and sort commands and writing the output to another file.
I executed this command:
cat access.log | awk '{print $7}' > url.txt
Then, to get only the unique ones and sort them:
cat url.txt | uniq | sort > urls.txt
The problem is that I can still see duplicates, even though the file is sorted, so the commands clearly ran. Why?
uniq | sort does not work: uniq only removes contiguous duplicates, so it needs already-sorted input to catch them all. In your pipeline, duplicate URLs that aren't adjacent in url.txt pass straight through uniq, and sorting afterwards just puts those surviving duplicates next to each other.
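You can see this on a tiny example; the duplicate survives uniq unless the input is sorted first:

printf 'a\nb\na\n' | uniq          # prints a, b, a — the second "a" is not adjacent, so it stays
printf 'a\nb\na\n' | sort | uniq   # prints a, b — sorting makes the duplicates adjacent first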
The correct order is sort | uniq, or better, sort -u, which deduplicates while sorting and spawns only one process instead of two.
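Applied to your case (assuming the URL really is the seventh whitespace-separated field, as in the common Apache/Nginx combined log format), the whole job collapses into one pipeline; awk can read the file itself, so cat and the intermediate url.txt aren't needed:

awk '{print $7}' access.log | sort -u > urls.txt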