I need to get the unique URLs from a web log and then sort them. I was thinking of using the grep, uniq, and sort commands and writing the output to another file.
I executed this command:
cat access.log | awk '{print $7}' > url.txt
Then, to get only the unique ones and sort them:
cat url.txt | uniq | sort > urls.txt
The problem is that I can still see duplicates, even though the file is sorted, so the commands clearly ran. Why?
uniq | sort does not work: uniq only removes contiguous duplicates, so it needs already-sorted input to catch them all. In your pipeline, duplicate URLs that aren't adjacent in url.txt pass straight through uniq, and sorting afterwards just puts those surviving duplicates next to each other.
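You can see this on a tiny example; the duplicate survives uniq unless the input is sorted first:

printf 'a\nb\na\n' | uniq          # prints a, b, a — the second "a" is not adjacent, so it stays
printf 'a\nb\na\n' | sort | uniq   # prints a, b — sorting makes the duplicates adjacent first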
The correct order is sort | uniq, or better, sort -u, which deduplicates while sorting and spawns only one process instead of two.
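Applied to your case (assuming the URL really is the seventh whitespace-separated field, as in the common Apache/Nginx combined log format), the whole job collapses into one pipeline; awk can read the file itself, so cat and the intermediate url.txt aren't needed:

awk '{print $7}' access.log | sort -u > urls.txt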