 

Sort unique URLs from a log

I need to get the unique URLs from a web log and then sort them. I was thinking of using the grep, uniq, and sort commands and writing the output to another file.

I executed this command:

cat access.log | awk '{print $7}' > url.txt

Then I keep only the unique ones and sort them:

cat url.txt | uniq | sort > urls.txt

The problem is that I can still see duplicates in urls.txt, even though the file is sorted, which suggests my commands ran. Why?

asked Nov 17 '11 by aki


1 Answer

uniq | sort does not work: uniq only removes contiguous duplicates, so duplicates that are not adjacent in the input survive, and sorting afterwards merely places those surviving duplicates next to each other.
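
To illustrate (a minimal sketch, using a hypothetical three-line input with a repeated, non-adjacent value):

printf '/a\n/b\n/a\n' | uniq          # prints /a, /b, /a -- the second /a stays because it is not adjacent to the first
printf '/a\n/b\n/a\n' | sort | uniq   # prints /a, /b -- sorting first groups the duplicates so uniq can drop them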

The correct way is sort | uniq, or better sort -u, which spawns only one process.
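
Applied to the question, a minimal sketch (assuming the URL is field 7 of access.log, as in the question's awk command):

awk '{print $7}' access.log | sort -u > urls.txt   # extract URLs, then sort and de-duplicate in one step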

answered Oct 04 '22 by mouviciel