
Linux Terminal: Finding number of lines longer than x

I come to you with a problem that has me stumped. I'm attempting to find the number of lines in a file (in this case, the html of a certain site) longer than x (which, in this case, is 80).

For example: google.com has 7 lines (checked with wc -l), two of which are longer than 80 (checked with awk '{print NF}'). I'm trying to find a way to count the lines that are longer than 80 and output that number.

My command so far looks like this: wget -qO - google.com | awk '{print NF}' | sort -g

I was thinking of just counting which lines have values larger than 80, but I can't figure out the syntax for that. Perhaps 'awk'? Maybe I'm going about this the clumsiest way possible and have hit a wall for a reason.

Thanks for the help!

Edit: The unit of measurement is characters. The command should find the number of lines with more than 80 characters in them.

Doestovsky asked Nov 19 '14 20:11


3 Answers

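If you want the number of lines that are longer than 80 characters (your question is missing the units), grep is a good candidate. A line is longer than 80 characters exactly when it contains at least 81 of them, so the repetition count has to be 81 (a count of 80 would also match lines of exactly 80 characters):

grep -c '.\{81\}'

So:

wget -qO - google.com | grep -c '.\{81\}'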

outputs 6.
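
The difference between a repetition count of 80 and one of 81 is easy to check on synthetic input instead of a live page. This is only a minimal sketch: sample is a hypothetical helper that prints three lines of 79, 80 and 81 characters, and the data is made up for illustration.

```shell
# sample: hypothetical helper that prints three lines of 79, 80 and 81 'x' characters.
sample() {
    for n in 79 80 81; do
        # printf pads an empty string to n spaces; tr turns the spaces into 'x'.
        printf '%*s\n' "$n" '' | tr ' ' 'x'
    done
}

sample | grep -c '.\{80\}'   # counts lines with 80 or more characters: prints 2
sample | grep -c '.\{81\}'   # counts lines with more than 80 characters: prints 1
```

On real pages the two counts only differ when some line is exactly 80 characters long.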

gniourf_gniourf answered Nov 17 '22 17:11


Blue Moon's answer (in its original version) will print the number of fields, not the length of the line. Since awk's default field separator splits each line on runs of whitespace, you will get a word count, not the length of the line.

Try this:

wget -qO - google.com | awk '{ if (length($0) > 80) count++ } END { print count + 0 }'
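
To see the difference between the field count and the line length, it may help to print both for a couple of made-up lines. The sketch below assumes nothing about the real page; sample is a hypothetical helper that emits one short line with five words and one 100-character line with a single word.

```shell
# sample: hypothetical helper with one short multi-word line and one long single-word line.
sample() {
    printf 'a b c d e\n'
    printf '%*s\n' 100 '' | tr ' ' 'x'
}

# NF is the number of whitespace-separated fields; length($0) is the line length.
sample | awk '{ print NF, length($0) }'
# prints:
# 5 9
# 1 100

# Only the second line is longer than 80 characters, so this prints 1.
sample | awk 'length($0) > 80 { count++ } END { print count + 0 }'
```

The first line has more fields but is far shorter, which is why a test on NF and a test on length($0) can give different counts.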

philbrooksjazz answered Nov 17 '22 17:11


Using awk:

wget -qO - google.com | awk 'NF>80{count++} END{print count}'

This outputs 2, since two lines have more than 80 fields.

If you mean number of characters (I presumed fields based on what you have in the question) then:

wget -qO - google.com | awk 'length($0)>80{c++} END{print c}'

which gives 6.
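
If the limit should not be hard-coded, the character-count version can be wrapped in a small function that takes the threshold as an argument. This is a sketch under assumptions: count_long_lines is a hypothetical name, and the default of 80 is chosen only to match the question.

```shell
# count_long_lines [MAX]: read stdin and print how many lines exceed MAX characters.
# count_long_lines is a hypothetical helper name; MAX defaults to 80.
count_long_lines() {
    # c + 0 makes awk print 0 instead of an empty line when nothing matches.
    awk -v max="${1:-80}" 'length($0) > max { c++ } END { print c + 0 }'
}

printf 'short\n' | count_long_lines 80        # prints 0
printf 'abcdef\nabc\n' | count_long_lines 5   # prints 1
```

With the page from the question, the call would be wget -qO - google.com | count_long_lines 80.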

P.P answered Nov 17 '22 17:11