I come to you with a problem that has me stumped. I'm attempting to find the number of lines in a file (in this case, the html of a certain site) longer than x (which, in this case, is 80).
For example: google.com has 7 lines (checked with wc -l), two of which are longer than 80 (checked with awk '{print NF}'). I'm trying to find a way to count how many lines are longer than 80 and then output that number.
My command so far looks like this:
wget -qO - google.com | awk '{print NF}' | sort -g
I was thinking of just counting which lines have values larger than 80, but I can't figure out the syntax for that. Perhaps 'awk'? Maybe I'm going about this the clumsiest way possible and have hit a wall for a reason.
Thanks for the help!
Edit: The unit of measurement is characters. The command should find the number of lines with more than 80 characters in them.
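If you want the number of lines that are longer than 80 characters (your question is missing the units), grep is a good candidate:
grep -c '.\{80\}'
So:
wget -qO - google.com | grep -c '.\{80\}'
outputs 6.
Strictly speaking, '.\{80\}' matches lines with 80 characters or more; to count only lines with more than 80 characters, use '.\{81\}' instead.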
Blue Moon's answer (in its original version) will print the number of fields, not the length of the line. Since the default field separator in awk is ' ' (a single space, which awk treats as any run of whitespace), you will get a word count, not the number of characters in the line.
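To make the difference concrete, here is a small sketch that uses a made-up two-line sample in place of the downloaded page (the sample text and the variable names are assumptions for illustration only):

```shell
# Hypothetical sample: line 1 has 3 words (13 characters),
# line 2 has 1 word (26 characters).
sample='one two three
abcdefghijklmnopqrstuvwxyz'

# NF is the number of whitespace-separated fields on each line.
fields=$(printf '%s\n' "$sample" | awk '{print NF}')

# length($0) is the number of characters on each line.
lengths=$(printf '%s\n' "$sample" | awk '{print length($0)}')

printf 'fields per line:\n%s\n' "$fields"
printf 'characters per line:\n%s\n' "$lengths"
```

The second line has the fewest fields but the most characters, which is why counting on NF and counting on length($0) can give different answers for the same page.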
Try this:
wget -qO - google.com | awk '{ if (length($0) > 80) count++ } END { print count+0 }'
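If the limit should not be hard-coded, the same idea can be wrapped in a small shell function. This is only a sketch: the name count_long_lines is hypothetical, and it assumes your awk's length() counts what you consider a character (some awk implementations count bytes rather than multibyte characters).

```shell
# count_long_lines LIMIT
# Reads stdin and prints how many lines have more than LIMIT characters.
# (Hypothetical helper name, not part of any standard tool.)
count_long_lines() {
    # -v passes the shell argument into awk as the variable "limit".
    # "n + 0" forces a numeric 0 when no line matched, instead of an empty line.
    awk -v limit="$1" 'length($0) > limit { n++ } END { print n + 0 }'
}

# Example with inline input instead of a download:
printf '%s\n' 'short' 'a somewhat longer line' | count_long_lines 10
```

With the page from the question it would be called as: wget -qO - google.com | count_long_lines 80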
Using awk:
wget -qO - google.com | awk 'NF>80{count++} END{print count}'
This gives 2 as output, as there are two lines with more than 80 fields.
If you mean the number of characters (I presumed fields based on what you have in the question), then:
wget -qO - google.com | awk 'length($0)>80{c++} END{print c}'
which gives 6.