I understand that grep -c string can be used to count the lines that contain a given string. What I would like to do is count the occurrences of each unique value when only part of the string is known or remains constant.
For example, if I had a file (in this case a log) with several lines containing a constant string and a varying value, like so:
string=value1
string=value1
string=value1
string=value2
string=value3
string=value2
Then I would like to get a count for each unique value, with output similar to the following (ideally from a single grep/awk command):
value1 = 3 occurrences
value2 = 2 occurrences
value3 = 1 occurrences
Does anyone have a solution using grep or awk that might work? Thanks in advance!
This worked perfectly... Thanks to everyone for your comments!
grep -oP "wwn=[^,]*" path/to/file | sort | uniq -c
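As a rough sketch of how that pipeline could be pushed toward the output format asked for in the question: the helper name count_matches is hypothetical, and the sketch assumes the values are comma-terminated and contain no spaces. It uses plain grep -o rather than -P, since this pattern does not need Perl-style regex support.

```shell
# Hypothetical helper: count_matches KEY FILE
# Assumes each value follows "KEY=" and ends at the next comma or end of line,
# and that values contain no whitespace.
count_matches() {
    grep -o "$1=[^,]*" "$2" |   # print only the KEY=value part of each match
        sort |                  # group identical matches together for uniq
        uniq -c |               # prefix each distinct match with its count
        awk '{ sub(/^[^=]*=/, "", $2); printf "%s = %d occurrences\n", $2, $1 }'
}
```

Calling count_matches wwn path/to/file would then print one "value = N occurrences" line per distinct value.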
In general, if you want to grep and also keep track of results, it is best to use awk, since it handles such things clearly and with very simple syntax.
So for your given file I would use:
$ awk -F= '/string=/ {count[$2]++} END {for (i in count) print i, count[i]}' file
value1 3
value2 2
value3 1
What is this doing?
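-F= sets the field separator to =, so that we can work with the parts to the left and right of it.
/string=/ {count[$2]++} on lines matching string=, uses the array count[] to keep track of how many times the second field has appeared so far.
END {for (i in count) print i, count[i]} after the last line, loops over the array and prints each value with its count.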
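The same idea can be sketched as a small shell function that prints the exact format from the question. The name count_values is hypothetical, and the sketch assumes each relevant line is exactly key=value. Because the iteration order of for (v in count) is unspecified in awk, the output is piped through sort to make it stable.

```shell
# Hypothetical helper: count_values KEY FILE
# Assumes lines of interest look exactly like KEY=value.
count_values() {
    awk -F= -v key="$1" '
        $1 == key { count[$2]++ }    # tally each value seen for this key
        END {
            for (v in count)
                printf "%s = %d occurrences\n", v, count[v]
        }
    ' "$2" | sort                    # for-in order is unspecified, so sort
}
```

With the sample lines from the question saved to a file, count_values string file reports value1 three times, value2 twice and value3 once.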
Here's an awk script:
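#!/usr/bin/awk -f
# Usage: awk -f script.awk FILE PATTERN...
# For each PATTERN, counts the distinct lines of FILE that match it.
BEGIN {
    file = ARGV[1]
    while ((getline line < file) > 0) {
        for (i = 2; i < ARGC; ++i) {
            p = ARGV[i]
            if (line ~ p) {
                # Add 1 only the first time this line is seen for this pattern.
                a[p] += !a[p, line]++
            }
        }
    }
    for (i = 2; i < ARGC; ++i) {
        p = ARGV[i]
        printf("%s = %d occurrences\n", p, a[p])
    }
    exit    # everything is done in BEGIN, so skip normal input processing
}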
Example:
awk -f script.awk somefile ab sh
Output:
ab = 7 occurrences
sh = 2 occurrences
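Note that the script counts distinct matching lines, not every match: a line that appears twice is counted once per pattern. A minimal single-pattern sketch of that idea, using the common !seen[...]++ idiom, might look like the following. The function name count_distinct is hypothetical, and passing the pattern with -v means awk interprets backslash escapes in it.

```shell
# Hypothetical helper: count_distinct PATTERN FILE
# Counts the distinct lines of FILE that match the regex PATTERN.
count_distinct() {
    awk -v p="$1" '
        $0 ~ p && !seen[$0]++ { n++ }    # count a matching line only once
        END { printf "%s = %d occurrences\n", p, n }
    ' "$2"
}
```

For a file containing the lines ab, ab, abc and sh, count_distinct ab file would report 2, since the repeated ab line is only counted the first time.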