
Count occurrences of character per line/field on Unix

Given a file with data like this (e.g. a stores.dat file):

sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200

What is the command that would return the number of occurrences of the 't' character per line?

E.g., it would return:

count   lineNum
4       1
3       2
6       3

Also, to count occurrences within a single field, what command would return the following results?

E.g., for column 2 and character 't':

count   lineNum
1       1
0       2
1       3

E.g., for column 3 and character 't':

count   lineNum
2       1
1       2
4       3
toop asked Dec 25 '11 11:12



2 Answers

To count occurrence of a character per line you can do:

awk -F'|' 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"") "\t" NR}' file

count lineNum
4       1
3       2
6       3
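One side effect worth knowing: gsub(/t/,"") deletes every 't' from the record while it counts. If the original line also has to be printed, one possible variant (a sketch, not part of the answer itself) replaces each match with itself via "&", so the count stays the same and the line is left intact. The names sample_data and count_keep_line are made up for illustration.

```shell
# Hypothetical helper that reproduces the three sample lines from the question.
sample_data() {
  printf '%s\n' \
    'sid|storeNo|latitude|longitude' \
    '2tt|1|-28.0372000t0|153.42921670' \
    '9|2t|-33tt.85t09t0000|15t1.03274200'
}

# Hypothetical helper: print count, line number, and the unmodified line.
# In the replacement string, "&" stands for the matched text, so each t
# is replaced by itself and gsub() still returns the number of matches.
count_keep_line() {
  awk '{ n = gsub(/t/, "&"); print n "\t" NR "\t" $0 }'
}

sample_data | count_keep_line
```

Each output row is the count, the line number, and the untouched input line, separated by tabs.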

To count occurrence of a character per field/column you can do:

column 2:

awk -F'|' -v fld=2 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file

count lineNum
1       1
0       2
1       3

column 3:

awk -F'|' -v fld=3 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file

count lineNum
2       1
1       2
4       3
  • gsub() returns the number of substitutions it made, so we print its return value as the count.
  • NR holds the current line number, so we print it alongside the count.
  • To count occurrences within a particular field, we pass the field number in through the variable fld and run gsub() on $fld.
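Since the question asks for both the column and the character as inputs, the commands above could be folded into one helper. The following is only a sketch under a few assumptions: count_char and sample_data are hypothetical names, field 0 is used to mean "the whole line" (because $0 is the full record in awk), and a substr() loop is swapped in for gsub() so that characters such as '|' or '.' are compared literally rather than treated as regular expressions.

```shell
# Hypothetical helper that reproduces the three sample lines from the question.
sample_data() {
  printf '%s\n' \
    'sid|storeNo|latitude|longitude' \
    '2tt|1|-28.0372000t0|153.42921670' \
    '9|2t|-33tt.85t09t0000|15t1.03274200'
}

# Hypothetical helper: count_char CHAR FIELD [FILE...]
# Reads standard input when no file is given.
count_char() {
  ch=$1
  fld=$2
  shift 2
  awk -F'|' -v ch="$ch" -v fld="$fld" '
    BEGIN { print "count\tlineNum" }
    {
      s = $fld    # with fld=0 this is the whole record
      n = 0
      # Compare one character at a time, so ch is never used as a regex.
      for (i = 1; i <= length(s); i++)
        if (substr(s, i, 1) == ch) n++
      print n "\t" NR
    }' "$@"
}

sample_data | count_char t 3    # occurrences of t in column 3
sample_data | count_char t 0    # occurrences of t in the whole line
```

With a real file the call would look like `count_char t 2 stores.dat`. The loop is slower than gsub() on very long lines, but it avoids having to escape the search character.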
jaypal singh answered Oct 08 '22 02:10


grep -n -o "t" stores.dat | sort -n | uniq -c | cut -d : -f 1 

gives almost exactly the output you want:

  4 1
  3 2
  6 3

Thanks to @raghav-bhushan for the grep -o hint; it is a useful flag that prints each match on its own line. The -n flag prefixes each match with its line number.
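To make the "almost" concrete: grep -n -o emits one "lineNum:t" line per match, uniq -c collapses the repeats into a count, and cut drops the ":t" suffix. Two differences from the requested output remain: lines with no match never appear (there is no 0 row), and the padding that uniq -c puts before the count varies between implementations. A possible wrapper, assuming a hypothetical helper name count_with_grep, adds -F so the character is matched literally and uses awk to normalise the spacing:

```shell
# Hypothetical helper that reproduces the three sample lines from the question.
sample_data() {
  printf '%s\n' \
    'sid|storeNo|latitude|longitude' \
    '2tt|1|-28.0372000t0|153.42921670' \
    '9|2t|-33tt.85t09t0000|15t1.03274200'
}

# Hypothetical helper: count_with_grep CHAR, reading data on standard input.
# Same pipeline as above, plus a final awk step that rewrites each row
# as "count<TAB>lineNum" regardless of how uniq pads its output.
count_with_grep() {
  grep -n -o -F -e "$1" |
    sort -n |
    uniq -c |
    cut -d : -f 1 |
    awk '{ print $1 "\t" $2 }'
}

sample_data | count_with_grep t
```

If rows for lines with zero matches are required, the awk approach in the other answer is the simpler route, since it visits every line.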

Gabriel Burt answered Oct 08 '22 04:10