I am a beginner with awk. I have created a file that contains employee information. The employees are in different departments, and I want to count how many employees are in each department, like this:
marketing        3
sales            3
production       4
For that I used the following command:
awk 'NR>1 {dept=$5} {count[dept]++} END {for (dept in count) {print dept count[dept]}}' emp
But the above code also counts and displays the first line, i.e. the header, like this:
marketing 3
sales 3
department 1
production 4
Here "department" is the column header, which is also counted even though I used NR>1. Also, how can I add spaces or increase the width of the columns? The output looks like the above, but I want to display it in properly aligned columns. Is there a solution for this?
Here is my input file:
empid       empname     department
101         ayush    sales
102         nidhi    marketing
103         priyanka    production  
104         shyam    sales
105         ami    marketing
106         priti    marketing
107         atuul    sales
108         richa    production
109         laxman    production
110         ram     production
We can remove blank lines using awk:  $ awk NF < myfile
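As a minimal sketch of why `awk NF` drops blank lines: NF is the number of fields on the current line, which is 0 for an empty or whitespace-only line, so the pattern is false and the default print action is skipped. The helper name `strip_blank` and the sample text are made up for illustration.

```shell
# Hypothetical helper: copy stdin to stdout, dropping blank lines.
# NF is 0 on empty or whitespace-only lines, so the pattern is false
# there and awk's default action (print the line) does not run.
strip_blank() {
  awk 'NF'
}

# Illustrative input: two words separated by an empty line and a
# whitespace-only line.
printf 'alpha\n\n   \nbeta\n' | strip_blank
```

This prints only the `alpha` and `beta` lines.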
Using the sed command: removing the first line from an input file with sed is pretty straightforward. In a command such as sed '1d' file, the parameter '1d' tells sed to apply the 'd' (delete) action to line number 1.
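A small sketch of the sed approach, applied to a few rows shaped like the employee file in the question. The helper name `skip_header` is hypothetical; only `sed '1d'` itself comes from the text above.

```shell
# Hypothetical helper: copy stdin to stdout without its first line.
# '1' is the line address and 'd' is the delete command.
skip_header() {
  sed '1d'
}

# Illustrative input: a header row followed by two data rows.
printf 'empid empname department\n101 ayush sales\n102 nidhi marketing\n' | skip_header
```

Only the two data rows are printed. The result can then be piped into awk without needing an NR>1 guard.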
The firstrow option is used to skip the first row of the CSV file, which in this case is the header.
Using awk to print the first column: the first column of any file can be printed with the $1 variable in awk. If the value in the first column contains multiple words, only the first word is printed, because awk splits fields on whitespace by default. Specifying the file's actual delimiter with -F makes the whole first column print properly.
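To illustrate the delimiter point, here is a sketch on made-up comma-separated data (the names and the `sample` variable are purely illustrative). With default whitespace splitting, $1 is only the first word; with -F',' it is everything before the first comma.

```shell
# Illustrative comma-separated data; the names are invented.
sample='John Smith,sales
Ami Patel,marketing'

# Default field splitting is on whitespace: $1 is just the first word.
printf '%s\n' "$sample" | awk '{print $1}'

# With -F',' the comma is the field separator, so $1 is the full name.
printf '%s\n' "$sample" | awk -F',' '{print $1}'
```

The first command prints `John` and `Ami`; the second prints `John Smith` and `Ami Patel`.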
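The NR>1 condition in your command guards only the first action, {dept=$5}; the second action, {count[dept]++}, has no pattern, so it runs on every line, including the header. Put the counting inside the guarded action, and use awk's printf for aligned, fixed-width columns: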
awk 'NR>1 {count[$3]++} END {for (dept in count) {printf "%-15s%-15s\n", dept, count[dept]}}' file
You can use printf with a width option, for example printf "%3s".
The 3 means the output will be padded to at least 3 characters. From man awk, you can see more details:
width   The field should be padded to this width. The field is normally padded
        with spaces. If the 0  flag  has  been  used, it is padded with zeroes.
.prec   A number that specifies the precision to use when printing.  For the %e,
        %E, %f and %F, formats, this specifies the number of digits you want
        printed to the right of the decimal point. For the %g, and %G formats,
        it specifies the maximum number of significant  digits. For the %d, %o,
        %i, %u, %x, and %X formats, it specifies the minimum number of digits to
        print. For %s, it specifies the maximum number of characters from the
        string that should be printed.
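A few quick experiments with these modifiers may help. This is only a sketch: the brackets are there to make the padding visible, and the function name `demo_widths` is made up. The `-` flag (left-justify) is not in the excerpt above, but it is what the `%-15s` in the answer relies on.

```shell
# Hypothetical demo of printf width and precision in awk.
# Square brackets make the padding visible.
demo_widths() {
  awk 'BEGIN {
    printf "[%5s]\n",  "ab"          # width 5, right-justified
    printf "[%-5s]\n", "ab"          # "-" flag: left-justified
    printf "[%05d]\n", 42            # "0" flag: padded with zeroes
    printf "[%.2f]\n", 3.14159       # 2 digits after the decimal point
    printf "[%.3s]\n", "production"  # at most 3 characters of the string
  }'
}

demo_widths
```

In the default C locale this prints `[   ab]`, `[ab   ]`, `[00042]`, `[3.14]` and `[pro]`, one per line.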
You can adjust the padding width as needed. For the input file you specified:
$ awk 'NR>1 {count[$3]++} END {for (dept in count) {printf "%-15s%-15s\n", dept, count[dept]}}' file
production     4
marketing      3
sales          3
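Note that awk does not guarantee any particular order for `for (dept in count)`, so the rows may come out in a different order on another machine. If a stable order matters, one option is to pipe the result through sort. The sketch below assumes the department is in the third column, as in the input file from the question; the function name `count_by_dept` and the temporary file are only for illustration.

```shell
# Sample input copied from the question, written to a temporary file.
emp=$(mktemp)
cat > "$emp" <<'EOF'
empid       empname     department
101         ayush    sales
102         nidhi    marketing
103         priyanka    production
104         shyam    sales
105         ami    marketing
106         priti    marketing
107         atuul    sales
108         richa    production
109         laxman    production
110         ram     production
EOF

# Hypothetical helper: count rows per department (3rd column), skipping
# the header line, and sort the result alphabetically for a stable order.
count_by_dept() {
  awk 'NR > 1 { count[$3]++ }
       END {
         for (dept in count)
           printf "%-15s%d\n", dept, count[dept]
       }' "$1" | sort
}

count_by_dept "$emp"
rm -f "$emp"
```

With the sample file this lists marketing (3), production (4) and sales (3), in that order, with the counts aligned in the same column.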