I am a beginner with awk. I have created a file that contains employee information. The employees are in different departments, and I want to count how many employees are in each department, like this:
marketing 3
sales 3
production 4
For that I used the following command:
awk 'NR>1 {dept=$5} {count[dept]++} END {for (dept in count) {print dept count[dept]}}' emp
But the above code also counts and displays the first line, i.e. the header, like this:
marketing 3
sales 3
department 1
production 4
where department is the column header, which is also counted even though I used NR>1. Also, how can I add spaces or increase the width of the columns? The output currently looks like the above, but I want to display it properly. Is there any solution for this?
Here is my input file:
empid empname department
101 ayush sales
102 nidhi marketing
103 priyanka production
104 shyam sales
105 ami marketing
106 priti marketing
107 atuul sales
108 richa production
109 laxman production
110 ram production
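The NR>1 pattern applies only to the first action block; the second block, {count[dept]++}, has no pattern, so it runs on every line, including the header. Put the increment under NR>1 and use awk's printf for fixed-width columns: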
awk 'NR>1 {count[$3]++} END {for (dept in count) {printf "%-15s%-15s\n", dept, count[dept]}}' file
You can use printf with a width option, as in the example below. In printf "%3s", the 3 means the output will be padded to 3 characters. From man awk, you can see more details:
width The field should be padded to this width. The field is normally padded
with spaces. If the 0 flag has been used, it is padded with zeroes.
.prec A number that specifies the precision to use when printing. For the %e,
%E, %f and %F, formats, this specifies the number of digits you want
printed to the right of the decimal point. For the %g, and %G formats,
it specifies the maximum number of significant digits. For the %d, %o,
%i, %u, %x, and %X formats, it specifies the minimum number of digits to
print. For %s, it specifies the maximum number of characters from the
string that should be printed.
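To make the width and precision rules quoted above concrete, here is a small sketch run from a shell. The sample values and the variable name demo are only illustrative assumptions; the square brackets are there to make the padding visible.

```shell
# Demonstrate width and precision in awk's printf; brackets mark the field edges.
demo=$(awk 'BEGIN {
    printf "[%5s]\n",  "ab"         # right-justified, padded to width 5
    printf "[%-5s]\n", "ab"         # left-justified with the - flag
    printf "[%05d]\n", 42           # zero-padded with the 0 flag
    printf "[%.3s]\n", "marketing"  # precision caps the string at 3 characters
    printf "[%.2f]\n", 3.14159      # two digits after the decimal point
}')
printf '%s\n' "$demo"
# Prints:
# [   ab]
# [ab   ]
# [00042]
# [mar]
# [3.14]
```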
Adjust the padding width as needed. For the input file you specified:
$ awk 'NR>1 {count[$3]++} END {for (dept in count) {printf "%-15s%-15s\n", dept, count[dept]}}' file
production     4
marketing      3
sales          3
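For anyone who wants to reproduce this end to end, the following is a minimal sketch, assuming the three-column, whitespace-separated input shown in the question. The temporary file and the variable names emp_file and result are hypothetical, and the pipe through sort is an addition of mine (not part of the answer above) because awk does not guarantee the order of a for (dept in count) loop.

```shell
# Recreate the sample input from the question in a temporary file.
emp_file=$(mktemp)
cat > "$emp_file" <<'EOF'
empid empname department
101 ayush sales
102 nidhi marketing
103 priyanka production
104 shyam sales
105 ami marketing
106 priti marketing
107 atuul sales
108 richa production
109 laxman production
110 ram production
EOF

# Skip the header (NR>1), count by the third field, print the department
# left-justified in a 15-character column, then sort for a stable order.
result=$(awk 'NR>1 {count[$3]++}
     END {for (dept in count) printf "%-15s%d\n", dept, count[dept]}' "$emp_file" | sort)
printf '%s\n' "$result"
rm -f "$emp_file"
# Prints:
# marketing      3
# production     4
# sales          3
```

If the department is in a different column in your real file (the question's command used $5), change $3 accordingly.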