I am writing a function in which I am replacing the leading/trailing space from the column and if there is no value in the column replace it with null. Function is working fine for one column but how can i modify it for multiple columns.
Function :
#cat trimfunction
#!/bin/bash
function trim
{
vCol=$1 ###input column name
vFile=$2 ###input file name
var3=/home/vipin/temp ###temp file
awk -v col="${vCol}" -f /home/vipin/colf.awk ${vFile} > $var3 ###operation
mv -f $var3 $vFile ###Forcefully mv
}
AWK script :
#cat colf.awk
#!/bin/awk -f
BEGIN{FS=OFS="|"}{
gsub(/^[ \t]+|[ \t]+$/, "", $col) ###replace space from 2nd column
}
{if ($col=="") {print $1,"NULL",$3} else print $0} ###replace whitespace with NULL
Input file : leading/trailing/white space in 2nd column
#cat filename.txt
1| 2016-01|00000321|12
2|2016-02 |000000432|13
3|2017-03 |000004312|54
4| |000005|32
5|2017-05|00000543|12
Script :
#cat script.sh
. /home/vipin/trimfunction
trim 2 filename.txt
Output file : leading/trailing/white space removed in 2nd column
#./script.sh
#cat filename.txt
1|2016-01|00000321|12
2|2016-02|000000432|13
3|2017-03|000004312|54
4|NULL|000005
5|2017-05|00000543|12
If input file is like below - ( white/leading/trailing space in 2nd and 5th column of file)
1|2016-01|00000321|12|2016-01 |00000
2|2016-02 |000000432|13| 2016-01|00000
3| 2017-03|000004312|54| |00000
4| |000005|2016-02|0000
5|2017-05 |00000543|12|2016-02 |0000
How to achive below output - (All leading/trailing space trimmed and white space replaced with NULL in 2nd and 5th col) something like trim 2 5 filename.txt trim 2 5 filename.txt ###passing two column name as input
1|2016-01|00000321|12|2016-01|00000
2|2016-02|000000432|13|2016-01|00000
3|2017-03|000004312|54|NULL|00000
4|NULL|000005|2016-02|0000
5|2017-05|00000543|12|2016-02|0000
This will do what you said you wanted:
$ cat tst.sh
file="${!#}"
cols=( "$@" )
unset cols[$(( $# - 1 ))]
awk -v cols="${cols[*]}" '
BEGIN {
split(cols,c)
FS=OFS="|"
}
{
for (i in c) {
gsub(/^[[:space:]]+|[[:space:]]+$/,"",$(c[i]))
sub(/^$/,"NULL",$(c[i]))
}
print
}' "$file"
$ ./tst.sh 2 5 file
1|2016-01|00000321|12|2016-01|00000
2|2016-02|000000432|13|2016-01|00000
3|2017-03|000004312|54|NULL|00000
4|NULL|000005|2016-02|0000
5|2017-05|00000543|12|2016-02|0000
but if what you REALLY wanted was to operate on ALL fields instead of specific ones then of course there's a simpler solution.
Never do cmd file > tmp; mv tmp file
by the way, always do cmd file > tmp && mv tmp file
instead (note the &&
) so you only overwrite your original file if the command succeeded. Also - always quote your shell variables unless you have a very specific purpose in mind by not doing so and fully understand all of the implications, so use "$file"
, not $file
. Google it.
You can pass a list of columns to modify as a parameter. Create files
$ cat trim.awk
BEGIN {
split(c, a)
FS = OFS = "|"
}
{
for (i in a) {
i = a[i]
gsub(/^[ \t]+|[ \t]+$/, "", $i)
if (!length($i)) $i = "NULL"
}
print
}
and
$ cat filename.txt
1|2016-01|00000321|12|2016-01 |00000
2|2016-02 |000000432|13| 2016-01|00000
3| 2017-03|000004312|54| |00000
4| |000005|2016-02|0000
5|2017-05 |00000543|12|2016-02 |0000
Usage:
awk -v c="2 5" -f trim.awk filename.txt
If managing leading/trailing spaces is all you want to do, you probably don't want to do all(AWK code) that.
cat q1.txt | tr -s ' ' | sed 's/|\ |/|NULL|/g' | sed 's/\ //g'
should do.
Break-down tr -s ' '
: Squeeze multiple spaces into onesed 's/|\ |/|NULL|/g'
: Replace all "| |" with "|NULL|"sed 's/\ //g'
: Replace all spaces with empty string.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With