I am formatting an input file; here is a sample of it:
The First 1,000,000 Primes
2 3 5 7 11 13 17 19
23 29 31 37 41 43 47 53
And here is the same file as shown by the xxd command:
00000000: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000010: 2054 6865 2046 6972 7374 2031 2c30 3030   The First 1,000
00000020: 2c30 3030 2050 7269 6d65 7320 2866 726f  ,000 Primes (fro
00000030: 6d20 7072 696d 6573 2e75 746d 2e65 6475  m primes.utm.edu
00000040: 290d 0d0a 0d0d 0a20 2020 2020 2020 2020  )......         
00000050: 3220 2020 2020 2020 2020 3320 2020 2020  2         3     
00000060: 2020 2020 3520 2020 2020 2020 2020 3720      5         7 
00000070: 2020 2020 2020 2031 3120 2020 2020 2020         11       
00000080: 2031 3320 2020 2020 2020 2031 3720 2020   13        17   
00000090: 2020 2020 2031 3920 0d0d 0a20 2020 2020       19 ...     
I am using the command:
awk -v OFS="\n" '{$1=$1}1' inputfile
But I get empty lines that I would like to get rid of:
The
First
1,000,000
Primes
2
3
5
I can get rid of them with a pipe:
awk -v OFS="\n" '{$1=$1}1' inputfile | awk '!/\r/'
Is there a way to achieve the same by calling Awk only once?
Assumptions/Understandings: the 1st line is to be skipped and all \r characters are to be stripped.
Reconstituting OP's file from the xxd output:
$ xxd -r <<< '00000000: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000010: 2054 6865 2046 6972 7374 2031 2c30 3030   The First 1,000
00000020: 2c30 3030 2050 7269 6d65 7320 2866 726f  ,000 Primes (fro
00000030: 6d20 7072 696d 6573 2e75 746d 2e65 6475  m primes.utm.edu
00000040: 290d 0d0a 0d0d 0a20 2020 2020 2020 2020  )......         
00000050: 3220 2020 2020 2020 2020 3320 2020 2020  2         3     
00000060: 2020 2020 3520 2020 2020 2020 2020 3720      5         7 
00000070: 2020 2020 2020 2031 3120 2020 2020 2020         11       
00000080: 2031 3320 2020 2020 2020 2031 3720 2020   13        17   
00000090: 2020 2020 2031 3920 0d0d 0a20 2020 2020       19 ...     ' > inputfile
Tweaking OP's current awk script:
awk -v OFS="\n" '
NR==1 { next }              # skip 1st line
      { gsub(/\r/,"")       # strip all "\r" characters from line
        $1=$1               # rebuild the line with OFS ("\n") between fields
      }
length()                    # if resulting line length > 0 then print to stdout
' inputfile
#### as a one-liner:
awk -v OFS="\n" 'NR==1{next}{gsub(/\r/,""); $1=$1} length()' inputfile
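To try this script without the full file, here is a minimal sketch. It assumes a made-up three-line sample built with printf (the sample text and the `result` variable are illustrative only, not taken from OP's file), with every line ending in `\r\r\n` as in the xxd dump:

```shell
# Hypothetical mini-sample: a header line, a "\r\r"-only line and one data line,
# each terminated by "\r\r\n" like the records in the xxd dump.
result=$(printf '  The First Primes\r\r\n\r\r\n   2   3   5 \r\r\n' |
  awk -v OFS="\n" 'NR==1{next}{gsub(/\r/,""); $1=$1} length()')

# One number per line; the header and the "\r\r"-only line are gone.
printf '%s\n' "$result"
```

The `\r\r`-only line becomes an empty record after `gsub()`, so `length()` returns 0 and nothing is printed for it.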
An alternative awk idea:
awk 'NR>1 { for (i=1;i<=NF;i++)    # skip 1st line and loop through all fields
                if ($i+0==$i)       # if the field is a number (prime in this case) then ...
                   print $i         # print on its own line
          }
' inputfile
#### as a one-liner:
awk 'NR>1 {for (i=1;i<=NF;i++) if ($i+0==$i) print $i}' inputfile
Both generate:
2
3
5
7
11
13
17
19
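The `$i+0==$i` test from the second script can be checked in isolation. A small sketch, assuming a made-up input line (the text and the `nums` variable are only for illustration): a field passes only when the whole field looks like a number, so `1,000,000` is rejected along with the words.

```shell
# Hypothetical one-line input mixing words, a comma-grouped number and plain integers.
nums=$(printf 'The 1,000,000 Primes 7 abc 13\n' |
  awk '{ for (i=1; i<=NF; i++) if ($i+0 == $i) print $i }')

# Only the plain integers survive the numeric test.
printf '%s\n' "$nums"
```

For a non-numeric field the comparison falls back to a string comparison (for example "0" against "The"), which fails.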
Without expected output it's a guess but is this what you're trying to do?
$ awk -v OFS='\n' '/^[[:space:]0-9]+$/{$1=$1; print}' file
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
or maybe this?
$ awk '/^[[:space:]0-9]+$/{$1=$1; printf "%s%s", sep, $0; sep=OFS} END{print ""}' file
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53
They'll both behave the same way with any POSIX awk.
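If the input still carries the `\r` characters seen in the xxd dump, one possible variation strips them first and joins only the all-digit fields. This is a sketch under that assumption, using a made-up sample and a hypothetical `joined` variable, not a replacement for the commands above:

```shell
# Hypothetical sample: header, a "\r\r"-only line and two data lines, all ending in "\r\r\n".
joined=$(printf '  The First Primes\r\r\n\r\r\n 2 3 \r\r\n 5 7 \r\r\n' |
  awk '{ gsub(/\r/, "")                     # drop every "\r" before looking at fields
         for (i=1; i<=NF; i++)
           if ($i ~ /^[0-9]+$/) {           # keep fields made only of digits
             printf "%s%s", sep, $i         # sep is empty for the first number
             sep = OFS                      # afterwards separate with OFS (a space by default)
           }
       }
       END { print "" }')                   # finish the single output line

printf '%s\n' "$joined"
```

Setting `sep` only after the first number has been printed means the output gets no leading separator.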