I have a file:
To jest długi string z wieloma polskimi literami ąółżęś kodowany w UTF8,
żeby
było śmieszniej, haha.
ą
a
Example gawk:
gawk '{printf "%-80s %-s\n", $0, length}' file
In gawk, I get the correct result:
To jest długi string z wieloma polskimi literami ąółżęś kodowany w UTF8, 73
żeby 5
było śmieszniej, haha. 22
ą 1
a 1
Example mawk:
mawk '{printf "%-80s %-s\n", $0, length}' file
In mawk, I get the incorrect result:
To jest długi string z wieloma polskimi literami ąółżęś kodowany w UTF8, 80
żeby 6
było śmieszniej, haha. 24
ą 2
a 1
How can I make mawk give the same result as gawk?
mawk is a minimal-featured awk designed for speed of execution over functionality. You should not expect it to behave exactly the same as gawk or a POSIX awk. If you're going to use mawk, you need to get a mawk manual describing how IT behaves; don't rely on any other documentation describing how other awks behave.
IMHO there is no correct result for the formatting string %-s, as it is meaningless to align a string without specifying a width within which to align it. There are also different interpretations of what length means on its own: it could be shorthand for length($0), or it could be something else in a non-POSIX awk. There might not even be a length function in some non-POSIX awk, in which case it might be taken as an undefined variable name. How does any given awk handle non-English characters?
As I said - if you're going to use a non-POSIX awk, you need to check the manual for THAT awk for all of the gory details...
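The mawk numbers in the question look like byte counts rather than character counts (ą is two bytes in UTF-8, and mawk reports 2 for it). If that is what is happening, one possible workaround is to count characters yourself by ignoring UTF-8 continuation bytes. The following is only a sketch under stated assumptions: the input is valid UTF-8, the awk on your system treats strings as bytes when run with LC_ALL=C (mawk does this anyway, gawk does it in the C locale), and it accepts octal escapes inside a bracket expression. The function name utf8_len is made up for this example. Check your own awk's manual before relying on it.

```shell
# utf8_len: hypothetical helper, not part of any awk distribution.
# For each input line, print "<characters> <bytes>".
# Assumes valid UTF-8 input and a byte-oriented awk under LC_ALL=C.
utf8_len() {
    LC_ALL=C awk '{
        bytes = length($0)              # byte count in the C locale
        stripped = $0
        # UTF-8 continuation bytes have the bit pattern 10xxxxxx
        # (octal 200-277). Removing them leaves one byte per character.
        gsub(/[\200-\277]/, "", stripped)
        print length(stripped), bytes
    }' "$@"
}
```

Call it as utf8_len file, or pipe text into it. The first number on each output line should then match what gawk reports for length in a UTF-8 locale, and the second should match what mawk reports. Note that %-80s in a byte-oriented awk also pads by bytes, so column alignment would need the same kind of adjustment.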