Padded printf format strings not adding enough padding with multi-byte characters

Tags: bash, shell, printf

I often use printf in shell scripts to produce nicely aligned output.

The problem is that every time the printed string contains an accented character (é, è, à), the following column is shifted one position to the left for each accented character.

Example :

printf "%-10s %s\n" "toto" "test"
printf "%-10s %s\n" "titi" "test"
printf "%-10s %s\n" "tété" "test"
printf "%-10s %s\n" "toto" "test"

Expected :

toto       test
titi       test
tété       test
toto       test

Got :

toto       test
titi       test
tété     test
toto       test

Does someone have an explanation for this, and what can I do to make printf do it right with special characters?

Thank you for your help

user14706816 asked Nov 25 '20 14:11


3 Answers

Does someone have an explanation on this

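é is a character encoded as two bytes in UTF-8, and bash's printf counts the field width in bytes rather than characters, so each é uses up two of the ten positions instead of one.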
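To see the mismatch directly, here is a minimal sketch that compares the byte count and the character count of the same string. It assumes the script is saved as UTF-8; the character count additionally assumes the shell runs in a UTF-8 locale (in a C/POSIX locale bash would report bytes for both).

```bash
#!/usr/bin/env bash
s="tété"

# Bytes actually stored: wc -c counts bytes regardless of locale.
bytes=$(printf '%s' "$s" | wc -c)
bytes=$((bytes))   # normalize: some wc implementations pad the number with spaces

# Characters as bash sees them: this depends on the current locale.
chars=${#s}

echo "bytes=$bytes chars=$chars"
```

The difference between the two numbers is exactly how far the next column ends up shifted.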

what can I do to make printf doing it right with special characters?

Design your own padding method that takes UTF-8 multi-byte characters into account. Ideally, I believe a tool like wprintf, or a %Ls format specifier that calls wcwidth() to determine character width, or something similar, would be welcome and useful.

At least in my bash, the string length calculation (${#s}) takes UTF-8 characters into account, so you can insert the padding yourself:

printf "%-10s %s\n" "titi" "test"
s="tété"
# $(echo -n "$s" | wc -c) is 6 (bytes), but ${#s} is 4 (characters)
# Print the string, then an empty string padded to the remaining width
printf "%s%*s %s\n" "$s" "$((10-${#s}))" "" "test"
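A variation on the same idea, sketched under assumptions: instead of printing the padding separately, widen the field by the number of extra bytes so that a plain %-*s lines up again. The helper name pad_left is hypothetical, and the approach relies on bash switching ${#str} to a byte count once LC_ALL=C is set locally.

```bash
#!/usr/bin/env bash

# pad_left WIDTH STRING (hypothetical helper):
# print STRING left-aligned in a field of WIDTH characters.
pad_left() {
  local width=$1 str=$2
  local chars=${#str}   # characters, as counted in the caller's locale
  local LC_ALL=C        # from here on, lengths and printf widths are in bytes
  local bytes=${#str}
  # Widen the field by the extra bytes so the byte-based padding comes out right.
  printf '%-*s' "$((width + bytes - chars))" "$str"
}

printf '%s %s\n' "$(pad_left 10 "toto")" "test"
printf '%s %s\n' "$(pad_left 10 "tété")" "test"
```

Like ${#s} itself, this counts characters, not display columns, so double-width characters (for example CJK) or combining marks would still be misaligned; that is the case wcwidth() is meant to handle.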
KamilCuk answered Oct 18 '22 20:10


Adapted from my answer at https://unix.stackexchange.com/a/592479/310674

#!/usr/bin/env bash

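# Print $2 left-aligned in a field of $1 characters, truncating it if it is longer.
# The padding is clamped at 0: a negative width would make %*s pad again.
align_left(){ local pad=$(($1-${#2})); printf '%s%*s' "${2:0:$1}" "$((pad>0?pad:0))" '';}
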
printf '%s %s\n' \
  "$(align_left 10 "toto")" "test" \
  "$(align_left 10 "titi")" "test" \
  "$(align_left 10 "tété")" "test" \
  "$(align_left 10 "têtu")" "test"

Output:

toto       test
titi       test
tété       test
têtu       test
Léa Gris answered Oct 18 '22 20:10


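You can also use another tool to print your report this way. The following example uses awk (the result depends on the awk implementation: GNU awk in a UTF-8 locale counts characters rather than bytes, while other implementations may not):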

echo "toto" | awk '{printf "%-10s test\n", $1}'
echo "tété" | awk '{printf "%-10s test\n", $1}'
echo "titi" | awk '{printf "%-10s test\n", $1}'

EDIT:

The following statement was partially wrong: printf might not be part of bash, but of coreutils. Coreutils has a long history with multibyte characters: https://crashcourse.housegordon.org/coreutils-multibyte-support.html.

As noted in a comment by @charles-duffy, printf in this case is a shell builtin. You can check it with:

[Alex@NormandySR2 ~]$ type printf
printf is a shell builtin
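In bash specifically, type -t and type -a give a slightly fuller picture: the first prints only the kind of command, and the second lists every match, so a standalone printf binary on PATH (if one is installed) shows up after the builtin. A small sketch, assuming it runs under bash with the builtin enabled:

```bash
#!/usr/bin/env bash

# Kind of command that a bare "printf" resolves to.
kind=$(type -t printf)
echo "printf resolves to: $kind"

# Every printf bash can find: the builtin first, then any files on PATH.
type -a printf
```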

I also agree that most shells implement their own printf. I checked the following:

  • fish
  • bash
  • zsh
  • tcsh
  • ksh
  • dash
  • oil

All of them use a printf builtin, and the builtins can differ in details. So my assumption that printf, in this case, is part of coreutils was wrong.

Alex Baranowski answered Oct 18 '22 19:10