Python utf-8, howto align printout

I have an array containing Japanese characters as well as "normal" (ASCII) ones. How do I align the printout of these?

#!/usr/bin/python
# coding=utf-8

a1=['する', 'します', 'trazan', 'した', 'しました']
a2=['dipsy', 'laa-laa', 'banarne', 'po', 'tinky winky']

for i,j in zip(a1,a2):
    print i.ljust(12),':',j

print '-'*8

for i,j in zip(a1,a2):
    print i,len(i)
    print j,len(j)

Output:

する       : dipsy
します    : laa-laa
trazan       : banarne
した       : po
しました : tinky winky
--------
する 6
dipsy 5
します 9
laa-laa 7
trazan 6
banarne 7
した 6
po 2
しました 12
tinky winky 11

thanks, //Fredrik

asked Mar 19 '10 11:03

Fredrik Pihl

3 Answers

Using the unicodedata.east_asian_width function, keep track of which characters are narrow and wide when computing the length of the string.

#!/usr/bin/python
# coding=utf-8

import sys
import codecs
import unicodedata

out = codecs.getwriter('utf-8')(sys.stdout)

def width(string):
    return sum(1+(unicodedata.east_asian_width(c) in "WF")
        for c in string)

a1=[u'する', u'します', u'trazan', u'した', u'しました']
a2=[u'dipsy', u'laa-laa', u'banarne', u'po', u'tinky winky']

for i,j in zip(a1,a2):
    out.write('%s %s: %s\n' % (i, ' '*(12-width(i)), j))

Outputs:

する          : dipsy
します        : laa-laa
trazan        : banarne
した          : po
しました      : tinky winky

It doesn’t look right in some web browser fonts, but in a terminal window they line up properly.
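For readers on Python 3, where strings are already Unicode and stdout needs no codecs wrapper, the same idea could be sketched roughly as follows. The helper names `display_width` and `ljust_display` are made up for this illustration and are not part of the answer above. The sketch counts ambiguous-width characters and combining marks as one column each, which may not match every terminal.

```python
import unicodedata

def display_width(text):
    """Approximate terminal columns: wide (W) and fullwidth (F) characters count as 2."""
    return sum(2 if unicodedata.east_asian_width(c) in "WF" else 1 for c in text)

def ljust_display(text, columns):
    """Left-justify text to `columns` terminal columns, padding with spaces."""
    return text + " " * max(0, columns - display_width(text))

a1 = ["する", "します", "trazan", "した", "しました"]
a2 = ["dipsy", "laa-laa", "banarne", "po", "tinky winky"]

for left, right in zip(a1, a2):
    print(ljust_display(left, 12), ":", right)
```

Every padded left-hand cell then occupies 12 columns by this measure, so the colons should line up in a terminal with a CJK-aware monospace font.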

answered Oct 02 '22 19:10

Josh Lee

Use unicode objects instead of byte strings:

#!/usr/bin/python
# coding=utf-8

a1=[u'する', u'します', u'trazan', u'した', u'しました']
a2=[u'dipsy', u'laa-laa', u'banarne', u'po', u'tinky winky']

for i,j in zip(a1,a2):
    print i.ljust(12),':',j

print '-'*8

for i,j in zip(a1,a2):
    print i,len(i)
    print j,len(j)

Unicode objects deal with characters directly.
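A small sketch of the difference, written for Python 3 (an assumption; the code above is Python 2, where the u'' prefix plays the role that plain str plays here). It shows why the question prints 6 for する: that is the number of UTF-8 bytes, not the number of characters.

```python
text = "する"
encoded = text.encode("utf-8")

# Each of these hiragana takes three bytes in UTF-8.
print(len(encoded))

# A text string counts code points instead.
print(len(text))

# ljust pads to 12 code points, not to 12 terminal columns.
print(repr(text.ljust(12)))
```

Note that ljust still pads by code points, so wide characters can remain misaligned in a terminal even after switching to Unicode strings.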

answered Oct 02 '22 21:10

jcdyer

You need to build the string manually and also compute the format length manually. There is no easy way to do this.

The three functions below do this (they need the unicodedata module):

shortenStringCJK: shorten a string so that it fits a given display width in some output (this is not the same as cutting it to X characters)

import unicodedata

def shortenStringCJK(string, width, placeholder='..'):
    # get the display length, counting double-width characters twice
    string_len_cjk = stringLenCJK(str(string))
    # if the display length is too big
    if string_len_cjk > width:
        # set current length and output string
        cur_len = 0
        out_string = ''
        # loop through each character
        for char in str(string):
            # set the current length if we add the character
            cur_len += 2 if unicodedata.east_asian_width(char) in "WF" else 1
            # add the char only while the new length still leaves room for the placeholder
            if cur_len <= (width - len(placeholder)):
                out_string += char
        # return string with new width and placeholder
        return "{}{}".format(out_string, placeholder)
    else:
        return str(string)

stringLenCJK: get correct length (as in space taken on a terminal)

def stringLenCJK(string):
    # return string len including double count for double width characters
    return sum(1 + (unicodedata.east_asian_width(c) in "WF") for c in string)

formatLen: adjust the format length for double-width characters. Without this, the padded columns will not line up.

def formatLen(string, length):
    # returns the length updated for a string with double-width characters
    # get the normal string length and the length counting double-width characters twice,
    # then subtract the difference from the original length
    return length - (stringLenCJK(string) - len(string))

To then output a string, predefine the format string:

format_str = "|{{:<{len}}}|"
format_len = 26
string_len = 26

and output it as follows (where _string is the string to output):

print("Normal : {}".format(
    format_str.format(
        len=formatLen(shortenStringCJK(_string, width=string_len), format_len)
    )
).format(
    shortenStringCJK(_string, width=string_len)
))
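As a rough worked example of the arithmetic, here is a self-contained Python 3 sketch. The names `display_width` and `adjusted_field_width` are illustrative stand-ins for stringLenCJK and formatLen above, and the sample strings are borrowed from the question.

```python
import unicodedata

def display_width(text):
    # Count 2 columns for wide (W) and fullwidth (F) characters, 1 otherwise.
    return sum(2 if unicodedata.east_asian_width(c) in "WF" else 1 for c in text)

def adjusted_field_width(text, columns):
    # str.format pads by code points, so shrink the field by the extra columns.
    return columns - (display_width(text) - len(text))

format_str = "|{{:<{len}}}|"

for sample in ["します", "trazan"]:
    field = adjusted_field_width(sample, 12)
    line = format_str.format(len=field).format(sample)
    # Print the adjusted field width, the formatted cell, and its display width.
    print(field, line, display_width(line))
```

For します the display width is 6 and len() is 3, so the field shrinks from 12 to 9. Three characters plus six padding spaces then fill the same 12 columns that trazan fills with a field of 12.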

answered Oct 02 '22 19:10

Clemens Schwaighofer

Tags:

python

unicode

utf-8