I got three UTF-8 stings:
hello, world
hello, 世界
hello, 世rld
I only want the first 10 ascii-char-width so that the bracket in one column:
[hello, wor]
[hello, 世 ]
[hello, 世r]
In console:
width('世界')==width('worl')
width('世 ')==width('wor') #a white space behind '世'
One chinese char is three bytes, but it only 2 ascii chars width when displayed in console:
>>> bytes("hello, 世界", encoding='utf-8')
b'hello, \xe4\xb8\x96\xe7\x95\x8c'
python's format()
doesn't help when UTF-8 chars mixed in
>>> for s in ['[{0:<{1}.{1}}]'.format(s, 10) for s in ['hello, world', 'hello, 世界', 'hello, 世rld']]:
... print(s)
...
[hello, wor]
[hello, 世界 ]
[hello, 世rl]
It's not pretty:
-----------Songs-----------
| 1: 蝴蝶 |
| 2: 心之城 |
| 3: 支持你的爱人 |
| 4: 根生的种子 |
| 5: 鸽子歌(CUCURRUCUCU PALO|
| 6: 林地之间 |
| 7: 蓝光 |
| 8: 在你眼里 |
| 9: 肖邦离别曲 |
| 10: 西行( 魔戒王者再临主题曲)(INTO |
| X 11: 深陷爱河 |
| X 12: 钟爱大地(THE MO RUN AIR |
| X 13: 时光流逝 |
| X 14: 卡农 |
| X 15: 舒伯特小夜曲(SERENADE) |
| X 16: 甜蜜的摇篮曲(Sweet Lullaby|
---------------------------
So, I wonder if there is a standard way to do the UTF-8 padding staff?
if you are working with English and Chinese characters, maybe this snippet can help you.
data = '''\
蝴蝶(A song)
心之城(Another song)
支持你的爱人(Yet another song)
根生的种子
鸽子歌(Cucurrucucu palo whatever)
林地之间
蓝光
在你眼里
肖邦离别曲
西行(魔戒王者再临主题曲)(Into something)
深陷爱河
钟爱大地
时光流逝
卡农
舒伯特小夜曲(SERENADE)
甜蜜的摇篮曲(Sweet Lullaby)'''
width = 80
def get_aligned_string(string,width):
string = "{:{width}}".format(string,width=width)
bts = bytes(string,'utf-8')
string = str(bts[0:width],encoding='utf-8',errors='backslashreplace')
new_width = len(string) + int((width - len(string))/2)
if new_width!=0:
string = '{:{width}}'.format(str(string),width=new_width)
return string
for i,line in enumerate(data.split('\n')):
song = get_aligned_string(line,width)
line = '|{:4}: {:}|'.format(i+1,song)
print(line)
There seems to be no official support for this, but a built-in package may help:
>>> import unicodedata
>>> print unicodedata.east_asian_width(u'中')
The returned value represents the category of the code point. Specifically,
This answer to a similar question provided a quick solution. Note however, the display result depends on the exact monospaced font used. The default fonts used by ipython and pydev don't work well, while windows console is ok.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With