Turkish character encoding

I am trying to build a new sentence from items in different lists. Printing it with unicode() raises an error, although printing it normally (without unicode()) works. Posting it to the website raises the same error. I thought that if I can fix it for unicode(), it will also work when I post it to the website.

from random import randint

p=['Bu', 'Şu']
k=['yazı','makale']
t=['hoş','ilgiç']
connect='%s %s %s'%(p[randint(0,len(p)-1)],k[randint(0,len(k)-1)],t[randint(0,len(t)-1)])
print unicode(connect)

And the output is:
Error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)
Alkindus asked Nov 11 '14 09:11


People also ask

What is Turkish character encoding?

ISO/IEC 8859-9 is designated ECMA-128 by Ecma International and TS 5881 as a Turkish standard. It is informally referred to as Latin-5 or Turkish. It was designed to cover the Turkish language and to be of more use than the ISO/IEC 8859-3 encoding.

Are Turkish characters UTF-8?

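Yes. UTF-8 can represent every Unicode character, including all Turkish letters such as ş, ğ, ı and İ. Errors with Turkish text typically come from decoding UTF-8 bytes with a different codec, such as ASCII.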

What character set does Turkish use?

Turkish computers may use character set ISO 8859-9 ("Latin 5"), which is identical to Latin 1 except that the rarely-used Icelandic characters "eth", "thorn", and "y with acute accent" are replaced with the needed Turkish characters.

What is the difference between ISO 8859-1 and UTF-8?

UTF-8 is a multibyte encoding that can represent any Unicode character. ISO 8859-1 is a single-byte encoding that can represent the first 256 Unicode characters. Both encode ASCII exactly the same way.
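
A minimal sketch of that difference, using the Turkish letter ş as the sample character. The variable names are made up for illustration, and the codec names are Python's standard aliases; the snippet is written to run on Python 3 and should also run on Python 2.

```python
# -*- coding: utf-8 -*-
s_cedilla = u'ş'  # U+015F, a Turkish letter outside ASCII

# UTF-8 is multibyte: this letter takes two bytes.
utf8_bytes = s_cedilla.encode('utf-8')         # b'\xc5\x9f'

# ISO 8859-9 (Latin-5) is single-byte: the same letter is one byte.
latin5_bytes = s_cedilla.encode('iso-8859-9')  # b'\xfe'

# ISO 8859-1 (Latin-1) has no ş at all, so encoding fails.
try:
    s_cedilla.encode('iso-8859-1')
    in_latin1 = True
except UnicodeEncodeError:
    in_latin1 = False

# Reading the Latin-5 byte as Latin-1 yields the Icelandic thorn instead.
as_latin1 = latin5_bytes.decode('iso-8859-1')  # u'þ'

# ASCII text is encoded identically by all three.
ascii_same = (u'Bu'.encode('utf-8')
              == u'Bu'.encode('iso-8859-1')
              == u'Bu'.encode('iso-8859-9'))
```

The practical point is that bytes only become readable text once the reader knows which encoding the writer used.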


1 Answer

First, put # -*- coding: utf-8 -*- at the top of your script so that you can use non-ASCII characters in the source. Then, when printing, decode the str to unicode explicitly with UTF-8. unicode(connect) fails because it falls back to the default ASCII codec.

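#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2: the declaration above lets the source file contain UTF-8 literals.

from random import randint

# Plain str literals hold UTF-8 encoded bytes in Python 2.
p=['Bu', 'şu']
k=['yazı','makale']
t=['hoş','ilginç']
# Pick one random item from each list and join them into a byte string.
connect='%s %s %s'%(p[randint(0,len(p)-1)],k[randint(0,len(k)-1)],t[randint(0,len(t)-1)])
# Decode the UTF-8 bytes explicitly instead of relying on the default ASCII codec.
print connect.decode('utf-8')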
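
For readers who want to see why the original call fails, here is a rough sketch of the same idea written so that it runs on Python 3 (and should run on Python 2 as well). It is not part of the script above: the names raw, sentence and payload are illustrative, and treating payload as the data to post to the website is an assumption about how the posting step works.

```python
# -*- coding: utf-8 -*-
import random

# Text values; the u prefix makes them unicode in Python 2 and is accepted by Python 3.3+.
p = [u'Bu', u'Şu']
k = [u'yazı', u'makale']
t = [u'hoş', u'ilginç']

# UTF-8 bytes of one possible sentence, i.e. what a plain Python 2 str would hold.
raw = u'Şu yazı hoş'.encode('utf-8')
first_byte = raw[:1]  # b'\xc5', the byte named in the error message

# Decoding with ASCII fails on that byte; unicode(connect) does this implicitly in Python 2.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the codec the bytes were written in succeeds.
text = raw.decode('utf-8')

# Alternative: build the sentence as text, then encode once at the boundary.
sentence = u'%s %s %s' % (random.choice(p), random.choice(k), random.choice(t))
payload = sentence.encode('utf-8')  # hypothetical: bytes to send in a request body
```

Keeping everything as unicode inside the program and encoding only when the data leaves it (printing, posting) tends to avoid both UnicodeDecodeError and UnicodeEncodeError.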
Burak Yılmaztürk answered Sep 28 '22 01:09