How to write unicode strings into a file? [duplicate]

Tags: python, unicode

I am using Python 2.6.5 and want to write some Japanese characters to a file. I get the error below and don't know how to change the encoding.

    Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
    [GCC 4.3.4 20090804 (release) 1] on cygwin
    >>> s = u'\u5E73\u621015'
    >>> with open("yop", "wb") as f:
    ...   f.write( s + "\n" );
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
    >>> type( s )
    <type 'unicode'>
Frankie Ribery asked Mar 30 '11 07:03



2 Answers

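You have to encode the unicode string to bytes before writing it. Without an explicit encoding, Python 2 falls back to the ASCII codec, which cannot represent the Japanese characters.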

    s = u'\u5E73\u621015'
    with open("yop", "wb") as f:
        f.write(s.encode("UTF-8"))

For a friendly introduction to Unicode in Python, see: http://farmdev.com/talks/unicode/
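To check that the explicit encode really produces what you expect, a small round trip can help. This is only a sketch: the temporary directory is an assumption added here so the example does not overwrite a real file, and the newline from the question is encoded together with the text.

```python
import os
import tempfile

s = u'\u5E73\u621015'  # the string from the question

# Encode explicitly: unicode text -> UTF-8 bytes (newline included).
data = (s + u'\n').encode('utf-8')

# Hypothetical scratch location, not part of the original question.
path = os.path.join(tempfile.mkdtemp(), 'yop')

# Binary mode: the file receives exactly the bytes we pass in.
with open(path, 'wb') as f:
    f.write(data)

# Read the raw bytes back and decode them to recover the text.
with open(path, 'rb') as f:
    raw = f.read()

restored = raw.decode('utf-8')
```

If `restored` equals `s + u'\n'`, the file holds valid UTF-8. Each of the two Japanese characters takes three bytes in UTF-8, so the file is nine bytes long.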

Mike Ramirez answered Sep 22 '22 06:09



As an alternative, you can use the codecs module:

    import codecs

    s = u'\u5E73\u621015'
    with codecs.open("yop", "w", encoding="utf-8") as f:
        f.write(s)
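A closely related option, not mentioned in the answer itself, is `io.open`, which is available from Python 2.6 onward and is the same function as the built-in `open` in Python 3. The sketch below assumes a temporary directory for the output file; that path is made up for illustration.

```python
import io
import os
import tempfile

s = u'\u5E73\u621015'  # the string from the question

# Hypothetical scratch location, not part of the original answer.
path = os.path.join(tempfile.mkdtemp(), 'yop')

# Text mode with an explicit encoding: the file object encodes on write.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(s + u'\n')  # io.open expects unicode text, not byte strings

# Reading with the same encoding decodes back to unicode text.
with io.open(path, 'r', encoding='utf-8') as f:
    text = f.read()
```

As with `codecs.open`, the encoding is declared once when the file is opened, so individual `write` calls do not need their own `.encode()`.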
Philipp answered Sep 24 '22 06:09
