Character reading from file in Python

Tags: python, encoding, unicode, ascii

In a text file, there is a string "I don't like this".

However, when I read it into a string, it becomes "I don\xe2\x80\x98t like this". I understand that \u2018 is the Unicode representation of "'". I read the file with:

f1 = open(file1, "r")
text = f1.read()

Is it possible to read the file so that the resulting string is "I don't like this" instead of "I don\xe2\x80\x98t like this"?

Second edit: I have seen some people use a mapping to solve this problem, but is there really no built-in conversion that handles this kind of ANSI-to-Unicode (and vice versa) conversion?

asked Oct 08 '22 23:10

Graviton

1 Answer

Ref: http://docs.python.org/howto/unicode

Reading Unicode from a file is therefore simple:

import codecs
# codecs.open decodes the file's bytes as UTF-8, so each line is a Unicode string.
with codecs.open('unicode.rst', encoding='utf-8') as f:
    for line in f:
        print(repr(line))
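
On Python 3, the built-in open() accepts the same encoding argument, so codecs is not strictly needed. A minimal sketch, assuming the file is UTF-8 encoded; the file name sample.txt and its contents are made up here to mirror the question:

```python
import os
import tempfile

# Hypothetical sample file: the UTF-8 bytes E2 80 98 encode U+2018,
# the left single quotation mark from the question.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(b"I don\xe2\x80\x98t like this")

# Passing encoding= makes read() return decoded text rather than raw bytes.
with open(path, encoding="utf-8") as f:
    text = f.read()

print(repr(text))
print(text == "I don\u2018t like this")  # True
```

If the file was written in a different encoding (for example cp1252), the encoding argument would need to name that encoding instead.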

It's also possible to open files in update mode, allowing both reading and writing:

with codecs.open('test', encoding='utf-8', mode='w+') as f:
    f.write(u'\u4500 blah blah blah\n')
    f.seek(0)  # rewind to the start before reading back
    print(repr(f.readline()[:1]))

EDIT: I'm assuming that your intended goal is just to be able to read the file properly into a string in Python. If you're trying to convert to an ASCII string from Unicode, then there's really no direct way to do so, since the Unicode characters won't necessarily exist in ASCII.
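
To illustrate why there is no direct conversion, here is a small Python 3 sketch, assuming the string from the question has already been decoded correctly. A strict ASCII encode raises an error, and the standard error handlers only drop or replace the character:

```python
text = "I don\u2018t like this"  # sample string from the question

# A strict ASCII encode fails because U+2018 has no ASCII code point.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Error handlers trade fidelity for an ASCII result.
dropped = text.encode("ascii", "ignore")    # removes the character
replaced = text.encode("ascii", "replace")  # substitutes '?'

print(strict_failed)  # True
print(dropped)        # b'I dont like this'
print(replaced)       # b'I don?t like this'
```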

If you're trying to convert to an ASCII string, try one of the following:

1. Replace the specific Unicode characters with ASCII equivalents, if you only need to handle a few special cases such as this particular example.
2. Use the unicodedata module's normalize() and the string's encode() method to convert as best you can to the closest ASCII equivalent (Ref https://web.archive.org/web/20090228203858/http://techxplorer.com/2006/07/18/converting-unicode-to-ascii-using-python):
```
>>> import unicodedata
>>> teststr
u'I don\xe2\x80\x98t like this'
>>> unicodedata.normalize('NFKD', teststr).encode('ascii', 'ignore')
'I donat like this'
```
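
The two options can also be combined. The following Python 3 sketch is one possible approach, not a built-in conversion: PUNCTUATION_MAP and to_ascii are hypothetical names introduced for illustration, and the mapping covers only a few common quotation marks.

```python
import unicodedata

# Hypothetical mapping for a few common "smart" punctuation marks.
PUNCTUATION_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})

def to_ascii(text):
    """Best-effort ASCII conversion: map known punctuation, then strip accents."""
    mapped = text.translate(PUNCTUATION_MAP)
    # NFKD splits accented letters into a base letter plus combining marks;
    # encoding with 'ignore' then drops whatever is still non-ASCII.
    decomposed = unicodedata.normalize("NFKD", mapped)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(to_ascii("I don\u2018t like this"))  # I don't like this
print(to_ascii("caf\u00e9"))               # cafe
```

The mapping step comes first because U+2018 has no decomposition under NFKD, so normalization alone would drop the quote rather than turn it into an apostrophe.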

answered Nov 13 '22 04:11

Jay
