
Python opens text file with a space between every character


Whenever I try to open a .csv file with the Python command fread = open('input.csv', 'r'), the contents come back with a space between every single character. I'm guessing something is wrong with the text file, because I can open other text files with the same command and they load correctly. Does anyone know why a text file would load like this in Python?

Thanks.

Update

OK, I got it with the help of Jarret Hardie's post.

This is the code that I used to convert the file to ASCII:

    fread = open('input.csv', 'rb').read()
    mytext = fread.decode('utf-16')
    mytext = mytext.encode('ascii', 'ignore')
    fwrite = open('input-ascii.csv', 'wb')
    fwrite.write(mytext)
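A variant of this conversion, sketched for Python 3 under the assumption that the goal is simply a file that other tools read without trouble: it writes UTF-8 instead of ASCII, so non-ASCII characters survive. The function name transcode and its parameters are made up for illustration and are not part of the original code.

```python
def transcode(src_path, dst_path, src_encoding="utf-16", dst_encoding="utf-8"):
    """Rewrite src_path in dst_encoding; return the number of characters written.

    Hypothetical helper: the default encodings are assumptions about the file.
    """
    # newline="" on both sides turns off newline translation,
    # so the original line endings are kept as they are.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
    return len(text)


# For comparison, encoding to ASCII with 'ignore' silently drops characters:
lossy = "Zoë".encode("ascii", "ignore")  # b'Zo'
```

Calling transcode('input.csv', 'input-utf8.csv') would mirror the snippet above, with the output filename being a placeholder.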

Thanks!

asked Mar 02 '09 17:03 by wlindner


2 Answers

The post by recursive is probably right: the contents of the file are likely encoded with a multi-byte charset. If that is the case, you can read the file in Python itself without converting it outside of Python first.

Try something like:

    fread = open('input.csv', 'rb').read()
    mytext = fread.decode('utf-16')

The 'b' flag ensures the file is read as binary data. You'll need to know (or guess) the original encoding; in this example I've used utf-16, but YMMV. The decode call converts the raw bytes to a unicode string. If you truly have a file with multi-byte chars, I don't recommend converting it to ASCII, as you may lose a lot of the characters in the process.

EDIT: Thanks for uploading the file. There are two bytes at the front of the file (most likely a byte order mark) which indicate that it does, indeed, use a wide charset. If you're curious, open the file in a hex editor as some have suggested; in the text version you'll see something like 'I.D.|.' (etc). Each dot is the extra byte for a char.

The code snippet above seems to work on my machine with that file.
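The two leading bytes mentioned in the edit can also be checked from Python instead of a hex editor. The following is a rough sketch for Python 3, not a general encoding detector: it only looks for a byte order mark, and the helper name guess_encoding_from_bom is invented for this example. A file without a mark returns the default.

```python
import codecs

# Byte order marks paired with a codec name that consumes the mark.
# UTF-32 comes first because BOM_UTF32_LE begins with the bytes of BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(raw, default=None):
    """Return a codec name if raw (bytes) starts with a known BOM, else default."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return default


# The 'I.D.|.' pattern seen in a hex editor: every ASCII character
# is followed by a zero byte when stored as little-endian UTF-16.
sample = "ID|".encode("utf-16-le")
print(repr(sample))  # b'I\x00D\x00|\x00'
```

In practice it is enough to pass the first four bytes, for example open('input.csv', 'rb').read(4). Those zero bytes are presumably what shows up as the "space" between characters when the file is read as plain text.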

answered Oct 21 '22 06:10 by Jarret Hardie


The file is encoded in some Unicode encoding, but you are reading it as ASCII. Try converting the file to ASCII before using it in Python.
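As an alternative to converting the file beforehand, Python 3's built-in open accepts an encoding argument, so the text can be decoded while it is read. A minimal sketch, assuming the file is UTF-16; the function name read_csv_rows and the default delimiter are assumptions, not something confirmed for this particular file.

```python
import csv


def read_csv_rows(path, encoding="utf-16", delimiter=","):
    """Return every row of a CSV file stored in the given encoding.

    Hypothetical helper: encoding and delimiter defaults are guesses.
    """
    # The csv module documentation recommends newline="" for files given to csv.reader.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return list(csv.reader(handle, delimiter=delimiter))
```

With the "utf-16" codec the byte order mark at the start of the file is consumed automatically, so the first field comes back clean.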

answered Oct 21 '22 05:10 by recursive