
Python 3 Decoding Strings

I understand that this is likely a repeat question, but I'm having trouble finding a solution.

In short I have a string I'd like to decode:

raw = "\x94my quote\x94"
string = decode(raw)

Expected value of string:

'"my quote"'

Last point of note is that I'm working with Python 3 so raw is unicode, and thus is already decoded. Given that, what exactly do I need to do to "decode" the "\x94" characters?

rmorshea asked Jun 01 '17 05:06



2 Answers

string = "\x22my quote\x22"
print(string)

You don't need to decode anything here: a Python 3 string literal is already Unicode. You just need the correct escape for the double quote character ("), which is \x22, not \x94.
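For what it's worth, the \x94 in the original str is not a quote at all: it is the single code point U+0094, which Unicode classes as a control character. A quick check using only the standard library (the variable names are just for illustration):

```python
import unicodedata

raw = "\x94my quote\x94"
first = raw[0]

code_point = ord(first)                      # integer value of the character
category = unicodedata.category(first)       # Unicode general category
quote_category = unicodedata.category('"')   # same lookup for a real quote

print(hex(code_point))   # 0x94
print(category)          # Cc (control character)
print(quote_category)    # Po (punctuation)
```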

However, if the data comes from a different character set (yours appears to be Windows-1252, where byte 0x94 is the right curly double quote, U+201D), then you need to decode the byte string from that character set:

str(b"\x94my quote\x94", "windows-1252")

If your string isn't a byte string, you have to encode it first; I found that the latin-1 encoding works:

string = "\x94my quote\x94"
str(string.encode("latin-1"), "windows-1252")
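
The round trip works because latin-1 maps each code point from U+0000 to U+00FF to the byte with the same value, so encoding recovers the original bytes, which can then be decoded as Windows-1252. A minimal sketch, assuming the text really was Windows-1252 read as latin-1; the helper name fix_mojibake is made up for this example:

```python
def fix_mojibake(text, actual_encoding="windows-1252"):
    """Re-decode a str whose bytes were wrongly read as latin-1.

    Hypothetical helper: assumes every character in `text` is below
    U+0100, otherwise encode("latin-1") raises UnicodeEncodeError.
    """
    original_bytes = text.encode("latin-1")  # code point N -> byte N
    return original_bytes.decode(actual_encoding)


raw = "\x94my quote\x94"
fixed = fix_mojibake(raw)

# The result uses the curly quote U+201D; swap it for a plain ASCII
# quote if that is what you actually want.
plain = fixed.replace("\u201d", '"')
print(plain)  # "my quote"
```

Note that the decoded text contains curly quotes rather than the ASCII " from the question, hence the extra replace step at the end.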
CodeMonkey answered Sep 18 '22 02:09



I don't know if this is what you mean, but this works:

some_binary = b"\x94my quote\x94"
# decode() defaults to UTF-8, which rejects byte 0x94, so name the encoding
result = some_binary.decode("windows-1252")

That gives you the result. If you don't know which encoding to choose, you can use chardet.detect:

import chardet
chardet.detect(some_binary)
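
chardet is a third-party package, and any detector's answer is only a guess. If installing it isn't an option, one alternative is to try a short list of likely encodings in order. A rough sketch using only the standard library; the function name and the candidate list are assumptions chosen for this example, not part of chardet:

```python
def decode_with_candidates(data, candidates=("utf-8", "windows-1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Order matters: latin-1 accepts every byte value, so it must come
    last or it will shadow the stricter encodings.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this encoding rejects the bytes, try the next one
    raise ValueError("none of the candidate encodings fit")


some_binary = b"\x94my quote\x94"
text, used = decode_with_candidates(some_binary)
print(used)  # windows-1252 (UTF-8 rejects the lone 0x94 byte)
```

A clean decode only shows that the bytes are valid in that encoding, not that it is the right one, so it is still worth looking at the result.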
Yuval Pruss answered Sep 19 '22 02:09
