I'm being passed data that is EBCDIC-encoded. Something like:
s = u'@@@@@@@@@@@@@@@@@@@ÂÖÉâÅ@ÉÄ'
Attempting to .decode('cp500') is wrong, but what's the correct approach? If I copy the string into something like Notepad++ I can convert it from EBCDIC to ASCII, but I can't seem to find a viable approach in Python to achieve the same. For what it's worth, the correct result is: BOISE ID (plus or minus space padding).
The information is being retrieved from a file of lines of JSON objects. That file looks like this:
{ "command": "flush-text", "text": "@@@@@O@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@O" }
{ "command": "flush-text", "text": "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\u00C9\u00C4@\u00D5\u00A4\u0094\u0082\u0085\u0099z@@@@@@@@@@\u00D9\u00F5\u00F9\u00F7\u00F6\u00F8\u00F7\u00F2\u00F4" }
{ "command": "flush-text", "text": "@@@@@OmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmO" }
{ "command": "flush-text", "text": "@@@@@O@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@O" }
And the processing loop looks something like:
import json

with open('myfile.txt', 'rb') as fh:
    for line in fh:
        data = json.loads(line)
If Notepad++ converts it correctly, then you should simply need:
Python 2.7:
import io
import json

with io.open('myfile.txt', 'r', encoding="cp500") as fh:
    for line in fh:
        data = json.loads(line)
Python 3.x:
import json

with open('myfile.txt', 'r', encoding="cp500") as fh:
    for line in fh:
        data = json.loads(line)
This uses a TextIOWrapper to decode the file as it is read, using the given encoding. The io module provides Python 3's open to Python 2.x, with TextIOWrapper decoding and universal newline support.
My guess is that you need to take the Unicode ordinal of each character as a byte value, and then decode those bytes with cp500.
>>> s = u'@@@@@@@@@@@@@@@@@@@ÂÖÉâÅ@ÉÄ'
>>> bytearray(ord(c) for c in s).decode('cp500')
u' BOISE ID'
Alternatively:
>>> s.encode('latin-1').decode('cp500')
u' BOISE ID'
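Putting the latin-1 round trip together with the JSON loop from the question might look roughly like the sketch below (Python 3). The helper names `ebcdic_text` and `read_flush_text` are made up for illustration, and the sketch assumes the file itself is plain ASCII JSON with only the "text" values carrying EBCDIC byte values, as the sample lines suggest.

```python
import json


def ebcdic_text(value, codec="cp500"):
    """Reinterpret a str whose code points are really EBCDIC byte values."""
    # latin-1 maps code points 0-255 one-to-one onto bytes 0-255,
    # so this recovers the original bytes before decoding them as EBCDIC.
    return value.encode("latin-1").decode(codec)


def read_flush_text(path, codec="cp500"):
    """Yield the decoded "text" field of each JSON line in the file."""
    # Assumption: the file is ASCII JSON; only the "text" values are EBCDIC.
    with open(path, "r", encoding="ascii") as fh:
        for line in fh:
            if line.strip():  # skip blank lines
                yield ebcdic_text(json.loads(line)["text"], codec)


print(ebcdic_text("@@@ÂÖÉâÅ@ÉÄ").strip())  # BOISE ID
```

The `codec` parameter is there because EBCDIC code pages differ in a few punctuation positions. For instance, byte 0x4F (the `O` in the frame lines of the sample file) decodes to `!` under cp500 but to `|` under cp037, so if the frames are meant to be vertical bars, passing `codec="cp037"` may be the better fit for this data.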