This is for a modern cryptography class that I am currently taking.
The challenge is the cryptopals challenge 3: Single-Byte XOR Cipher, and I am trying to use python 3 to help complete this.
I know that I am supposed to XOR the string and converted to English. The hex string is "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736" which converts to "806748453371902409051174291875458592743800337585421566549206796642836053682239286" in decimal form.
I have XOR'd this against multiple hex byte combinations (2 hex digits), but I do not know how to convert this into English. Is it just brute force and educated guessing at this point?
I know about ETAOIN SHRDLU, but this hasn't really been that helpful.
Thank you for your time and help.
ADDED: Additionally, I tried Challenge #4 but this code does not seem to work. But it did work for Challenge #3 so I am confused.
Challenge #3 Challenge #4
In a single byte XOR , the length of the key is one byte, so there can be only 255 possible keys (0x0 - 0xff) with the exception of 0 as the key because XORing any value with 0 will give the same value as result (that is, no encryption).
XOR Encryption is an encryption method used to encrypt data and is hard to crack by brute-force method, i.e generating random encryption keys to match with the correct one.
This operation is sometimes called modulus 2 addition (or subtraction, which is identical). With this logic, a string of text can be encrypted by applying the bitwise XOR operator to every character using a given key. To decrypt the output, merely reapplying the XOR function with the key will remove the cipher.
Simply put, XOR (pronounced “exclusive or”) cipher is an additive cypher. It is based on the XOR operation (also known as the exclusive disjunction) in logic. As a logical operation, XOR is also called modulus 2 addition. In XOR operation, the output is true when the inputs differ.
You can use binascii.hexlify
, binascii.unhexlify
to convert byte strings to hexadecimals or vice versa:
>>> import binascii
>>> binascii.hexlify(b'HELLO') # to Hex
b'48454c4c4f'
>>> binascii.unhexlify('48454c4c4f') # from Hex
b'HELLO'
Using str.isprintable
, you can filter out non-printable candidates:
>>> 'abcd'.isprintable()
True
>>> '\x00'.isprintable()
False
>>> '\x7f'.isprintable()
False
import binascii
encoded = binascii.unhexlify('1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736')
for xor_key in range(256):
decoded = ''.join(chr(b ^ xor_key) for b in encoded)
if decoded.isprintable():
print(xor_key, decoded)
Building on @falsetru's answer, but showing just the decoded string with the most space characters:
>>> encoded = '1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736'
>>> import binascii
>>> nums = binascii.unhexlify(encoded)
>>> strings = (''.join(chr(num ^ key) for num in nums) for key in range(256))
>>> max(strings, key=lambda s: s.count(' '))
"Cooking MC's like a pound of bacon"
Instead of counting spaces, you could use ETAOIN SHRDLU ("the approximate order of frequency of the 12 most commonly used letters in the English language") for weights, but it's not necessary here.
Btw, I think it would've been good if you had linked to the challenge.
Edit: Alternatively, you can try to find the key (or a few most promising keys) and then only decode using that key (or those few keys). For example, assuming that counting the spaces will determine the winner:
>>> encoded = '1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736'
>>> import binascii
>>> nums = binascii.unhexlify(encoded)
>>> key = max(nums, key=nums.count) ^ ord(' ')
>>> ''.join(chr(num ^ key) for num in nums)
"Cooking MC's like a pound of bacon"
This could even easily be done by hand (though the challenge tells you not to do that).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With