
Python: concatenating bytes with a string

Tags: python, string, md5

I'm working on a Python 2.6 project for which Python 3 support is also being worked on. Specifically, I'm implementing the DIGEST-MD5 algorithm.

In python 2.6 without running this import:

from __future__ import unicode_literals

I am able to write a piece of code such as this:

a1 = hashlib.md5("%s:%s:%s" % (self.username, self.domain, self.password)).digest() 
a1 = "%s:%s:%s" %(a1, challenge["nonce"], cnonce )

This works without any issues and my authentication succeeds. When I try the same code with unicode_literals imported, I get an exception:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa8 in position 0: unexpected code byte

I'm relatively new to Python, so I'm a bit stuck figuring this out. If I replace the %s in the format string with %r, I am able to concatenate the strings, but the authentication doesn't work. The DIGEST-MD5 spec that I read says that the 16-octet binary digest must be concatenated with these other strings.

Any thoughts?

Macdiesel asked Jul 01 '10 12:07



1 Answer

The reason for the behaviour you observed is that from __future__ import unicode_literals changes how Python interprets string literals in that module:

  • In the 2.x series, string literals without the u prefix are byte-strings: sequences of bytes, each of which may be in the range \x00-\xff (inclusive). Literals with the u prefix are unicode strings, stored internally as UCS-2 or UCS-4 (depending on the compiler flag used when compiling Python).
  • In Python 3.x, as well as under the unicode_literals future import, string literals without the u prefix are unicode strings. Literals with the b prefix are of the data type bytes, which is rather similar to the pre-3.x non-unicode string type.
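As a rough illustration of the two literal types, here is a minimal sketch that assumes it is run on Python 3, where plain literals already behave the way unicode_literals makes them behave on 2.6. The variable names are made up for the example.

```python
# On Python 3, a plain literal is text (str) and a b-prefixed literal is bytes.
text = "nonce"
raw = b"nonce"

print(type(text).__name__)   # str
print(type(raw).__name__)    # bytes

# Moving between the two always goes through an explicit encoding.
print(text.encode("ascii") == raw)   # True
print(raw.decode("ascii") == text)   # True
```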

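In either version of Python, byte-strings and unicode-strings must be converted before they can be combined. Python 2 performs this conversion implicitly, decoding the byte-string with the interpreter's default encoding (sys.getdefaultencoding()); judging by your error message, this is UTF-8 on your system. Without setting anything, it should be ascii, which rejects all bytes above \x7f. Either way, a raw MD5 digest is arbitrary binary data and will generally not decode cleanly. Python 3 performs no implicit conversion at all.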
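The failure can be reproduced in isolation. The sketch below assumes Python 3; try_decode and try_concat are hypothetical helpers written only for this illustration. On Python 2 the concatenation would instead trigger the implicit decode and fail with UnicodeDecodeError.

```python
def try_decode(data, encoding):
    """Return the decoded text, or the name of the exception that was raised."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__

def try_concat(text, data):
    """Return text + data, or the name of the exception that was raised."""
    try:
        return text + data
    except TypeError as exc:
        return type(exc).__name__

# 0xa8 on its own is a UTF-8 continuation byte, so it cannot start a character,
# and it is outside the ASCII range as well.
print(try_decode(b"\xa8", "utf-8"))   # UnicodeDecodeError
print(try_decode(b"\xa8", "ascii"))   # UnicodeDecodeError

# Python 3 refuses to mix text and bytes instead of decoding implicitly.
print(try_concat("a:", b"\xa8"))      # TypeError
```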

The message digest returned by hashlib.md5(...).digest() is a byte-string, and I suppose you want the result of the whole operation to be a byte-string as well. If so, convert the nonce and cnonce strings to byte-strings:

import hashlib
# note that UTF-8 may not be the encoding required by your counterpart, please check
a1 = hashlib.md5(("%s:%s:%s" % (self.username, self.domain, self.password)).encode("UTF-8")).digest()
# b":".join(...) also works on Python 3 versions before 3.5, which lack b"..." % (...)
a1 = b":".join([a1, challenge["nonce"].encode("UTF-8"), cnonce.encode("UTF-8")])
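For something you can run on its own, here is a minimal sketch of the same idea as a standalone function. The name digest_md5_a1, its parameters, and the sample values are all assumptions made for illustration, and UTF-8 is only a guess at the encoding your server expects.

```python
import hashlib

def digest_md5_a1(username, realm, password, nonce, cnonce, encoding="utf-8"):
    """Build A1 as bytes: MD5(username:realm:password) ":" nonce ":" cnonce."""
    credentials = ":".join([username, realm, password]).encode(encoding)
    digest = hashlib.md5(credentials).digest()  # 16 raw bytes, not hex
    # Everything is bytes from here on, so no implicit decoding can occur.
    return b":".join([digest, nonce.encode(encoding), cnonce.encode(encoding)])

# Hypothetical sample values.
a1 = digest_md5_a1("user", "example.org", "secret", "abc123", "xyz789")
print(len(a1))    # 30 (16 digest bytes + ":" + 6 + ":" + 6)
print(a1[16:])    # b':abc123:xyz789'
```

Keeping every piece as bytes until the final hash means the binary digest is never pushed through a text codec.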

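Alternatively, you can convert the byte-string coming from the call to digest() to a unicode string (not recommended). As the first 256 Unicode code points are identical to ISO-8859-1, every byte maps to exactly one character, so this might serve your needs, provided the result is encoded as ISO-8859-1 again before it is hashed or sent: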

a1 = hashlib.md5("%s:%s:%s"  % (self.username, self.domain, self.password)).digest()
a1 = "%s:%s:%s" %(a1.decode("ISO-8859-1"), challenge["nonce"], cnonce)
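If you want to convince yourself that ISO-8859-1 loses nothing here, a quick check along these lines should do (a sketch only; the variable names are illustrative):

```python
# Every possible byte value, 0x00 through 0xff.
all_bytes = bytes(bytearray(range(256)))

# Decoding as ISO-8859-1 maps byte N to code point N ...
as_text = all_bytes.decode("ISO-8859-1")
print(len(as_text))                                      # 256
print([ord(ch) for ch in as_text] == list(range(256)))   # True

# ... and encoding with the same codec restores the original bytes.
print(as_text.encode("ISO-8859-1") == all_bytes)         # True
```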
nd. answered Oct 02 '22 16:10