Python - "xor"ing each byte in "bytes" in the most efficient way

Tags:

performance

python

I have a "bytes" object and an "int" mask, and I want to XOR every byte of the data with my mask. I do this repeatedly over big "bytes" objects (~ 4096 KB).

This is the code I have. It works, but it is very CPU-intensive and slows down my script:

import itertools
import struct

# 'data' is bytes and 'mask' is int
bmask = struct.pack('!I', mask)  # convert the "int" mask to 4 bytes (big-endian)
a = bytes(b ^ m for b, m in zip(data, itertools.cycle(bmask)))

The best I could come up with is this, which is about 20 times faster:

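import array
import struct

# 'data' is bytes and 'mask' is int
# reversing the bytes of the mask (needed on a little-endian machine,
# because the array below reads the data in native byte order)
bmask = struct.pack("<I", mask)
mask = struct.unpack(">I", bmask)[0]

# converting from bytes to array of "int"s
# (len(data) must be a multiple of the array's item size, normally 4)
arr = array.array("I", data)

# looping over the "int"s
for i in range(len(arr)):
    arr[i] ^= mask

# must return bytes
a = bytes(arr)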

My questions are:

Is there a more efficient way to do this (CPU-wise)?
Is there a "cleaner" way to do this (without hurting performance)?

P.S. if it is of any importance, I'm using Python 3.5
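
The byte reversal in the second snippet only works as intended on a little-endian machine. As a minimal sketch (the function names `xor_reference` and `xor_words` are hypothetical, not from the post), the word-sized mask can instead be derived from `sys.byteorder`, and the fast version can be checked against the slow one. It assumes the data length is a multiple of 4 and that the "I" typecode is 4 bytes wide.

```python
import array
import itertools
import struct
import sys

def xor_reference(data, mask):
    """Byte-by-byte XOR with the 4-byte big-endian mask (slow but obvious)."""
    bmask = struct.pack("!I", mask)
    return bytes(b ^ m for b, m in zip(data, itertools.cycle(bmask)))

def xor_words(data, mask):
    """Word-at-a-time XOR; assumes len(data) is a multiple of 4."""
    arr = array.array("I", data)
    if arr.itemsize != 4:
        raise RuntimeError("this sketch assumes a 4-byte 'I' typecode")
    # Reinterpret the big-endian mask bytes in the machine's native order,
    # so the result does not depend on the host being little-endian.
    word = int.from_bytes(struct.pack("!I", mask), sys.byteorder)
    for i in range(len(arr)):
        arr[i] ^= word
    return arr.tobytes()

data = bytes(range(16))
print(xor_words(data, 0x12345678) == xor_reference(data, 0x12345678))
```

Comparing against the slow version on a small input is a cheap way to confirm that a faster variant still applies the mask bytes in the same order.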


asked Oct 03 '17 08:10

Gil Barash

2 Answers

I don't think you can get much faster than your algorithm using pure Python (although Fabio Veronese's answer shows that's not true). You can shave off a tiny bit of time by doing the looping in a list comprehension, but then that list needs to be converted back into an array, and the array has to be converted to bytes, so it uses more RAM for a negligible benefit.

However, you can make this much faster by using NumPy. Here's a short demo.

from time import perf_counter
from random import randrange, seed
import array
import numpy as np

seed(42)

def timed(func):
    ''' Timing decorator '''
    def wrapped(*args):
        start = perf_counter()
        result = func(*args)
        stop = perf_counter()
        print('{}: {:.6f} seconds'.format(func.__name__, stop - start))
        return result
    wrapped.__name__ = func.__name__
    wrapped.__doc__ = func.__doc__
    return wrapped

@timed
def do_mask_arr1(data, mask):
    arr = array.array("I", data)
    # looping over the "int"s
    for i in range(len(arr)):
        arr[i] ^= mask
    return arr.tobytes()

@timed
def do_mask_arr2(data, mask):
    arr = array.array("I", data)
    return array.array("I", [u ^ mask for u in arr]).tobytes()

@timed
def do_mask_numpy(data, mask):
    # np.frombuffer replaces np.fromstring, which is deprecated for binary data
    return (np.frombuffer(data, dtype=np.uint32) ^ np.uint32(mask)).tobytes()

@timed
def make_data(datasize):
    ''' Make some random bytes '''
    return bytes(randrange(256) for _ in range(datasize))

datasize = 100000
mask = 0x12345678
data = make_data(datasize)

d1 = do_mask_arr1(data, mask)
d2 = do_mask_arr2(data, mask)
print(d1 == d2)

d3 = do_mask_numpy(data, mask)
print(d1 == d3)

Typical output:

make_data: 0.751557 seconds
do_mask_arr1: 0.026865 seconds
do_mask_arr2: 0.025110 seconds
True
do_mask_numpy: 0.000438 seconds
True

Tested using Python 3.6.0 on an old single-core 32-bit 2 GHz machine running Linux.

I just did a run with datasize = 4000000 and do_mask_numpy took 0.0422 seconds.
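
A single `perf_counter` measurement like the ones above can be noisy. As a hedged sketch using only the standard library, `timeit.repeat` can run a function several times and report the best run. The helper `best_time` and the stand-in function `xor_bytes_slow` are hypothetical names introduced here for illustration, not part of the demo above.

```python
import timeit

def best_time(func, *args, repeat=5, number=3):
    """Return the best per-call time in seconds over several repeated runs."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings) / number

def xor_bytes_slow(data, mask_bytes):
    """Hypothetical stand-in for whichever masking function is being measured."""
    return bytes(b ^ mask_bytes[i % 4] for i, b in enumerate(data))

data = bytes(range(256)) * 16
elapsed = best_time(xor_bytes_slow, data, b"\x12\x34\x56\x78")
print("xor_bytes_slow: {:.6f} seconds".format(elapsed))
```

Taking the minimum of several runs tends to give more repeatable numbers than one measurement, which matters when the difference between two variants is small.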

answered Sep 22 '22 19:09

PM 2Ring

An alternative in case you don't want to use NumPy. The advantage comes from doing a single XOR on one big integer, after repeating the mask to the needed length (depending on the data size).

@timed
def do_mask_int(data, mask):
    intdata = int.from_bytes(data, byteorder='little', signed=False)
    # one 8-digit hex copy of the mask per 4-byte word; '08x' keeps leading zeros
    nwords = (len(data) + 3) // 4
    strmask = format(mask, '08x') * nwords
    n = intdata ^ int(strmask or '0', 16)
    # size the output from the input so zero high bytes are not dropped
    return n.to_bytes(4 * nwords, 'little')[:len(data)]

The results are as follows:

make_data: 8.288754 seconds
do_mask_arr1: 0.258530 seconds
do_mask_arr2: 0.253095 seconds
True
do_mask_numpy: 0.010309 seconds
True
do_mask_int: 0.060408 seconds
True

NumPy is still faster, but one may not want to include it in a production environment.
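
A variation on the same big-integer idea, as a sketch only: the repeated mask can be built from bytes rather than from a hex string, which also covers data whose length is not a multiple of 4. The name `xor_mask_bigint` is hypothetical. Note that this version applies the mask bytes in big-endian order, as in the question's first snippet, so its output is not byte-for-byte comparable with the native-order array versions timed above.

```python
import struct

def xor_mask_bigint(data, mask):
    """XOR data with a repeating 4-byte big-endian mask using one big-int XOR."""
    n = len(data)
    bmask = struct.pack("!I", mask)
    # Repeat the 4 mask bytes and trim to exactly the length of the data.
    key = (bmask * (n // 4 + 1))[:n]
    # Same byte order on both sides, so byte i of data meets byte i of key.
    x = int.from_bytes(data, "big") ^ int.from_bytes(key, "big")
    # Passing n keeps leading zero bytes that bit_length() would drop.
    return x.to_bytes(n, "big")

masked = xor_mask_bigint(b"hello world", 0x12345678)
print(xor_mask_bigint(masked, 0x12345678) == b"hello world")
```

Since XOR is its own inverse, applying the same mask twice should return the original data, which makes a quick sanity check.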

:] Best

answered Sep 26 '22 19:09

Fabio Veronese
