I have a bytearray with length 2*n:
a1 a2 b1 b2 c1 c2
I need to swap the byte order (endianness) within each 2-byte word, to get:
a2 a1 b2 b1 c2 c1
Currently I use the following approach, but it is very slow for my task:
converted = bytearray([])
for i in range(int(len(chunk)/2)):
    converted += bytearray([ chunk[i*2+1], chunk[i*2] ])
Is it possible to swap the byte order of a bytearray by calling some system/libc function?
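One standard-library option worth considering (a sketch of mine, not necessarily the fastest): the array module's byteswap() method performs the swap at C speed. This assumes the 'H' typecode is 2 bytes wide on your platform (check array('H').itemsize), which holds on common platforms.

```python
from array import array

chunk = bytearray([0xA1, 0xA2, 0xB1, 0xB2, 0xC1, 0xC2])

# Reinterpret the bytes as 16-bit unsigned words ('H'), then swap each
# word's byte order in C with byteswap().
words = array('H', bytes(chunk))
words.byteswap()
converted = bytearray(words.tobytes())

print(converted.hex())  # 'a2a1b2b1c2c1'
```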
OK, thanks to all; I timed some of the suggestions:
import timeit
test = [
    """
converted = bytearray([])
for i in range(int(len(chunk)/2)):
    converted += bytearray([ chunk[i*2+1], chunk[i*2] ])
""",
    """
for i in range(0, len(chunk), 2):
    chunk[i], chunk[i+1] = chunk[i+1], chunk[i]
""",
    """
byteswapped = bytearray([0]) * len(chunk)
byteswapped[0::2] = chunk[1::2]
byteswapped[1::2] = chunk[0::2]
""",
    """
chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]
""",
]
for t in test:
    print(timeit.timeit(t, setup='chunk = bytearray([1]*10)'))
and the result is:
$ python ti.py
11.6219761372
2.61883187294
3.47194099426
1.66421198845
So in-place slice assignment with a step of 2 is now the fastest. Thanks also to Mr. F for the detailed explanation; I haven't tried it yet because it requires numpy.
You could use slice assignment with a step of 2:
byteswapped = bytearray(len(original))
byteswapped[0::2] = original[1::2]
byteswapped[1::2] = original[0::2]
Or if you want to do it in-place:
original[0::2], original[1::2] = original[1::2], original[0::2]
Timing shows that slicing massively outperforms a Python-level loop for large arrays:
>>> timeit.timeit('''
... for i in range(0, len(chunk), 2):
...     chunk[i], chunk[i+1] = chunk[i+1], chunk[i]''',
... 'chunk=bytearray(1000)')
81.70195105159564
>>>
>>> timeit.timeit('''
... byteswapped = bytearray(len(original))
... byteswapped[0::2] = original[1::2]
... byteswapped[1::2] = original[0::2]''',
... 'original=bytearray(1000)')
2.1136113323948393
>>>
>>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=bytearray(1000)')
1.79349659994989
For small arrays, slicing still beats the explicit loop, but the difference isn't as big:
>>> timeit.timeit('''
... for i in range(0, len(chunk), 2):
...     chunk[i], chunk[i+1] = chunk[i+1], chunk[i]''',
... 'chunk=bytearray(10)')
1.2503637694328518
>>>
>>> timeit.timeit('''
... byteswapped = bytearray(len(original))
... byteswapped[0::2] = original[1::2]
... byteswapped[1::2] = original[0::2]''',
... 'original=bytearray(10)')
0.8973060929306484
>>>
>>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=bytearray(10)')
0.6282232971918802
If you use numpy's frombuffer function, you can construct a numpy ndarray that actually shares the physical memory of the bytearray, so the swapping can be done in place rather than on a copy.
You can perform the same indexing directly on byt, such as byt[i] = byt[i+1], but getting a handle to the buffered array with an explicit type in numpy often lets you do much more, particularly with bytearray, which is very limited on its own.
Be careful though. Even though at the Python level bytearray represents unsigned 8-bit integer values (0-255), the actual underlying C implementation, in bytearrayobject.h, uses a plain char for the byte values (see here for more info). If you use numpy for this, you probably want to specify the optional dtype argument to frombuffer, with dtype=numpy.ubyte.
import numpy as np

byt = bytearray(range(256))
npbyt = np.frombuffer(byt, dtype=np.ubyte)
for i in range(0, len(npbyt)-1, 2):
    temp = npbyt[i]
    npbyt[i] = npbyt[i+1]
    npbyt[i+1] = temp
print(list(byt))
It prints:
[1, 0, 3, 2, 5, 4, ..., 255, 254]
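As a side note (my own sketch, not part of the answer above): once you have the shared-memory view, numpy can do the pairwise swap without any Python-level loop by viewing the bytes as 16-bit words and calling ndarray.byteswap(inplace=True), which writes straight back into the bytearray. This assumes the length is even.

```python
import numpy as np

byt = bytearray(range(8))
npbyt = np.frombuffer(byt, dtype=np.ubyte)  # shares byt's memory

# View pairs of bytes as uint16 words and swap each word's byte order
# in place; the change is visible through the original bytearray.
npbyt.view(np.uint16).byteswap(inplace=True)

print(list(byt))  # [1, 0, 3, 2, 5, 4, 7, 6]
```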
I ran into some of these issues while working on a small side project called buffersort, which can perform in-place sorting on Python objects that implement the writable buffer protocol, as bytearray does.
If you're interested in grabbing the Cython source from there, there is a simple helper function _swap that would make it easy to do what you want. It's probably overkill for your use case, though.
Building on some answers in How to byte-swap a 32-bit integer in Python?:
import struct
import time

def htonl_slice(data):
    byteswapped = bytearray(4)
    byteswapped[0::4] = data[3::4]
    byteswapped[1::4] = data[2::4]
    byteswapped[2::4] = data[1::4]
    byteswapped[3::4] = data[0::4]
    return byteswapped

def htonl_slice_2(data):
    byteswapped = bytearray(len(data))
    byteswapped[0::4] = data[3::4]
    byteswapped[1::4] = data[2::4]
    byteswapped[2::4] = data[1::4]
    byteswapped[3::4] = data[0::4]
    return byteswapped

def htonl_struct(data):
    return struct.pack(">L", struct.unpack("<L", data)[0])

def swap32(data):
    return [struct.unpack("<I", struct.pack(">I", i))[0] for i in data]

def htonl_shift(x):
    return (((x << 24) & 0xFF000000) |
            ((x << 8)  & 0x00FF0000) |
            ((x >> 8)  & 0x0000FF00) |
            ((x >> 24) & 0x000000FF))

def test():
    data = [struct.pack('<L', i) for i in range(1000000)]

    start = time.time()
    for d in data:
        x = htonl_slice(d)
    end = time.time()
    print("htonl_slice %f" % (end - start))

    start = time.time()
    for d in data:
        x = htonl_struct(d)
    end = time.time()
    print("htonl_struct %f" % (end - start))

    data = [i for i in range(1000000)]

    start = time.time()
    for d in data:
        x = htonl_shift(d)
    end = time.time()
    print("htonl_shift %f" % (end - start))

    start = time.time()
    x = swap32(data)
    end = time.time()
    print("swap32 %f" % (end - start))

    data = bytearray()
    for i in range(1000000):
        data += struct.pack('<L', i)

    start = time.time()
    x = htonl_slice_2(data)
    end = time.time()
    print("htonl_slice_2 %f" % (end - start))
And the results are:
htonl_slice 3.041000
htonl_struct 0.626000
htonl_shift 0.864000
swap32 0.533000
htonl_slice_2 0.025000
I understand that htonl_shift works with ints instead of bytearrays.
It is interesting to see that swap32 is pretty fast, but for a 4 MB array htonl_slice_2 is by far the fastest.
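A further variant along the same lines (my own sketch, not part of the original answer): struct can also swap many 32-bit words in a single pack/unpack pair by using a repeat count in the format string, avoiding the per-word Python call overhead of swap32. This assumes the data length is a multiple of 4.

```python
import struct

def swap32_bulk(data):
    # Unpack all little-endian 32-bit words in one C-level call, then
    # repack them big-endian in one more call.
    n = len(data) // 4
    words = struct.unpack("<%dI" % n, data)
    return struct.pack(">%dI" % n, *words)

print(swap32_bulk(bytes([1, 0, 0, 0, 2, 0, 0, 0])).hex())  # '0000000100000002'
```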
You could also swap them in place and use the original array.
chunk = bytearray([1, 2, 3, 4])
for i in range(0, len(chunk), 2):
    chunk[i], chunk[i+1] = chunk[i+1], chunk[i]