
Fast concatenation of bytes() in python3

Tags: python, list, byte

I have a list of byte strings in Python 3 (they are audio chunks). I want to join them into one big byte string. The simple implementation below is rather slow. Is there a better way?

chunks = []
while not audio.ends():
    chunks.append(bytes(audio.next_buffer()))
    do_some_chunk_processing()

all_audio = b''
for ch in chunks:
    all_audio += ch

How can I do this faster?

al.zatv asked Oct 25 '25 03:10


2 Answers

Use bytearray()

from time import perf_counter

c = b'\x02\x03\x05\x07' * 500  # test data: one 2000-byte chunk

# Method 1: repeated += on an immutable bytes object
bytes_string = b''

st = perf_counter()
for _ in range(10**4):
    bytes_string += c

print("string concat -> took {} sec".format(perf_counter() - st))

# Method 2: extend a mutable bytearray in place
bytes_arr = bytearray()

st = perf_counter()
for _ in range(10**4):
    bytes_arr.extend(c)
# convert the bytearray to an immutable bytes object
bytes_string = bytes(bytes_arr)

print("bytearray extend/concat -> took {} sec".format(perf_counter() - st))

A benchmark on my machine (Windows 10, 7th-gen Core i7) shows:

string concat -> took 67.28 sec
bytearray extend/concat -> took 0.089 sec

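The code is fairly self-explanatory. A bytes object is immutable, so every bytes_string += next_block builds a new object and copies everything accumulated so far. Instead, use bytearray.extend(next_block), which grows the existing buffer. After building the bytearray, call bytes() on it to get a bytes object.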
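
Applied to the loop in the question, the same idea might look like the sketch below. The fake_audio_buffers generator is a hypothetical stand-in for the asker's audio object (its real API is not shown in the question), and the chunk sizes are arbitrary.

```python
def fake_audio_buffers(n_chunks=100, chunk_size=2000):
    """Hypothetical stand-in for the asker's audio source: yields fixed-size buffers."""
    for i in range(n_chunks):
        yield bytes([i % 256]) * chunk_size


all_audio_buf = bytearray()
for buf in fake_audio_buffers():
    # Grows the existing buffer instead of building a new object each time.
    all_audio_buf.extend(buf)
    # Per-chunk processing would go here.

# One final copy into an immutable bytes object.
all_audio = bytes(all_audio_buf)
print(len(all_audio))
```

If the result is only read afterwards, the final bytes() conversion may not be needed at all, since a bytearray supports most of the same operations.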

Amin Pial answered Oct 27 '25 18:10


One approach you could try and measure would be to use bytes.join:

all_audio = b''.join(chunks)

The reason this might be faster is that join does a pre-pass over the chunks to find out how big all_audio needs to be, allocates exactly that much once, and then copies every chunk into place in one go.
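
To measure this rather than guess, a small timeit comparison along these lines could be used. It is only a sketch: the chunk count and size are made-up test values, the helper names are illustrative, and the timings will vary by machine and Python version.

```python
import timeit

# Made-up test data: 1000 chunks of 2000 bytes each.
chunks = [b'\x02\x03\x05\x07' * 500 for _ in range(1000)]


def with_join(parts):
    """Concatenate with a single bytes.join call."""
    return b''.join(parts)


def with_bytearray(parts):
    """Concatenate by extending a bytearray, then converting to bytes."""
    buf = bytearray()
    for p in parts:
        buf.extend(p)
    return bytes(buf)


# Both approaches must produce identical output.
assert with_join(chunks) == with_bytearray(chunks)

for fn in (with_join, with_bytearray):
    elapsed = timeit.timeit(lambda: fn(chunks), number=20)
    print("{} -> {:.4f} sec for 20 runs".format(fn.__name__, elapsed))
```

bytes.join accepts any bytes-like objects, so if the buffers are already bytearray or memoryview objects, the bytes() conversion of each chunk in the question may be unnecessary.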


Wander Nauta answered Oct 27 '25 18:10


