Python: bytearray vs array

Q: Is bytearray the same as bytes in Python?

The difference between bytes() and bytearray() is that bytes() returns an object that cannot be modified, and bytearray() returns an object that can be modified.
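A minimal Python 3 sketch of that difference (the variable names here are illustrative only):

```python
data = bytes(b'abc')
error = None
try:
    data[0] = 100              # bytes is immutable, so item assignment raises TypeError
except TypeError as exc:
    error = exc

buf = bytearray(b'abc')
buf[0] = 100                   # bytearray is mutable; 100 is the code for 'd'

print(data)                    # b'abc'
print(buf)                     # bytearray(b'dbc')
```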

Q: What is the bytearray data type in Python?

The bytearray type is a mutable sequence of integers in the range 0 to 255 inclusive. It allows you to work directly with binary data, such as the low-level data inside images or arriving directly from the network. It does not literally inherit from list or str, but it offers most of the methods of both mutable sequences (such as list) and byte strings (bytes in Python 3, str in Python 2).
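As a rough Python 3 illustration of both families of methods on one object (the sample data is made up for the example):

```python
buf = bytearray(b'hello')

first = buf[0]                 # indexing yields an int: 104, the code for 'h'
buf.append(33)                 # list-like: append one byte (33 is '!')
buf.extend(b' world')          # list-like: extend from any bytes-like object
shouted = buf.upper()          # bytes-like: returns a new bytearray

out_of_range = False
try:
    buf.append(256)            # values outside 0..255 are rejected
except ValueError:
    out_of_range = True

print(buf)                     # bytearray(b'hello! world')
print(shouted)                 # bytearray(b'HELLO! WORLD')
```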

Q: What does Bytearray mean?

A byte array is simply an area of memory containing a group of contiguous (side-by-side) bytes, such that it makes sense to talk about them in order: the first byte, the second byte, and so on.

Tags: python, bytearray

What is the difference between array.array('B') and bytearray?

from array import array

a = array('B', 'abc')
b = bytearray('abc')

a[0] = 100
b[0] = 'd'

print a
print b

Are there any memory or speed differences? What is the preferred use case of each one?
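The snippet above is Python 2. On Python 3 it needs small changes, because str is no longer a byte string there and byte sequences hold ints. A sketch of an equivalent, assuming Python 3:

```python
from array import array

a = array('B', b'abc')         # Python 3: initialise from bytes, not str
b = bytearray(b'abc')

a[0] = 100
b[0] = ord('d')                # items are ints, so convert the character first

print(a)                       # array('B', [100, 98, 99])
print(b)                       # bytearray(b'dbc')
```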


asked Aug 09 '12 12:08

Ecir Hana

4 Answers

bytearray is a close relative of Python 2.x's string type (which became bytes in Python 3). It's basically the built-in byte array type, and unlike the original string type, it's mutable.

The array module, on the other hand, was designed to build binary data structures for communicating with the outside world (for example, to read and write binary file formats).

Unlike bytearray, it supports all kinds of array elements. It's flexible.

So if you just need an array of bytes, bytearray should work fine. If you need flexible formats (say when the element type of the array needs to be determined at runtime), array.array is your friend.
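To illustrate the runtime-typed case, here is a small Python 3 sketch. The helper name make_buffer is hypothetical and exists only for this example:

```python
from array import array

def make_buffer(typecode, values):
    """Hypothetical helper: build an array whose element type is chosen at runtime."""
    return array(typecode, values)

unsigned_bytes = make_buffer('B', [1, 2, 3])   # one unsigned byte per item
doubles = make_buffer('d', [1.0, 2.5])         # one C double per item

print(unsigned_bytes.typecode, unsigned_bytes.itemsize)
print(doubles.typecode, doubles.itemsize)
```

A bytearray has no equivalent of the typecode argument: its items are always single bytes.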

Without looking at the code, my guess would be that bytearray is probably faster, since it doesn't have to consider different element types. Note, though, that array('B') does not return a bytearray: the two are separate types.
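A quick check, sketched for Python 3, of how the two objects relate: they are distinct types that can hold the same bytes.

```python
from array import array

a = array('B', b'abc')
b = bytearray(b'abc')

is_bytearray = isinstance(a, bytearray)   # False: array('B') is an array.array
same_bytes = bytes(a) == bytes(b)         # True: both expose the same raw bytes

print(type(a).__name__, type(b).__name__)
print(is_bytearray, same_bytes)
```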


answered Sep 24 '22 19:09

Aaron Digulla

bytearray has all the usual str methods. You can think of it as a mutable str (bytes in Python 3).

array.array, by contrast, is geared to reading and writing files. 'B' is just a special case of array.array.

You can see there is quite a difference by looking at the dir() of each:

>>> dir(bytearray)
['__add__', '__alloc__', '__class__', '__contains__', '__delattr__',
 '__delitem__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
 '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__',
 '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__',
 '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__',
 '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append',
 'capitalize', 'center', 'count', 'decode', 'endswith', 'expandtabs', 'extend',
 'find', 'fromhex', 'index', 'insert', 'isalnum', 'isalpha', 'isdigit', 'islower',
 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans',
 'partition', 'pop', 'remove', 'replace', 'reverse', 'rfind', 'rindex', 'rjust',
 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip',
 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>> dir(array)
['__add__', '__class__', '__contains__', '__copy__', '__deepcopy__',
 '__delattr__', '__delitem__', '__doc__', '__eq__', '__format__', '__ge__',
 '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__',
 '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__',
 '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__',
 '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append',
 'buffer_info', 'byteswap', 'count', 'extend', 'frombytes', 'fromfile',
 'fromlist', 'fromstring', 'fromunicode', 'index', 'insert', 'itemsize', 'pop',
 'remove', 'reverse', 'tobytes', 'tofile', 'tolist', 'tostring', 'tounicode',
 'typecode']
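Rather than comparing the two listings by eye, a set difference can pull out the names unique to each type. This is only a sketch for Python 3; the helper public_names is made up for the example, and the exact names printed depend on the Python version.

```python
from array import array

def public_names(obj):
    """Hypothetical helper: attribute names that do not start with an underscore."""
    return {name for name in dir(obj) if not name.startswith('_')}

only_bytearray = sorted(public_names(bytearray) - public_names(array))
only_array = sorted(public_names(array) - public_names(bytearray))

print(only_bytearray)   # string-style methods such as 'upper' and 'split'
print(only_array)       # file and layout helpers such as 'tofile' and 'byteswap'
```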

answered Sep 21 '22 19:09

John La Rooy

Python Patterns - An Optimization Anecdote is a good read which points to array.array('B') as being fast. Using the timing() function from that essay does show that array.array('B') is faster than bytearray():

#!/usr/bin/env python

from array import array
from struct import pack
from timeit import timeit
from time import clock  # Python 2 only; removed in Python 3.8 (use time.perf_counter there)

def timing(f, n, a):
    start = clock()
    for i in range(n):
        f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a)
    finish = clock()
    return '%s\t%f' % (f.__name__, finish - start)

def time_array(addr):
    return array('B', addr)

def time_bytearray(addr):
    return bytearray(addr)

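def array_tostring(addr):
    # tostring() was renamed tobytes() in Python 3.2; the old name was removed in 3.9
    return array('B', addr).tostring()

def str_bytearray(addr):
    # Python 2: str() yields the raw bytes; in Python 3 use bytes(...) instead
    return str(bytearray(addr))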

def struct_pack(addr):
    return pack('4B', *addr)

if __name__ == '__main__':
    count = 10000
    addr = '192.168.4.2'
    addr = tuple([int(i) for i in addr.split('.')])
    print('\t\ttiming\t\tfunc\t\tno func')
    print('%s\t%s\t%s' % (timing(time_array, count, addr),
          timeit('time_array((192,168,4,2))', number=count, setup='from __main__ import time_array'),
          timeit("array('B', (192,168,4,2))", number=count, setup='from array import array')))
    print('%s\t%s\t%s' % (timing(time_bytearray, count, addr),
          timeit('time_bytearray((192,168,4,2))', number=count, setup='from __main__ import time_bytearray'),
          timeit('bytearray((192,168,4,2))', number=count)))
    print('%s\t%s\t%s' % (timing(array_tostring, count, addr),
          timeit('array_tostring((192,168,4,2))', number=count, setup='from __main__ import array_tostring'),
          timeit("array('B', (192,168,4,2)).tostring()", number=count, setup='from array import array')))
    print('%s\t%s\t%s' % (timing(str_bytearray, count, addr),
          timeit('str_bytearray((192,168,4,2))', number=count, setup='from __main__ import str_bytearray'),
          timeit('str(bytearray((192,168,4,2)))', number=count)))
    print('%s\t%s\t%s' % (timing(struct_pack, count, addr),
          timeit('struct_pack((192,168,4,2))', number=count, setup='from __main__ import struct_pack'),
          timeit("pack('4B', *(192,168,4,2))", number=count, setup='from struct import pack')))

The timeit measurements actually show that array.array('B') is sometimes more than twice as fast as bytearray().
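For anyone repeating the measurement on Python 3, a stripped-down sketch of the same comparison follows. The repetition count N is an arbitrary choice, and the numbers will differ between machines and interpreter versions, so no particular ratio should be expected.

```python
from timeit import timeit

N = 100_000  # arbitrary repetition count for this sketch

# Time constructing each type from the same four-item tuple.
array_secs = timeit("array('B', (192, 168, 4, 2))",
                    setup="from array import array", number=N)
bytearray_secs = timeit("bytearray((192, 168, 4, 2))", number=N)

print("array('B'): %f s" % array_secs)
print("bytearray:  %f s" % bytearray_secs)
```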

I was interested specifically in the fastest way to pack an IP address into a four-byte string for sorting. It looks like neither str(bytearray(addr)) nor array('B', addr).tostring() comes close to the speed of pack('4B', *addr).
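The three packing approaches produce the same four bytes, which is what makes them interchangeable as sort keys. A Python 3 sketch, in which bytes(...) and tobytes() stand in for the Python 2 calls str(...) and tostring(), and the helper pack_ip and the sample addresses are invented for the example:

```python
from array import array
from struct import pack

def pack_ip(addr):
    """Hypothetical helper: pack a dotted-quad string into four bytes."""
    octets = tuple(int(part) for part in addr.split('.'))
    return pack('4B', *octets)

octets = (192, 168, 4, 2)
via_struct = pack('4B', *octets)
via_bytearray = bytes(bytearray(octets))
via_array = array('B', octets).tobytes()

# Packed keys sort numerically, unlike the dotted strings themselves.
addresses = ['192.168.4.2', '10.0.0.1', '192.168.4.10']
ordered = sorted(addresses, key=pack_ip)
print(ordered)   # ['10.0.0.1', '192.168.4.2', '192.168.4.10']
```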

answered Sep 22 '22 19:09

yds

In my test, both used almost the same amount of memory, but bytearray was about 1.5 times as fast as array when I created a large buffer to read and write.

from array import array
from time import time

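s = time()

# 256**4/8 = 536,870,912 appends, i.e. a 512 MiB buffer.

# Variant 1: array('B'). Disabled by wrapping it in a string literal;
# swap it with the bytearray variant below to time it instead.
"""
buf = array('B')
for i in xrange(256**4/8):
    buf.append(0)
"""

# Variant 2: bytearray (Python 2 syntax: xrange and the print statement)
buf = bytearray()
for i in xrange(256**4/8):
    buf.append(0)
print "init:", time() - s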
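A Python 3 sketch of the same experiment, scaled down so that it finishes quickly. The helper fill and the size N are assumptions made for this example, not part of the original test, and the timings it prints will vary by machine.

```python
import sys
from array import array
from time import perf_counter

def fill(make_buffer, n):
    """Hypothetical helper: append n zero bytes and return (buffer, seconds)."""
    buf = make_buffer()
    start = perf_counter()
    for _ in range(n):
        buf.append(0)
    return buf, perf_counter() - start

N = 1_000_000  # far smaller than 256**4 // 8, to keep the run short

byte_buf, byte_secs = fill(bytearray, N)
arr_buf, arr_secs = fill(lambda: array('B'), N)

print("bytearray:  %f s, %d bytes" % (byte_secs, sys.getsizeof(byte_buf)))
print("array('B'): %f s, %d bytes" % (arr_secs, sys.getsizeof(arr_buf)))
```

If the goal is only a zero-filled buffer, bytearray(N) creates one directly without a Python-level loop.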

answered Sep 24 '22 19:09

salmon
