I'm looking at a 3rd party API and they have the following piece of code:
def array_u16 (n): return array('H', '\0\0'*n)
I understand that '\0' means NULL. Does '\0\0' have any special meaning, or does it just mean two NULLs?
X = [0] * N produces a list of size N, with all N elements set to zero. For example, X = [0] * 8 produces a list of eight zeros.
In Python strings, the backslash "\" is a special character, also called the "escape" character. It is used in representing certain whitespace characters: "\t" is a tab, "\n" is a newline, and "\r" is a carriage return.
The array class accepts a format character (called a typecode) followed by an initializer. 'H' means an unsigned short, with a minimum size of 2 bytes, so '\0\0' satisfies that. The * n part initializes the entire array to NULL bytes.
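To check what the typecode means on your own build, here is a minimal Python 3 sketch. The variable names are mine, not the API's, and it uses a bytes initializer because Python 3 requires one:

```python
from array import array

# 'H' is the typecode for unsigned short; itemsize reports its width in bytes
# on this particular build (the array docs only promise a minimum of 2).
width = array('H').itemsize

# A bytes initializer is read as raw machine values, `width` bytes per element,
# so three elements' worth of null bytes gives three zeros.
zeros = array('H', b'\0' * (width * 3))

print(width, zeros)
```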
It just ensures that two bytes are provided n times, so the size of the array will be equal to n. If '\0' were provided instead, the resulting array would have a size of n//2 (because the type-code 'H' requires 2 bytes per element), which is obviously counterintuitive:
>>> array('H', '\0' * 10) # 5 elements
array('H', [0, 0, 0, 0, 0])
>>> array('H', '\0\0' * 10) # 10 elements
array('H', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Note that in Python 3, if you need the same snippet to work, you must provide a bytes object as the initializer argument to array:
>>> array('H', b'\0\0' * 10)
array('H', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
A plain str is rejected there, just as you can't provide a u'' string in Python 2. Other than that, the behavior stays exactly the same.
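Here is a small Python 3 sketch of both points, assuming nothing beyond the standard array module. The names half, full and str_rejected are just for illustration:

```python
from array import array

n = 10
itemsize = array('H').itemsize  # 2 on typical builds

# One null byte per repetition: n bytes in total, so only n // itemsize elements.
half = array('H', b'\0' * n)

# Two null bytes per repetition: one element per repetition when itemsize == 2.
full = array('H', b'\0\0' * n)

# A str initializer is not accepted for a numeric typecode in Python 3.
try:
    array('H', '\0\0' * n)
    str_rejected = False
except TypeError:
    str_rejected = True

print(len(half), len(full), str_rejected)
```

On a build where itemsize is 2, the two lengths come out as n // 2 and n respectively.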
So '\0\0' is for convenience reasons, nothing more. No semantics are attached to '\0\0'.
No semantics are really attached to '\0' either (as there are in, for example, C); '\0' is just another string in Python.
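A quick way to convince yourself that the null character is an ordinary character in a Python string (the sample string is made up):

```python
s = 'ab\0cd'

# The null character does not terminate the string; it counts like any other.
length = len(s)          # 5

# It can be used as an ordinary separator.
parts = s.split('\0')    # ['ab', 'cd']

# Its code point is simply 0.
code = ord('\0')         # 0

print(length, parts, code)
```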
As a further example of this behavior, take the initialization of an array with a type-code of 'I', for unsigned ints, which have a minimum size of 2 bytes but take 4 bytes on 64-bit builds of Python.
In the spirit of the snippet you've provided, you'd initialize the array by doing something like this:
>>> array('I', b'\0\0\0\0' * 10)
array('I', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Yes, four times the b'\0' string to get 10 elements.
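If you'd rather not hard-code the number of null bytes per element, one option is to ask the array module for the item size. This is only a sketch; zeroed_array is a hypothetical helper name of mine, not part of any library:

```python
from array import array

def zeroed_array(typecode, n):
    """Hypothetical helper: return an array of n zero-valued elements."""
    # itemsize is the per-element width in bytes on this build.
    itemsize = array(typecode).itemsize
    # Supply exactly itemsize null bytes for each of the n elements.
    return array(typecode, b'\0' * (itemsize * n))

a = zeroed_array('I', 10)
b = zeroed_array('H', 10)
print(len(a), len(b))
```

This gives n elements regardless of whether 'I' happens to be 2 or 4 bytes wide on the machine running it.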
As a final note (the following timings were performed on Python 3, but Python 2 behaves the same): you might be wondering why the author used '\0\0' * n instead of the more intuitive-looking [0] * n to initialize the array. Well, it's considerably faster:
n = 10000
%timeit array('I', [0]*n)
1000 loops, best of 3: 212 µs per loop
%timeit array('I', b'\0\0\0\0'* n)
100000 loops, best of 3: 6.36 µs per loop
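If you don't have IPython's %timeit at hand, the comparison can be reproduced with the standard timeit module. This is a rough sketch; the repetition count is an arbitrary choice of mine, and the absolute numbers will differ from machine to machine:

```python
import timeit
from array import array

n = 10000
repeats = 200  # arbitrary; raise it for steadier numbers
itemsize = array('I').itemsize

# Total seconds for `repeats` constructions from a list of Python ints.
from_list = timeit.timeit(lambda: array('I', [0] * n), number=repeats)

# Total seconds for `repeats` constructions from a block of null bytes.
from_bytes = timeit.timeit(
    lambda: array('I', b'\0' * (itemsize * n)), number=repeats)

print('list initializer:  %.2f us per call' % (from_list / repeats * 1e6))
print('bytes initializer: %.2f us per call' % (from_bytes / repeats * 1e6))
```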
Of course, you can do better (for type-codes other than 'b') by feeding a bytearray to array. One way to initialize a bytearray is by providing an int as the number of items to initialize with null bytes:
%timeit array('I', bytearray(n))
1000000 loops, best of 3: 1.72 µs per loop
but, if I remember correctly, the bytearray(int) way of initializing a bytearray might get deprecated in 3.7+ :-). (As far as I can tell, that deprecation has not actually happened in any release so far.)
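One thing to watch with the bytearray approach: bytearray(n) holds n bytes, not n elements, so the array built from it has n // itemsize elements. A small sketch of the difference (the names by_bytes and by_items are mine):

```python
from array import array

n = 10000
itemsize = array('I').itemsize

# n null bytes: the array gets n // itemsize elements.
by_bytes = array('I', bytearray(n))

# n * itemsize null bytes: the array gets n elements.
by_items = array('I', bytearray(n * itemsize))

print(len(by_bytes), len(by_items))
```

So for a like-for-like comparison with the [0] * n version, the second form is the one to time.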