
Meaning of '\0\0' in Python?

I'm looking at a 3rd party API and they have the following piece of code:

def array_u16 (n): return array('H', '\0\0'*n)

I understand that '\0' means NULL. Does '\0\0' have any special meaning, or does it just mean two NULLs?

flashburn asked Nov 07 '16 18:11




2 Answers

The array class accepts a format character (called a typecode) followed by an initializer. 'H' means an unsigned short, which has a minimum size of 2 bytes, so '\0\0' supplies exactly one item's worth of data. The * n part repeats those two bytes so that the entire array is initialized with n zero-valued items.
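As a rough illustration of the typecode and item size described above, here is a Python 3 rendering of the function from the question. It is only a sketch: the bytes literal is my adaptation for Python 3, and it assumes 'H' is 2 bytes wide, which is the documented minimum and the usual case.

```python
from array import array

def array_u16(n):
    """Return an array of n zeroed unsigned shorts (Python 3 spelling)."""
    # Two zero bytes make up one 'H' item, assuming 'H' is 2 bytes wide.
    return array('H', b'\0\0' * n)

a = array_u16(4)
# typecode is 'H'; itemsize is the number of bytes each item occupies.
print(a.typecode, a.itemsize, len(a))
```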

AndyG answered Sep 22 '22 20:09


It just ensures that two bytes are provided n times, so the length of the array equals n. If '\0' were provided instead, the resulting array would have a length of n // 2 (because the type-code 'H' requires 2 bytes per item), which is obviously counterintuitive:

>>> from array import array
>>> array('H', '\0' * 10)    # 10 bytes -> 5 elements
array('H', [0, 0, 0, 0, 0])
>>> array('H', '\0\0' * 10)  # 20 bytes -> 10 elements
array('H', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Note that, in Python 3, if you need the same snippet to work you must provide a bytes object as the initializer argument to array:

>>> array('H', b'\0\0' * 10)   
array('H', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

This mirrors Python 2, where you can't provide a u'' string as the initializer either. Other than that, the behavior stays exactly the same.
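To make the byte-count arithmetic concrete in Python 3, the following sketch checks how many items come out of a given number of bytes, and what happens with initializers that do not fit. The variable names are mine, and the error list in the final comment assumes 'H' is 2 bytes wide.

```python
from array import array

item = array('H').itemsize          # bytes per 'H' item (2 on common builds)

half = array('H', b'\0' * 10)       # 10 bytes of data
full = array('H', b'\0\0' * 10)     # 20 bytes of data
print(len(half), len(full))         # lengths are byte count // item

# Initializers that cannot be split into whole items are rejected:
# a byte count that is not a multiple of the item size, and a str.
errors = []
for bad in (b'\0' * 3, '\0\0'):
    try:
        array('H', bad)
    except (ValueError, TypeError) as exc:
        errors.append(type(exc).__name__)
print(errors)  # ['ValueError', 'TypeError'] when 'H' is 2 bytes wide
```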

So '\0\0' is used purely for convenience; no special semantics are attached to it.

No semantics are really attached to '\0' either (unlike in C, for example); '\0' is just another string in Python.


As a further example of this behavior, take the initialization of an array with a type-code of 'I' for unsigned ints, which have a documented minimum size of 2 bytes but occupy 4 bytes on 64-bit builds of Python.

In the spirit of the snippet you've provided, you'd initialize the array by doing something like this:

>>> array('I', b'\0\0\0\0' * 10)
array('I', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Yes, the b'\0' byte has to be repeated four times per item to get 10 elements.
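Since the number of zero bytes per item depends on the type-code (and, in principle, on the build), one way to avoid hard-coding it is to ask the array module for the item size. The helper below is hypothetical (the name zeros is mine, not part of the array module) and is only meant as a sketch of that idea.

```python
from array import array

def zeros(typecode, n):
    """Hypothetical helper: n zero-valued items of the given typecode."""
    # An empty array still reports how many bytes one item occupies.
    itemsize = array(typecode).itemsize
    # Supply exactly itemsize zero bytes per item.
    return array(typecode, b'\0' * (itemsize * n))

print(zeros('H', 3))
print(zeros('I', 10))
```

All-zero bytes also decode to 0.0 for the floating-point type-codes, so the same helper works for 'f' and 'd'.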


As a final note (the following timings were taken on Python 3, but Python 2 behaves the same): you might be wondering why the author used '\0\0' * n instead of the more intuitive-looking [0] * n to initialize the array. Well, it's considerably faster:

from array import array
n = 10000
%timeit array('I', [0]*n)
1000 loops, best of 3: 212 µs per loop

%timeit array('I', b'\0\0\0\0'* n)
100000 loops, best of 3: 6.36 µs per loop

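Of course, you can do better (for type-codes other than 'b') by feeding a bytearray to array. One way to initialize a bytearray is by providing an int, which creates that many null bytes. Because the initializer is read as raw bytes, bytearray(n) produces only n // itemsize items, so the timing below builds a smaller array than the two above; use bytearray(n * array('I').itemsize) to get n items: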

%timeit array('I', bytearray(n))
1000000 loops, best of 3: 1.72 µs per loop

but, if I remember correctly, the bytearray(int) way of initializing a bytearray might get deprecated in 3.7+ :-).
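For anyone who wants to reproduce the comparison outside IPython, here is a sketch using the standard timeit module. It sizes every initializer so that all three approaches build the same n-item array; the dictionary and variable names are mine, and the absolute numbers will differ from machine to machine.

```python
from array import array
from timeit import timeit

n = 10000
size = array('I').itemsize  # bytes per 'I' item on this build

# Three ways of building an n-item zeroed array of unsigned ints.
candidates = {
    'list': lambda: array('I', [0] * n),
    'bytes': lambda: array('I', b'\0' * (size * n)),
    'bytearray': lambda: array('I', bytearray(size * n)),
}

# Sanity check: every approach yields n items.
results = {name: make() for name, make in candidates.items()}
assert all(len(a) == n for a in results.values())

runs = 1000
for name, make in candidates.items():
    seconds = timeit(make, number=runs)
    print(f'{name:>9}: {seconds / runs * 1e6:.2f} µs per call')
```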

Dimitris Fasarakis Hilliard answered Sep 24 '22 20:09