How to create a fixed-size list in Python?

Tags:

python

list

In C++, I can create an array like this:

    int* a = new int[10];

In Python, I only know that I can declare a list and then append items to it, or write something like:

    l = [1,2,3,4]
    l = range(10)

Can I initialize a list with a given size, as in C++, without assigning any values?

406

asked May 16 '12 10:05

wtm

2 Answers

(tl;dr: The exact answer to your question is numpy.empty or numpy.empty_like, but you likely don't care and can get away with using myList = [None]*10000.)

Simple methods

You can initialize your list so that every element is the same value. Whether it makes more sense to use a non-numeric placeholder such as None (which raises an error later if you accidentally use it, which is a good thing) or something like 0 (unusual, but perhaps useful if you're writing a sparse matrix, or if the default value should be 0 and you're not worried about bugs) is up to you:

    >>> [None for _ in range(10)]
    [None, None, None, None, None, None, None, None, None, None]

(Here _ is just a variable name; you could have used i.)

You can also do so like this:

    >>> [None]*10
    [None, None, None, None, None, None, None, None, None, None]
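One caveat with the multiplication idiom, shown here as a minimal sketch (the variable names shared and independent are only illustrative): multiplying a list repeats references to the same object. That is harmless for None or numbers, but it can be surprising when the element is mutable, such as a nested list.

```python
# [obj] * n copies the reference n times; it does not copy the object.
shared = [[]] * 3            # three references to ONE inner list
shared[0].append(1)          # the change is visible through every slot

# A comprehension evaluates the expression once per slot instead.
independent = [[] for _ in range(3)]
independent[0].append(1)

print(shared)       # [[1], [1], [1]]
print(independent)  # [[1], [], []]
```

So for placeholders like None or 0 either form works, while for mutable defaults the comprehension is usually the safer choice.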

You probably don't need to optimize this. You can also append to the list each time you need to:

    >>> x = []
    >>> for i in range(10):
    ...     x.append(i)
    ...

Performance comparison of simple methods

Which is best?

    >>> def initAndWrite_test():
    ...     x = [None]*10000
    ...     for i in range(10000):
    ...         x[i] = i
    ...
    >>> def initAndWrite2_test():
    ...     x = [None for _ in range(10000)]
    ...     for i in range(10000):
    ...         x[i] = i
    ...
    >>> def appendWrite_test():
    ...     x = []
    ...     for i in range(10000):
    ...         x.append(i)
    ...

Results in Python 2.7:

    >>> import timeit
    >>> for f in [initAndWrite_test, initAndWrite2_test, appendWrite_test]:
    ...     print('{} takes {} usec/loop'.format(f.__name__, timeit.timeit(f, number=1000)*1000))
    ...
    initAndWrite_test takes 714.596033096 usec/loop
    initAndWrite2_test takes 981.526136398 usec/loop
    appendWrite_test takes 908.597946167 usec/loop

Results in Python 3.2:

    initAndWrite_test takes 641.3581371307373 usec/loop
    initAndWrite2_test takes 1033.6499214172363 usec/loop
    appendWrite_test takes 895.9040641784668 usec/loop

As we can see, the [None]*10000 idiom is likely the fastest in both Python 2 and Python 3. However, if you are doing anything more complicated than plain assignment (such as any nontrivial work to generate or process each element of the list), the initialization overhead becomes a meaninglessly small fraction of the total cost. In other words, this optimization is premature to worry about if you're doing anything reasonable with the elements of your list.

Uninitialized memory

However, all of these are inefficient in one respect: they walk through the memory and write something to every slot. C is different: an uninitialized array simply holds whatever garbage was already in that memory (sidenote: the memory may have been recycled from elsewhere on the system, which can be a security risk if you fail to mlock it and/or fail to wipe it before closing the program). This is a deliberate design choice made for speed: the makers of the C language thought it was better not to initialize memory automatically, and that was the correct choice.

This is not an asymptotic speedup (initialization is O(N), and so is filling the array afterwards), but it means you don't have to initialize the entire memory block before overwriting it with the values you actually care about. If it were possible in Python, the equivalent would be something like (pseudo-code) x = list(size=10000).

If you want something similar in Python, you can use the numpy package for numerical matrix and N-dimensional array manipulation. Specifically, use numpy.empty or numpy.empty_like.

That is the real answer to your question.
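As a rough sketch of what that looks like in practice (this assumes numpy is installed; the names a and b and the int64 dtype are only illustrative choices), numpy.empty allocates the block without writing initial values, so every slot should be assigned before it is read:

```python
import numpy as np

# Allocate room for 10 integers without writing initial values.
# The contents are arbitrary until assigned, so never read before writing.
a = np.empty(10, dtype=np.int64)

# Fill every slot before using the array.
for i in range(a.size):
    a[i] = i * i

# empty_like allocates an uninitialized array with the same shape and dtype.
b = np.empty_like(a)
b[:] = a[::-1]

print(a.tolist())
print(b.tolist())
```

Unlike a list, the resulting array also has a fixed length: there is no append method that grows it in place.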

126

answered Oct 03 '22 10:10

ninjagecko

You can use this: [None] * 10. But this won't be "fixed size": you can still append, remove, and so on. That is how lists work.

You could make it a tuple (tuple([None] * 10)) to fix its length, but then you won't be able to replace its items at all (you can only modify the stored items in place, and only if they are mutable).
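A small sketch of that distinction (the names fixed, boxes and error_message are made up for illustration): assigning to a tuple slot raises a TypeError, while a mutable object stored inside a tuple can still be changed in place.

```python
fixed = tuple([None] * 3)

# Rebinding a slot is not allowed on a tuple.
try:
    fixed[0] = 1
except TypeError as exc:
    error_message = str(exc)

# Mutable items stored in a tuple can still be modified in place.
boxes = ([], [], [])
boxes[0].append("x")

print(len(fixed))  # 3
print(boxes)       # (['x'], [], [])
```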

Another option, closer to your requirement, is not a list but a collections.deque with a maximum length. That length is only an upper bound, though; the deque can be smaller.

    import collections
    max_4_items = collections.deque([None] * 4, maxlen=4)
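To illustrate how the bound behaves, here is a minimal sketch (the name window is hypothetical): appending to a full deque discards an item from the opposite end rather than raising an error, and removing items makes the deque shorter than maxlen.

```python
import collections

window = collections.deque([None] * 4, maxlen=4)

# Appending to a full deque silently drops an item from the opposite end.
for value in range(6):
    window.append(value)

print(list(window))   # [2, 3, 4, 5]
print(window.maxlen)  # 4

# The bound is only an upper limit: removing items makes the deque shorter.
window.popleft()
print(len(window))    # 3
```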

But, just use a list, and get used to the "pythonic" way of doing things.

answered Oct 03 '22 09:10

jadkik94
