zero padding numpy array

Tags: python, numpy

Suppose I have a list that contains lists of unequal length.

a = [ [ 1, 2, 3], [2], [2, 4] ]

What is the best way to obtain a zero-padded numpy array with a regular shape?

zero_a = [ [1, 2, 3], [2, 0, 0], [2, 4, 0] ]

I know I can use list operations like

import numpy as np
n = max(map(len, a))
for x in a:  # pad each inner list in place
    x.extend([0] * (n - len(x)))
zero_a = np.array(a)

but I was wondering whether there is an easy numpy way to do this.


asked Nov 09 '13 16:11

Xingzhong

2 Answers

Because numpy has to know the size of an array before it is initialized, the best solution would be a numpy constructor for this case. Sadly, as far as I know, there is none.

A probably not ideal but faster solution is to create a numpy array of zeros and fill it with the list values.

import numpy as np
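def pad_list(lst):
    # Pads the inner lists in place, then converts the result to an array.
    inner_max_len = max(map(len, lst))
    for row in lst:  # explicit loop, because map() is lazy on Python 3
        row.extend([0] * (inner_max_len - len(row)))
    return np.array(lst)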

def apply_to_zeros(lst, dtype=np.int64):
    inner_max_len = max(map(len, lst))
    result = np.zeros([len(lst), inner_max_len], dtype)
    for i, row in enumerate(lst):
        for j, val in enumerate(row):
            result[i][j] = val
    return result

Test case:

>>> pad_list([[ 1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])

>>> apply_to_zeros([[ 1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])

Performance:

>>> timeit.timeit('from __main__ import pad_list as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.3937079906463623
>>> timeit.timeit('from __main__ import apply_to_zeros as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.1344289779663086
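
For larger inputs, the zero-and-fill idea can also be written without the inner Python loop. The following is only a sketch and not part of the answer above: `pad_with_mask` and its `fill` parameter are hypothetical names, and it assumes the input is non-empty and holds values that fit the chosen `dtype`. It builds a boolean mask marking the cells that hold real data and assigns all values in one step.

```python
import numpy as np

def pad_with_mask(lst, fill=0, dtype=np.int64):
    # Hypothetical helper for illustration; assumes lst is non-empty.
    lengths = np.array([len(row) for row in lst])
    # mask[i, j] is True where row i has a value in column j.
    mask = np.arange(lengths.max()) < lengths[:, None]
    result = np.full(mask.shape, fill, dtype=dtype)
    # Boolean assignment fills the True cells in row-major order,
    # which matches the order of the concatenated input rows.
    result[mask] = np.concatenate([np.asarray(row, dtype=dtype) for row in lst])
    return result

padded = pad_with_mask([[1, 2, 3], [2], [2, 4]])
print(padded)
```

Whether this beats `apply_to_zeros` depends on the data; for a tiny list like the one timed above, the setup cost may well dominate, so it is worth measuring on realistic input.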


answered Sep 24 '22 19:09

alko

This is not strictly a numpy function, but you could do something like this (Python 2):

from itertools import izip, izip_longest
import numpy
a=[[1,2,3], [4], [5,6]]
res1 = numpy.array(list(izip(*izip_longest(*a, fillvalue=0))))

or, alternatively:

res2=numpy.array(list(izip_longest(*a, fillvalue=0))).transpose()

If you use Python 3, use the built-in zip and itertools.zip_longest instead of izip and izip_longest.
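
As a sketch of that Python 3 variant (same input as above; the name `res` is chosen only for illustration), the transpose form might look like this:

```python
from itertools import zip_longest

import numpy as np

a = [[1, 2, 3], [4], [5, 6]]
# zip_longest walks the lists column by column and pads short ones with 0;
# transposing turns the columns back into one row per input list.
res = np.array(list(zip_longest(*a, fillvalue=0))).T
print(res)
```

Unlike the in-place padding approach, this leaves the original list `a` unchanged.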

answered Sep 22 '22 19:09

ilmarinen
