zero padding numpy array

Tags: python, numpy

Suppose I have a list that contains lists of unequal length.

a = [ [ 1, 2, 3], [2], [2, 4] ]

What is the best way to obtain a zero-padded numpy array with a regular (rectangular) shape?

zero_a = [ [1, 2, 3], [2, 0, 0], [2, 4, 0] ]

I know I can use list operations like

import numpy as np
n = max(map(len, a))
for x in a:
    x.extend([0] * (n - len(x)))  # pad each inner list in place
zero_a = np.array(a)

but I was wondering: is there an easy numpy way to do this?

Xingzhong asked Nov 09 '13 16:11



2 Answers

Because numpy has to know the size of an array when it creates it, the best solution would be a numpy constructor built for this case. Sadly, as far as I know, there is none.

Probably not ideal, but a slightly faster solution is to create a numpy array of zeros and fill it with the list values.

import numpy as np

def pad_list(lst):
    # Pads the inner lists in place (this mutates lst), then converts to an array.
    inner_max_len = max(map(len, lst))
    for x in lst:
        x.extend([0] * (inner_max_len - len(x)))
    return np.array(lst)

def apply_to_zeros(lst, dtype=np.int64):
    # Allocates the zero array once, then copies each value into place.
    inner_max_len = max(map(len, lst))
    result = np.zeros([len(lst), inner_max_len], dtype)
    for i, row in enumerate(lst):
        for j, val in enumerate(row):
            result[i, j] = val
    return result

Test case:

>>> pad_list([[ 1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])

>>> apply_to_zeros([[ 1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])
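As a possible variation on apply_to_zeros, the inner per-element loop can be replaced by one slice assignment per row. This is only a sketch: the name pad_rows and the fill parameter are hypothetical, not part of the answer above, and it assumes every row is a flat sequence of numbers.

```python
import numpy as np

def pad_rows(rows, fill=0, dtype=np.int64):
    """Hypothetical helper: right-pad each row with `fill` up to the longest row."""
    width = max((len(r) for r in rows), default=0)
    out = np.full((len(rows), width), fill, dtype=dtype)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row  # copy one whole row per assignment
    return out

print(pad_rows([[1, 2, 3], [2], [2, 4]]))
```

Unlike pad_list, this leaves the input lists unmodified.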

Performance:

>>> import timeit
>>> timeit.timeit('from __main__ import pad_list as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.3937079906463623
>>> timeit.timeit('from __main__ import apply_to_zeros as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.1344289779663086
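The timings above are for a tiny input. For larger inputs, a boolean-mask fill is one way to move the element copying out of the Python loop. The sketch below is not the method timed above and has not been benchmarked here; the name pad_with_mask is hypothetical, and it assumes the outer list holds at least one row.

```python
import numpy as np

def pad_with_mask(rows, dtype=np.int64):
    """Hypothetical helper: zero-pad ragged rows without a per-element Python loop."""
    lengths = np.array([len(r) for r in rows])
    # mask[i, j] is True where row i has a value in column j
    mask = np.arange(lengths.max()) < lengths[:, None]
    out = np.zeros(mask.shape, dtype=dtype)
    # Boolean assignment fills the True cells in row-major order,
    # which matches the order of the concatenated rows.
    out[mask] = np.concatenate([np.asarray(r, dtype=dtype) for r in rows])
    return out

print(pad_with_mask([[1, 2, 3], [2], [2, 4]]))
```

Whether this beats apply_to_zeros depends on the number and length of the rows, so it is worth timing on your own data.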
alko answered Sep 24 '22 19:09


Not strictly a function from numpy, but you could do something like this:

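from itertools import zip_longest
import numpy
a = [[1, 2, 3], [4], [5, 6]]
# zip_longest pads the transposed columns; the outer zip transposes back to rows
res1 = numpy.array(list(zip(*zip_longest(*a, fillvalue=0))))

or, alternatively:

res2 = numpy.array(list(zip_longest(*a, fillvalue=0))).transpose()

This is the Python 3 form. If you use Python 2, use itertools.izip and itertools.izip_longest instead of zip and zip_longest.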

ilmarinen answered Sep 22 '22 19:09