Suppose I have a list containing lists of unequal length.
a = [ [ 1, 2, 3], [2], [2, 4] ]
What is the best way to obtain a zero-padded numpy array with a regular (rectangular) shape?
zero_a = [ [1, 2, 3], [2, 0, 0], [2, 4, 0] ]
I know I can use list operations like
n = max( map( len, a ) )
for x in a:
    x.extend( [0] * (n - len(x)) )
zero_a = np.array(a)
but I was wondering whether there is an easy numpy way to do this?
The numpy.pad() function is used to pad NumPy arrays. It returns a padded array with the same rank as the input array, and its shape grows according to pad_width.
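Since numpy.pad() works on one array at a time, one way to apply it to the list from the question is to pad each row separately and stack the results. This is only a sketch: the names n and zero_a are taken from the question, and it assumes every row is non-empty and holds integers.

```python
import numpy as np

a = [[1, 2, 3], [2], [2, 4]]
n = max(len(row) for row in a)

# Pad each row on the right with zeros up to length n, then stack the rows.
# pad_width (0, n - len(row)) means: nothing before, n - len(row) zeros after.
zero_a = np.stack([
    np.pad(np.asarray(row), (0, n - len(row)), mode="constant")
    for row in a
])

print(zero_a.tolist())  # [[1, 2, 3], [2, 0, 0], [2, 4, 0]]
```

This still loops over the rows in Python, so it is more of a readability choice than a speed-up.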
Since numpy has to know the size of an array before it is initialized, the best solution would be a numpy constructor for this case. Sadly, as far as I know, there is none.
Probably not ideal, but a slightly faster solution is to create a numpy array of zeros and fill it with the list values.
import numpy as np

def pad_list(lst):
    # Note: this extends the inner lists in place.
    inner_max_len = max(map(len, lst))
    for x in lst:
        x.extend([0] * (inner_max_len - len(x)))
    return np.array(lst)

def apply_to_zeros(lst, dtype=np.int64):
    # Allocate a zero-filled array, then copy each value into place.
    inner_max_len = max(map(len, lst))
    result = np.zeros([len(lst), inner_max_len], dtype)
    for i, row in enumerate(lst):
        for j, val in enumerate(row):
            result[i][j] = val
    return result
Test case:
>>> pad_list([[1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])
>>> apply_to_zeros([[1, 2, 3], [2], [2, 4]])
array([[1, 2, 3],
       [2, 0, 0],
       [2, 4, 0]])
Performance:
>>> timeit.timeit('from __main__ import pad_list as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.3937079906463623
>>> timeit.timeit('from __main__ import apply_to_zeros as f; f([[ 1, 2, 3], [2], [2, 4]])', number = 10000)
0.1344289779663086
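The fill-into-zeros idea can also be written without a Python loop over individual elements, by building a boolean mask of the valid positions and assigning all values at once. This is a sketch only: pad_with_mask is a hypothetical name, it assumes the outer list is non-empty, and it has not been timed against the two functions above.

```python
import numpy as np

def pad_with_mask(lst, dtype=np.int64):
    # Length of each inner list, as a column vector for broadcasting.
    lengths = np.array([len(row) for row in lst])
    # mask[i, j] is True where row i has a value at column j.
    mask = np.arange(lengths.max()) < lengths[:, None]
    result = np.zeros(mask.shape, dtype=dtype)
    # Boolean assignment fills the True positions in row-major order,
    # which matches the order of the concatenated rows.
    result[mask] = np.concatenate([np.asarray(row, dtype=dtype) for row in lst])
    return result

print(pad_with_mask([[1, 2, 3], [2], [2, 4]]).tolist())
# [[1, 2, 3], [2, 0, 0], [2, 4, 0]]
```

Unlike pad_list, this does not modify the input lists.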
Not strictly a function from numpy, but you could do something like this (Python 2):
from itertools import izip, izip_longest
import numpy
a=[[1,2,3], [4], [5,6]]
res1 = numpy.array(list(izip(*izip_longest(*a, fillvalue=0))))
or, alternatively:
res2=numpy.array(list(izip_longest(*a, fillvalue=0))).transpose()
If you use Python 3, use zip and itertools.zip_longest instead.
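For reference, a Python 3 version of the second variant might look like the sketch below. The variable name res is an assumption; the list a is the one from this answer.

```python
from itertools import zip_longest

import numpy as np

a = [[1, 2, 3], [4], [5, 6]]

# zip_longest(*a) yields the columns, filling missing entries with 0;
# transposing turns those columns back into the padded rows.
res = np.array(list(zip_longest(*a, fillvalue=0))).T

print(res.tolist())  # [[1, 2, 3], [4, 0, 0], [5, 6, 0]]
```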