Numpy: Fix array with rows of different lengths by filling the empty elements with zeros

The functionality I am looking for looks something like this:

    data = np.array([[1, 2, 3, 4],
                     [2, 3, 1],
                     [5, 5, 5, 5],
                     [1, 1]])

    result = fix(data)
    print result

    [[ 1.  2.  3.  4.]
     [ 2.  3.  1.  0.]
     [ 5.  5.  5.  5.]
     [ 1.  1.  0.  0.]]

The arrays I'm working with are very large, so I would appreciate the most efficient solution.

Edit: The data is read in from disk as a Python list of lists.

asked Aug 16 '15 17:08

user2909415

2 Answers

This could be one approach -

    import numpy as np

    def numpy_fillna(data):
        # Get lengths of each row of data
        lens = np.array([len(i) for i in data])

        # Mask of valid places in each row
        mask = np.arange(lens.max()) < lens[:, None]

        # Setup output array and put elements from data into masked positions
        out = np.zeros(mask.shape, dtype=data.dtype)
        out[mask] = np.concatenate(data)
        return out
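To see what the mask step in the function above does, here is a small sketch that builds the mask for the row lengths of the question's example. The variable `cols` is introduced only for illustration.

```python
import numpy as np

# Row lengths of [[1, 2, 3, 4], [2, 3, 1], [5, 5, 5, 5], [1, 1]]
lens = np.array([4, 3, 4, 2])

# Column indices 0 .. max_len - 1, shape (4,)
cols = np.arange(lens.max())

# lens[:, None] has shape (4, 1), so the comparison broadcasts to (4, 4).
# mask[i, j] is True exactly when row i has an element in column j.
mask = cols < lens[:, None]

# The number of True entries equals the total number of input elements,
# which is why the flattened data fits into out[mask].
assert mask.sum() == lens.sum()
print(mask)
```

Boolean-mask assignment fills the True positions in row-major order, which is the same order in which `np.concatenate(data)` lays out the elements, so each value lands in its original row.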

Sample input, output -

    In [222]: # Input object dtype array
         ...: data = np.array([[1, 2, 3, 4],
         ...:                  [2, 3, 1],
         ...:                  [5, 5, 5, 5, 8, 9, 5],
         ...:                  [1, 1]], dtype=object)

    In [223]: numpy_fillna(data)
    Out[223]:
    array([[1, 2, 3, 4, 0, 0, 0],
           [2, 3, 1, 0, 0, 0, 0],
           [5, 5, 5, 5, 8, 9, 5],
           [1, 1, 0, 0, 0, 0, 0]], dtype=object)
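The question's edit says the data arrives as a plain Python list of lists, which has no `.dtype` attribute. A minimal variant of the same masking idea for that case might look like the sketch below. The name `pad_ragged` and the `fill_value` and `dtype` parameters are illustrative assumptions, not part of the answer above.

```python
import numpy as np

def pad_ragged(rows, fill_value=0, dtype=float):
    """Pad a list of lists into a rectangular array (hypothetical helper)."""
    lens = np.array([len(r) for r in rows])
    if lens.size == 0:
        # No rows at all: return an empty 2-D array
        return np.empty((0, 0), dtype=dtype)

    # True where a row actually has an element in that column
    mask = np.arange(lens.max()) < lens[:, None]

    # Start from an array filled with fill_value, then drop the data in
    out = np.full(mask.shape, fill_value, dtype=dtype)
    out[mask] = np.concatenate([np.asarray(r, dtype=dtype) for r in rows])
    return out

rows = [[1, 2, 3, 4], [2, 3, 1], [5, 5, 5, 5], [1, 1]]
result = pad_ragged(rows)
print(result)
```

Passing the element type explicitly gives a numeric result (float by default, as in the question's desired output) rather than an object array.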

answered Sep 29 '22 06:09

Divakar

You could use pandas instead of numpy:

    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame([[1, 2, 3, 4],
       ...:                    [2, 3, 1],
       ...:                    [5, 5, 5, 5],
       ...:                    [1, 1]], dtype=float)

    In [3]: df.fillna(0.0).values
    Out[3]:
    array([[ 1.,  2.,  3.,  4.],
           [ 2.,  3.,  1.,  0.],
           [ 5.,  5.,  5.,  5.],
           [ 1.,  1.,  0.,  0.]])
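A related sketch, not taken from this answer: if adding pandas as a dependency is not desirable, the same padding can be done with `itertools.zip_longest` from the standard library plus NumPy. It is written for Python 3; in Python 2.7 the function is named `izip_longest`. The variable names are illustrative.

```python
from itertools import zip_longest

import numpy as np

rows = [[1, 2, 3, 4], [2, 3, 1], [5, 5, 5, 5], [1, 1]]

# zip_longest walks the rows column by column and substitutes fillvalue
# wherever a row has run out, so the result comes out transposed.
columns = list(zip_longest(*rows, fillvalue=0))

# Transpose back so that each input row is a row of the output.
padded = np.array(columns, dtype=float).T
print(padded)
```

This loops in pure Python, so it may well be slower than the mask-based approach on very large inputs; it is worth timing on real data before choosing.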

answered Sep 29 '22 08:09

Eastsun

Tags: performance, python, arrays, numpy, python-2.7
