Replace NaN with empty list in a pandas dataframe

I'm trying to replace some NaN values in my data with an empty list []. However, the list ends up represented as a str, so len() does not return what I expect. Is there any way to replace a NaN value with an actual empty list in pandas?

    In [28]: d = pd.DataFrame({'x' : [[1,2,3], [1,2], np.NaN, np.NaN], 'y' : [1,2,3,4]})

    In [29]: d
    Out[29]:
               x  y
    0  [1, 2, 3]  1
    1     [1, 2]  2
    2        NaN  3
    3        NaN  4

    In [32]: d.x.replace(np.NaN, '[]', inplace=True)

    In [33]: d
    Out[33]:
               x  y
    0  [1, 2, 3]  1
    1     [1, 2]  2
    2         []  3
    3         []  4

    In [34]: d.x.apply(len)
    Out[34]:
    0    3
    1    2
    2    2
    3    2
    Name: x, dtype: int64

442

asked Jul 22 '15 15:07

moku

1 Answer

This works using isnull and loc to mask the series:

    In [90]: d.loc[d.isnull()] = d.loc[d.isnull()].apply(lambda x: [])
        ...: d
    Out[90]:
    0    [1, 2, 3]
    1       [1, 2]
    2           []
    3           []
    dtype: object

    In [91]: d.apply(len)
    Out[91]:
    0    3
    1    2
    2    0
    3    0
    dtype: int64

You have to do this using apply so that the list object is not interpreted as an array to assign back to the df. If it were, pandas would try to align its shape with the original series.

EDIT

Using your updated sample the following works:

    In [100]: d.loc[d['x'].isnull(),['x']] = d.loc[d['x'].isnull(),'x'].apply(lambda x: [])
         ...: d
    Out[100]:
               x  y
    0  [1, 2, 3]  1
    1     [1, 2]  2
    2         []  3
    3         []  4

    In [102]: d['x'].apply(len)
    Out[102]:
    0    3
    1    2
    2    0
    3    0
    Name: x, dtype: int64
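For comparison, here is a minimal standalone sketch of a related variant that skips the mask and runs apply over the whole column. It is not the approach shown above, and it assumes every non-missing entry in the column is already a list. It uses np.nan rather than np.NaN, since the np.NaN alias was removed in NumPy 2.0. The name lengths is only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question (np.nan instead of the removed np.NaN alias).
d = pd.DataFrame({"x": [[1, 2, 3], [1, 2], np.nan, np.nan], "y": [1, 2, 3, 4]})

# Assumption: anything in the column that is not a list is a missing value.
# The lambda builds a fresh [] on every call, so the rows do not share one list object.
d["x"] = d["x"].apply(lambda v: v if isinstance(v, list) else [])

# len() now sees real lists rather than the two-character string '[]'.
lengths = d["x"].apply(len)
print(lengths.tolist())
```

Because each missing row gets its own list, appending to one of them later does not change the others.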

172

answered Oct 18 '22 15:10

EdChum
