Why is pandas.apply() executing on null elements?

Tags:

python

pandas

Supposedly, the pandas.apply() function does not apply to null elements. However, that is not what happens in the following code. Why is this happening?

import pandas as pd
df = pd.Series([[1,2],[2,3,4,5],None])
df
0          [1, 2]
1    [2, 3, 4, 5]
2            None
dtype: object
df.apply(lambda x: len(x))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\series.py", line 2169, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\src\inference.pyx", line 1059, in pandas.lib.map_infer (pandas\lib.c:62578)
  File "<stdin>", line 1, in <lambda>
TypeError: object of type 'NoneType' has no len()

391

asked Jan 03 '16 08:01

Alex

1 Answer

pandas treats None and NaN as equivalent markers for a missing value, so there is no point in replacing None with numpy.nan. apply calls the function on every element, NaN elements included.

import numpy
df[2] = numpy.nan
df.apply(lambda x: print(x))

Output: [1, 2]
        [2, 3, 4, 5]
        nan

You have to either check for a missing value inside the function you apply, or drop the missing values first with Series.dropna() and apply the function to the result:

df.dropna().apply(lambda x: print(x))

Alternatively, filter with Series.notnull(), which returns a boolean Series:

df[df.notnull()].apply(lambda x: print(x))
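
The other option mentioned above, checking for a missing value inside the applied function, might look like the sketch below. The helper name safe_len is hypothetical, and returning NaN for missing entries is only one possible choice. The last step uses Series.map with na_action="ignore", which is not part of the original answer but appears to give the "skip nulls" behaviour the question expected from apply.

```python
import numpy as np
import pandas as pd

s = pd.Series([[1, 2], [2, 3, 4, 5], None])

def safe_len(x):
    # Hypothetical helper: only lists are measured; anything else
    # (None or NaN in this Series) is reported as a missing value.
    if isinstance(x, list):
        return len(x)
    return np.nan

# apply still calls safe_len on the None element, but the helper handles it.
lengths = s.apply(safe_len)

# Not from the original answer: map can skip missing values by itself.
mapped = s.map(len, na_action="ignore")

print(lengths)
print(mapped)
```

The isinstance check is used instead of pd.isna(x) because pd.isna on a list returns an element-wise array, which cannot be used directly as an if condition.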

Please also read: http://pandas.pydata.org/pandas-docs/stable/missing_data.html

And specifically, this:

Warning:

One has to be mindful that in python (and numpy), the nan's don’t compare equal, but None's do. Note that Pandas/numpy uses the fact that np.nan != np.nan, and treats None like np.nan.
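
The comparison behaviour described in that warning can be checked with a short sketch. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# NaN never compares equal to itself, while None does.
nan_equal = np.nan == np.nan
none_equal = None == None  # noqa: E711 (deliberate equality comparison)

# pandas nevertheless reports both as missing values.
s = pd.Series([[1, 2], None, np.nan], dtype=object)
missing = s.isna().tolist()

print(nan_equal)   # False
print(none_equal)  # True
print(missing)     # [False, True, True]
```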

171

answered Oct 16 '22 21:10

kliron
