Replacing values greater than a number in pandas dataframe

Tags: python, database, pandas

I have a large dataframe that looks like this:

    df1['A'].ix[1:3]
    2017-01-01 02:00:00    [33, 34, 39]
    2017-01-01 03:00:00    [3, 43, 9]

I want to replace each element greater than 9 with 11.

So the desired output for the above example is:

    df1['A'].ix[1:3]
    2017-01-01 02:00:00    [11, 11, 11]
    2017-01-01 03:00:00    [3, 11, 9]

Edit:

My actual dataframe has about 20,000 rows, and each row holds a list of size 2000.

Is there a way to use the numpy.minimum function on each row? I assume it would be faster than the list comprehension method.


asked May 03 '17 10:05

Zanam

2 Answers

Very simply: df[df > 9] = 11
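A minimal sketch of that boolean-mask assignment, assuming each cell holds a single number rather than a list. The frame and its column names here are made up for illustration. With list-valued cells as in the question, I would not expect the comparison to reach inside each list, so this form probably does not apply there directly.

```python
import pandas as pd

# Hypothetical frame with scalar (non-list) cells, for illustration only.
df = pd.DataFrame({"A": [33, 3, 9], "B": [4, 43, 10]})

# Boolean-mask assignment: every cell where the condition holds becomes 11.
df[df > 9] = 11

print(df)
```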


answered Sep 21 '22 08:09

Edouard Cuny

You can use apply with a list comprehension:

    df1['A'] = df1['A'].apply(lambda x: [y if y <= 9 else 11 for y in x])
    print(df1)
                                    A
    2017-01-01 02:00:00  [11, 11, 11]
    2017-01-01 03:00:00    [3, 11, 9]
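Since the question asks about using numpy on each row, here is a sketch of the same apply call with a numpy operation in place of the list comprehension. The helper name cap_row and its parameters are hypothetical, and the sample frame is rebuilt from the two rows shown in the question. Whether this beats the list comprehension on 2000-element lists is something to measure rather than assume.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the two rows shown in the question.
idx = pd.to_datetime(["2017-01-01 02:00:00", "2017-01-01 03:00:00"])
df1 = pd.DataFrame({"A": [[33, 34, 39], [3, 43, 9]]}, index=idx)

def cap_row(row, threshold=9, replacement=11):
    """Hypothetical helper: replace values above `threshold` in one list."""
    arr = np.asarray(row)
    return np.where(arr > threshold, replacement, arr).tolist()

# Each cell is processed independently, so rows need not share a length.
df1["A"] = df1["A"].apply(cap_row)
print(df1)
```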

A faster solution is to first convert to a numpy array and then use numpy.where:

    import numpy as np

    a = np.array(df1['A'].values.tolist())
    print(a)
    [[33 34 39]
     [ 3 43  9]]

    df1['A'] = np.where(a > 9, 11, a).tolist()
    print(df1)
                                    A
    2017-01-01 02:00:00  [11, 11, 11]
    2017-01-01 03:00:00    [3, 11, 9]
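On the numpy.minimum idea from the question: a small sketch comparing it with numpy.where on the same sample values. As far as I can tell, numpy.minimum caps values at the threshold itself (9 here) instead of substituting 11, so it would not give the desired output for this replacement.

```python
import numpy as np

# Same sample values as in the question, stacked into a 2-D array.
a = np.array([[33, 34, 39], [3, 43, 9]])

replaced = np.where(a > 9, 11, a)  # values above 9 become 11
clamped = np.minimum(a, 9)         # values above 9 become 9

print(replaced.tolist())  # [[11, 11, 11], [3, 11, 9]]
print(clamped.tolist())   # [[9, 9, 9], [3, 9, 9]]
```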

answered Sep 23 '22 08:09

jezrael
