Create a new DataFrame by selecting specific rows from an existing DataFrame in Python

Tags: python, pandas

I have a table in my pandas DataFrame, df:

id count price
1    2     100
2    7      25
3    3     720
4    7     221
5    8     212
6    2     200

I want to create a new DataFrame (df2) from it, selecting the rows where count is 2 and price is 100, and the rows where count is 7 and price is 221.

My output should be df2 =

id count price
1    2     100
4    7     221

I am trying df[df['count'] == '2' & df['price'] == '100']

but I get this error:

TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]

asked Nov 30 '16 10:11

Shubham R

1 Answer

You need to add parentheses, (), because & has higher precedence than ==:

# Wrap each comparison in parentheses so it is evaluated before &
df3 = df[(df['count'] == '2') & (df['price'] == '100')]
print(df3)
  id count price
0  1     2   100
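
To see why the parentheses matter, here is a minimal self-contained sketch. It rebuilds the question's table with string values, which is an assumption based on the quoted comparisons against '2' and '100'. The exact exception type and message for the unparenthesized version may vary with the pandas version, so the sketch only reports that it fails.

```python
import pandas as pd

# Sample data from the question, stored as strings (assumption: the
# original code compares against '2' and '100').
df = pd.DataFrame({
    "id": ["1", "2", "3", "4", "5", "6"],
    "count": ["2", "7", "3", "7", "8", "2"],
    "price": ["100", "25", "720", "221", "212", "200"],
})

# Without parentheses, & binds tighter than ==, so Python tries to
# evaluate "2" & df["price"] before any comparison happens.
failed = False
try:
    df[df["count"] == "2" & df["price"] == "100"]
except Exception as exc:
    failed = True
    print("unparenthesized filter failed:", type(exc).__name__)

# With parentheses, each comparison produces a boolean Series first,
# and & then combines the two masks element-wise.
mask = (df["count"] == "2") & (df["price"] == "100")
df3 = df[mask]
print(df3)
```

Naming the mask separately is optional, but it can make longer filters easier to read and debug.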

If you need to check multiple values, use isin:

df4 = df[(df['count'].isin(['2','7'])) & (df['price'].isin(['100', '221']))]
print(df4)
  id count price
0  1     2   100
3  4     7   221
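
One caveat worth checking: the two isin tests are applied independently, so a row that pairs count 2 with price 221 would also pass. For the sample table this makes no difference, but if exact (count, price) pairs are required, a sketch along these lines, which ORs one parenthesized condition per pair, may be safer. The extra row with id 7 is hypothetical and is added only to show the difference.

```python
import pandas as pd

# Question's table plus one hypothetical row (id 7) that mixes
# count 2 with price 221.
df = pd.DataFrame({
    "id": ["1", "2", "3", "4", "5", "6", "7"],
    "count": ["2", "7", "3", "7", "8", "2", "2"],
    "price": ["100", "25", "720", "221", "212", "200", "221"],
})

# isin checks each column on its own, so the mixed row also matches.
loose = df[df["count"].isin(["2", "7"]) & df["price"].isin(["100", "221"])]
print(loose["id"].tolist())

# One condition per wanted pair, combined with | (element-wise OR).
exact = df[
    ((df["count"] == "2") & (df["price"] == "100"))
    | ((df["count"] == "7") & (df["price"] == "221"))
]
print(exact["id"].tolist())
```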

But if the columns are numeric, compare against numbers instead of strings:

df3 = df[(df['count'] == 2) & (df['price'] == 100)]
print(df3)

df4 = df[(df['count'].isin([2,7])) & (df['price'].isin([100, 221]))]
print(df4)
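
Which variant applies depends on the column dtypes. As a rough sketch, assuming count and price arrived as strings (for example from a file read as text), you could inspect df.dtypes and convert the columns with pd.to_numeric before filtering with numbers:

```python
import pandas as pd

# Assumption: count and price were loaded as strings.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "count": ["2", "7", "3", "7", "8", "2"],
    "price": ["100", "25", "720", "221", "212", "200"],
})

# String columns are reported as object (or a string dtype).
print(df.dtypes)

# Convert to numbers; by default pd.to_numeric raises if a value
# cannot be parsed.
df["count"] = pd.to_numeric(df["count"])
df["price"] = pd.to_numeric(df["price"])

# Numeric comparisons now work as in the answer above.
df3 = df[(df["count"] == 2) & (df["price"] == 100)]
df4 = df[df["count"].isin([2, 7]) & df["price"].isin([100, 221])]
print(df3)
print(df4)
```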

answered Oct 08 '22 00:10

jezrael
