Create a new DataFrame by selecting specific rows from an existing DataFrame in Python

Tags: python, pandas

I have a table in a pandas DataFrame, df:

id count price
1    2     100
2    7      25
3    3     720
4    7     221
5    8     212
6    2     200

I want to create a new DataFrame (df2) from this, selecting the rows where count is 2 and price is 100, or where count is 7 and price is 221.

My expected output, df2, is:

id count price
1    2     100
4    7     221

I am trying df[df['count'] == '2' & df['price'] == '100']

but I get this error:

TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]
Shubham R asked Nov 30 '16 10:11



1 Answer

You need to add parentheses, because & has higher precedence than ==. Without them, Python evaluates '2' & df['price'] first, which raises the error:

df3 = df[(df['count'] == '2') & (df['price'] == '100')]
print (df3)
  id count price
0  1     2   100
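
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the columns hold strings, which is what the quoted values in the question suggest; the variable name mask is only for illustration.

```python
import pandas as pd

# Sample data from the question, stored as strings (an assumption based on
# the quoted comparisons '2' and '100' in the question).
df = pd.DataFrame({
    "id": ["1", "2", "3", "4", "5", "6"],
    "count": ["2", "7", "3", "7", "8", "2"],
    "price": ["100", "25", "720", "221", "212", "200"],
})

# Parenthesize each comparison so that & combines two boolean Series
# instead of being applied to '2' and df["price"] first.
mask = (df["count"] == "2") & (df["price"] == "100")

# Boolean indexing returns a new DataFrame with only the matching rows.
df3 = df[mask]
print(df3)
```

Naming the mask first can make longer conditions easier to read and debug.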

If you need to check multiple values, use isin (each column is tested independently, so any listed count can pair with any listed price):

df4 = df[(df['count'].isin(['2','7'])) & (df['price'].isin(['100', '221']))]
print (df4)
  id count price
0  1     2   100
3  4     7   221
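
To match only the exact (count, price) pairs from the question, one option is to build one mask per pair and combine them with |. This is a minimal sketch, assuming string columns as in the question; the row with id 7 is a hypothetical addition that is not in the original data, included only to show where the two approaches can differ.

```python
import pandas as pd

# Data from the question as strings, plus one hypothetical row (id 7)
# with count '2' and price '221' to illustrate the difference.
df = pd.DataFrame({
    "id": ["1", "2", "3", "4", "5", "6", "7"],
    "count": ["2", "7", "3", "7", "8", "2", "2"],
    "price": ["100", "25", "720", "221", "212", "200", "221"],
})

# isin on each column separately: the hypothetical row also passes,
# because '2' is an allowed count and '221' is an allowed price.
df_isin = df[df["count"].isin(["2", "7"]) & df["price"].isin(["100", "221"])]
print(df_isin["id"].tolist())  # ids 1, 4 and 7

# One mask per wanted pair, combined with | (element-wise OR).
pair_a = (df["count"] == "2") & (df["price"] == "100")
pair_b = (df["count"] == "7") & (df["price"] == "221")
df2 = df[pair_a | pair_b]
print(df2["id"].tolist())  # ids 1 and 4 only
```

On the original six rows both approaches return the same two rows, so the isin version is fine as long as no unwanted combination can occur.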

But if the columns are numeric, compare against numbers instead of strings:

df3 = df[(df['count'] == 2) & (df['price'] == 100)]
print (df3)

df4 = df[(df['count'].isin([2,7])) & (df['price'].isin([100, 221]))]
print (df4)
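
If the columns are currently strings and numeric comparisons are preferred, they can be converted first. A sketch assuming every value parses cleanly as a number; pd.to_numeric is one way to do this, and leaving id as a string is just a choice made for this example.

```python
import pandas as pd

# Data from the question, assumed to be stored as strings.
df = pd.DataFrame({
    "id": ["1", "2", "3", "4", "5", "6"],
    "count": ["2", "7", "3", "7", "8", "2"],
    "price": ["100", "25", "720", "221", "212", "200"],
})

# Convert the two columns used in the filter to numeric dtypes.
df["count"] = pd.to_numeric(df["count"])
df["price"] = pd.to_numeric(df["price"])
print(df.dtypes)

# Numeric comparisons now work without quoting the values.
df4 = df[df["count"].isin([2, 7]) & df["price"].isin([100, 221])]
print(df4)
```

Checking df.dtypes before filtering is a quick way to tell whether to compare against strings or numbers.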
jezrael answered Oct 08 '22 00:10