I have a Series object (let's call this MySeries) which contains a list of integers.
I also have a separate dataframe (say MyDataFrame), which includes a column/field called MyField.
I want to select all records from MyDataFrame where the value in MyField is in MySeries.
The equivalent SQL would be:
Select * from MyDataFrame
where MyField in
(select * from MySeries)
Could anyone suggest the best way to do this?
Thanks very much for any help.
.at returns a single element, while .loc may return a Series or a DataFrame. However, .at does not always return a single value: it returns multiple values if the provided index label occurs more than once.
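To make the difference concrete, here is a minimal sketch. The frame, its column A, and the duplicated index label "x" are hypothetical and chosen only for illustration; the exact container returned for a duplicated label may vary between pandas versions.

```python
import pandas as pd

# Hypothetical frame whose index label "x" appears twice.
df = pd.DataFrame({"A": [1, 2, 3]}, index=["x", "x", "y"])

single = df.at["y", "A"]     # label occurs once: a single value
repeated = df.at["x", "A"]   # label occurs twice: more than one value comes back
row = df.loc["y"]            # one label with .loc: a Series (the row)
rows = df.loc[["y"]]         # list of labels with .loc: a DataFrame

print(single)
print(len(repeated))
print(type(row).__name__, type(rows).__name__)
```

If a scalar is required, it is probably safest to make sure the index is unique before relying on .at.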
In one comparison, the query function seems more efficient than the loc function. For DF2 (2K records x 6 columns), the loc function seems much more efficient than the query function.
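Timings like these depend heavily on data size and pandas version, so it may be worth measuring on your own data. The sketch below assumes a made-up frame shaped like DF2 (2,000 rows x 6 columns); the column names and the filter condition are hypothetical.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data sized like DF2: 2,000 rows x 6 integer columns.
rng = np.random.default_rng(0)
df2 = pd.DataFrame(rng.integers(0, 100, size=(2000, 6)), columns=list("abcdef"))

# The same filter written both ways.
via_loc = df2.loc[df2["a"] > 50]
via_query = df2.query("a > 50")

# Both approaches select the same rows; only the speed may differ.
same_rows = via_loc.equals(via_query)

# Time 50 repetitions of each approach.
loc_time = timeit.timeit(lambda: df2.loc[df2["a"] > 50], number=50)
query_time = timeit.timeit(lambda: df2.query("a > 50"), number=50)

print(f"same rows: {same_rows}")
print(f"loc:   {loc_time:.4f} s for 50 runs")
print(f"query: {query_time:.4f} s for 50 runs")
```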
The SELECT statement is used to select columns of data from a table. To do the same thing in pandas, use bracket notation on the DataFrame and pass a list of the column names you want to select inside the square brackets. The SELECT DISTINCT statement returns only unique rows from a table; the pandas counterpart is drop_duplicates().
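As a rough illustration of both statements, assuming a small made-up table (the column names city, year and sales are hypothetical):

```python
import pandas as pd

# Hypothetical table with one fully duplicated row.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome"],
    "year": [2020, 2020, 2021],
    "sales": [10, 10, 25],
})

# SELECT city, year FROM df
selected = df[["city", "year"]]

# SELECT DISTINCT city, year FROM df
distinct = df[["city", "year"]].drop_duplicates()

print(selected)
print(distinct)
```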
You can use the isin() function:
>>> import pandas as pd
>>> df = pd.DataFrame({'A':[1,2,3,4,5], 'B':list('ABCDE')})
>>> f = pd.Series([1,2])
>>> df[df['A'].isin(f)]
   A  B
0  1  A
1  2  B
So, first you get a boolean filter Series:
>>> df['A'].isin(f)
0     True
1     True
2    False
3    False
4    False
Name: A, dtype: bool
And then use it to filter your DataFrame.
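Applied to the names from the question, the same idea might look like the sketch below. The sample values and the extra column Other are invented for illustration.

```python
import pandas as pd

# Hypothetical data using the names from the question.
MySeries = pd.Series([2, 4, 7])
MyDataFrame = pd.DataFrame({"MyField": [1, 2, 3, 4, 5], "Other": list("abcde")})

# SELECT * FROM MyDataFrame WHERE MyField IN (SELECT * FROM MySeries)
mask = MyDataFrame["MyField"].isin(MySeries)
matches = MyDataFrame[mask]

# WHERE MyField NOT IN (...): invert the boolean mask with ~
non_matches = MyDataFrame[~mask]

print(matches)
print(non_matches)
```

Note that isin() compares against the values of MySeries, not its index, so the two objects do not need matching index labels.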