What is the Pandas equivalent of this SQL code?
Select id, fname, lname from table where id = 123
I know that this is the equivalent of an SQL 'where' clause in Pandas:
df[df['id']==123]
And this selects specific columns:
df[['id','fname','lname']]
But I can't figure out how to combine them. All examples I've seen online select all columns with conditions. I want to select a limited number of columns with one or more conditions.
To find duplicate columns, iterate through all columns of a DataFrame and, for each column, check whether any other column in the DataFrame has the same contents. If one does, store that column's name in a set of duplicate columns.
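A minimal sketch of that duplicate-column search, assuming column names are unique. The helper name find_duplicate_columns and the sample data are made up for illustration; the comparison uses Series.equals, which also requires matching dtypes.

```python
import pandas as pd


def find_duplicate_columns(df):
    """Return the names of columns whose contents repeat an earlier column."""
    duplicates = set()
    columns = list(df.columns)
    for i, first in enumerate(columns):
        if first in duplicates:
            # Already known to be a copy of an earlier column; skip it.
            continue
        for other in columns[i + 1:]:
            # Series.equals compares values and dtype, and treats NaNs
            # in the same positions as equal.
            if df[first].equals(df[other]):
                duplicates.add(other)
    return duplicates


# Hypothetical sample data: id_copy repeats id.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "fname": ["simon", "michael", "anna"],
    "id_copy": [1, 2, 3],
})
dupes = find_duplicate_columns(df)
print(dupes)
```

The returned names could then be passed to df.drop(columns=...) if the copies are not needed.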
Selecting columns based on their name: this is the most basic way to select a single column from a DataFrame. Put the column's name, as a string, in brackets; this returns a pandas Series. Passing a list in the brackets selects multiple columns at once and returns a DataFrame.
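A short illustration of the difference between the two bracket forms, using a made-up two-row DataFrame with the column names from the question:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "id": [123, 456],
    "fname": ["simon", "michael"],
    "lname": ["smith", "jones"],
})

single = df["fname"]            # a string in brackets gives a Series
several = df[["id", "fname"]]   # a list in brackets gives a DataFrame

print(type(single).__name__)
print(type(several).__name__)
```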
Use the SQL-like .query() method:
df.query("id == 123")[['id','fname','lname']]
or
df[['id','fname','lname']].query("id == 123")
or more "Pandaic":
df.loc[df['id'] == 123, ['id','fname','lname']]
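As a quick check, the three forms above can be run side by side on a small made-up DataFrame (the data and the extra age column are assumptions for illustration). The last line sketches how the .loc form might extend to more than one condition, which the question also asks about.

```python
import pandas as pd

# Hypothetical sample data; "age" is added only to show a second condition.
df = pd.DataFrame({
    "id": [123, 456, 789],
    "fname": ["simon", "michael", "anna"],
    "lname": ["smith", "jones", "lee"],
    "age": [31, 45, 28],
})
cols = ["id", "fname", "lname"]

a = df.query("id == 123")[cols]       # filter rows, then pick columns
b = df[cols].query("id == 123")       # pick columns, then filter rows
c = df.loc[df["id"] == 123, cols]     # rows and columns in one step

# Several conditions: combine boolean masks with & (and) or | (or),
# wrapping each comparison in parentheses.
d = df.loc[(df["id"] > 123) & (df["age"] < 40), cols]

print(c)
print(d)
```

Note that the second form only works when the filter column is among the selected columns, whereas the .loc form can filter on a column (here age) that is not returned.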
Extending @MaxU's answer: suppose you need to match several values in one column, for example 'fname':
df[['id','fname','lname']].query("fname == ('simon', 'michael')")
Without @MaxU's query method (all columns are included here for simplicity):
df[df.fname.isin(['simon', 'michael'])]
Chaining the above with [['id','fname','lname']] gives the needed answer.
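Written out in full, the chained isin version might look like the sketch below. The sample data and the names list are assumptions for illustration; the .loc and query variants are included only for comparison and should give the same result.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "id": [123, 456, 789],
    "fname": ["simon", "michael", "anna"],
    "lname": ["smith", "jones", "lee"],
})
names = ["simon", "michael"]
cols = ["id", "fname", "lname"]

# Filter rows with isin, then chain the column selection.
chained = df[df.fname.isin(names)][cols]

# Same selection in a single .loc call.
with_loc = df.loc[df["fname"].isin(names), cols]

# Same selection with query; @names refers to the local Python variable.
via_query = df[cols].query("fname in @names")

print(chained)
```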