
How to check if a column exists in Pandas

Is there a way to check if a column exists in a Pandas DataFrame?

Suppose that I have the following DataFrame:

>>> import pandas as pd
>>> from random import randint
>>> df = pd.DataFrame({'A': [randint(1, 9) for x in range(10)],
...                    'B': [randint(1, 9)*10 for x in range(10)],
...                    'C': [randint(1, 9)*100 for x in range(10)]})
>>> df
   A   B    C
0  3  40  100
1  6  30  200
2  7  70  800
3  3  50  200
4  7  50  400
5  4  10  400
6  3  70  500
7  8  30  200
8  3  40  800
9  6  60  200

and I want to calculate df['sum'] = df['A'] + df['C']

But first I want to check if df['A'] exists, and if not, I want to calculate df['sum'] = df['B'] + df['C'] instead.

npires asked Jul 21 '14 16:07




2 Answers

This will work:

if 'A' in df: 

But for clarity, I'd probably write it as:

if 'A' in df.columns: 
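For the fallback described in the question, a minimal sketch might look like the following. The helper name add_sum and the two small test frames are hypothetical, introduced only for illustration; the sketch also assumes column 'C' is always present and that 'B' exists whenever 'A' does not.

```python
import pandas as pd

def add_sum(df):
    """Return a copy of df with a 'sum' column: A + C if 'A' exists, otherwise B + C.

    Hypothetical helper; assumes 'C' exists, and 'B' exists when 'A' is absent.
    """
    out = df.copy()
    # Membership on df.columns tests column labels, not cell values
    first = 'A' if 'A' in out.columns else 'B'
    out['sum'] = out[first] + out['C']
    return out

# Illustrative data, not the frame from the question
with_a = add_sum(pd.DataFrame({'A': [1, 2], 'B': [10, 20], 'C': [100, 200]}))
without_a = add_sum(pd.DataFrame({'B': [10, 20], 'C': [100, 200]}))
```

Working on a copy keeps the caller's DataFrame unchanged; assigning to df['sum'] directly, as in the question, works just as well if in-place modification is acceptable.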
chrisb answered Oct 13 '22 11:10


To check if one or more columns all exist, you can use set.issubset, as in:

if set(['A','C']).issubset(df.columns):
    df['sum'] = df['A'] + df['C']

As @brianpck points out in a comment, set(['A','C']) can alternatively be written as a set literal with curly braces:

if {'A', 'C'}.issubset(df.columns): 

See this question for a discussion of the curly-braces syntax.

Or, you can use a generator expression, as in:

if all(item in df.columns for item in ['A','C']): 
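When the check fails, it is often useful to know which columns are missing. The sketch below compares the two forms above and then lists the absent columns; the names required and missing, and the sample frame, are assumptions made for illustration.

```python
import pandas as pd

# Illustrative frame that lacks column 'A'
df = pd.DataFrame({'B': [10, 20], 'C': [100, 200]})
required = ['A', 'C']  # hypothetical list of columns we need

# Both forms answer the same question: are all required columns present?
all_present_set = set(required).issubset(df.columns)
all_present_gen = all(col in df.columns for col in required)

# Required columns that are absent, in the order they were requested
missing = [col for col in required if col not in df.columns]
```

The list comprehension preserves the order of required, which a set difference would not guarantee.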
C8H10N4O2 answered Oct 13 '22 12:10