KeyError: False in pandas dataframe

Tags: python, pandas

import pandas as pd

businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]

I would like to remove the rows that do not have Restaurants in the categories column. Each value in this column is a list. However, the code above raised 'KeyError: False', and I would like to understand why and how to fix it.

panchester asked Jul 02 '17 21:07



2 Answers

If you find that your data contains spelling variations or alternative restaurant-related terms, the following may help. Put your restaurant-related terms in restaurant_lst. The lambda function is applied to each value of the categories column and returns True if any item in restaurant_lst appears in that row's list. The .loc indexer then keeps only the rows for which the lambda returned True. This assumes every categories value is a list; a missing value would need to be handled separately.

restaurant_lst = ['Restaurants', 'Restaurant', 'restaurantes', 'diner', 'bistro']
# Check each row's categories list for an exact match with any of the terms
is_restaurant = businesses['categories'].apply(lambda cats: any(term in cats for term in restaurant_lst))
restaurant = businesses.loc[is_restaurant]
Joe answered Sep 19 '22 13:09


The expression 'Restaurants' in businesses['categories'] evaluates to the single boolean value False, because the in operator on a pandas Series tests the index labels, not the values. That False is then passed to the bracket indexing operator of the DataFrame businesses, which looks for a column called False, finds none, and raises a KeyError.
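To see this behaviour in isolation, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the question's data (the real file is not shown), assuming each categories cell holds a list.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: each 'categories' cell is a list.
businesses = pd.DataFrame({
    "name": ["A", "B", "C"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["Bars", "Restaurants"]],
})

# `in` on a Series checks the index labels (0, 1, 2 here), not the cell values.
label_check = "Restaurants" in businesses["categories"]
index_check = 0 in businesses["categories"]

# Indexing the DataFrame with that single boolean is treated as a column lookup.
try:
    businesses[label_check]
    raised = False
except KeyError:
    raised = True

print(label_check, index_check, raised)
```

The first check is False even though two rows contain 'Restaurants', while the second is True only because 0 happens to be an index label.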

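What you are looking for is called boolean indexing: you pass a Series holding one True/False value per row, and only the True rows are kept. For a column of plain strings it works like this (an equality test does not look inside list cells):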

businesses[businesses['categories'] == 'Restaurants']
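For a column whose cells are lists, as described in the question, one possible way to build the boolean mask is with Series.apply. This is a sketch under assumptions, not part of the original answer: the sample DataFrame is hypothetical, and the isinstance check is added on the assumption that some rows may have a missing categories value.

```python
import pandas as pd

# Hypothetical sample data; row "C" has a missing categories value.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], None, ["Bars", "Restaurants"]],
})

# One True/False value per row; cells that are not lists count as False.
has_restaurants = businesses["categories"].apply(
    lambda cats: isinstance(cats, list) and "Restaurants" in cats
)

# Boolean indexing keeps only the rows where the mask is True.
restaurantes = businesses[has_restaurants]
print(restaurantes)
```

Here the mask is a Series aligned with the DataFrame's rows, which is what the bracket operator needs in order to filter rows rather than look up a column.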
Ted Petrou answered Sep 19 '22 13:09