import pandas as pd
businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]
I would like to remove the lines that do not have Restaurants in the categories column, and this column has lists, however gave the error 'KeyError: False' and I would like to understand why and how to solve.
Since there is no ‘point’ column in our DataFrame, we receive a KeyError. The way to fix this error is to simply make sure we spell the column name correctly.
Applying an IF condition in Pandas DataFrame. Let’s now review the following 5 cases: (1) IF condition – Set of numbers. Suppose that you created a DataFrame in Python that has 10 numbers (from 1 to 10). You then want to apply the following IF conditions: If the number is equal or lower than 4, then assign the value of ‘True’
This error occurs when you attempt to access some column in a pandas DataFrame that does not exist. Typically this error occurs when you simply misspell a column names or include an accidental space before or after the column name. The following example shows how to fix this error in practice.
What a Python KeyError Usually Means A Python KeyError exception is what is raised when you try to access a key that isn’t in a dictionary (dict). Python’s official documentation says that the KeyError is raised when a mapping key is accessed and isn’t found in the mapping. A mapping is a data structure that maps one set of values to another.
If you find that your data contains spelling variations or alternative restaurant related terms, the following may be of benefit. Essentially you put your restaurant related terms in restuarant_lst
. The lambda
function returns true
if any of the items in restaurant_lst
are contained within each row of the business series. The .loc
indexer filters out rows which return false
for the lambda
function.
restaurant_lst = ['Restaurant','restaurantes','diner','bistro']
restaurant = businesses.loc[businesses.apply(lambda x: any(restaurant_str in x for restaurant_str in restaurant_lst))]
The expression 'Restaurants' in businesses['categories']
returns the boolean value False
. This is passed to the brackets indexing operator for the DataFrame businesses which does not contain a column called False and thus raises a KeyError.
What you are looking to do is something called boolean indexing which works like this.
businesses[businesses['categories'] == 'Restaurants']
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With