I have a pandas dataframe like the following:
categories review_count
0 [Burgers, Fast Food, Restaurants] 137
1 [Steakhouses, Restaurants] 176
2 [Food, Coffee & Tea, American (New), Restaurants] 390
... .... ...
... .... ...
... .... ...
From this DataFrame, I would like to extract only those rows where the list in the 'categories' column contains the category 'Restaurants'. So far I have tried:
df[[df.categories.isin('Restaurants'),review_count]]
I specified these two columns because the DataFrame also has other columns and these are the only ones I want to extract. But I get the error:
TypeError: unhashable type: 'list'
I don't have much idea what this error means as I am very new to pandas. Please let me know how I can achieve my goal of extracting only those rows from the dataFrame wherein the 'categories' column for that row has the string 'Restaurants' as part of the categories_list. Any help would be much appreciated.
Thanks in advance!
You can get the column names from a pandas DataFrame using df.columns.values and pass the result to Python's list() function to get them as a list. Once you have the list, you can print it with print().
Insert a list into a cell using DataFrame.at: to insert a list into a single cell, use the DataFrame.at accessor. For example, I will use the Duration column from the above DataFrame to insert a list. at inserts a list into a specific cell without raising a ValueError.
The iloc indexer in pandas selects specific rows or columns from a DataFrame by integer position. With iloc, we can retrieve any particular value from a row or column by using its positional index.
I think you may have to use a lambda function for this, since you can test whether a value in your column isin some sequence, but pandas doesn't seem to provide a function for testing whether the sequence in your column contains some value:
import pandas as pd
categories = [['fast_food', 'restaurant'], ['coffee', 'cafe'], ['burger', 'restaurant']]
counts = [137, 176, 390]
df = pd.DataFrame({'categories': categories, 'review_count': counts})
# Show which rows contain 'restaurant'
df.categories.map(lambda x: 'restaurant' in x)
# Subset the dataframe using this:
df[df.categories.map(lambda x: 'restaurant' in x)]
Output:
Out[11]:
categories review_count
0 [fast_food, restaurant] 137
2 [burger, restaurant] 390
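The question also asked for only the 'categories' and 'review_count' columns. A minimal sketch of how the same mask could be combined with .loc to pick those columns is below. The 'stars' column and the sample values are made up for illustration, and the second variant assumes pandas 0.25 or later, where Series.explode is available. As for the error itself, the TypeError probably comes from pandas trying to hash list objects, which Python does not allow.

```python
import pandas as pd

# Sample data; the 'stars' column is a hypothetical extra column,
# standing in for the "other columns" mentioned in the question.
df = pd.DataFrame({
    "categories": [
        ["Burgers", "Fast Food", "Restaurants"],
        ["Coffee & Tea", "Food"],
        ["Steakhouses", "Restaurants"],
    ],
    "review_count": [137, 176, 390],
    "stars": [3.5, 4.0, 4.5],
})

# Boolean mask: True where the row's list contains 'Restaurants'.
mask = df["categories"].map(lambda cats: "Restaurants" in cats)

# Filter rows with the mask and keep only the two wanted columns.
result = df.loc[mask, ["categories", "review_count"]]

# Alternative without a lambda: flatten the lists with explode(),
# compare each element, then collapse back to one value per original row.
mask_explode = (
    df["categories"].explode().eq("Restaurants").groupby(level=0).any()
)
result_explode = df.loc[mask_explode, ["categories", "review_count"]]

print(result)
```

Both masks select the same rows here. The lambda version is usually the simpler choice; the explode version may be handier if you already need the flattened categories for other work.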