
How to read attributes of an object column, using Python Pandas

I have a dataframe where the 'location' column holds a dict for each row (so the column has dtype object):

import pandas as pd

item1 = {
     'project': 'A',
     'location': {'country': 'united states', 'city': 'new york'},
     'raised_usd': 1.0}

item2 =  {
    'project': 'B',
    'location': {'country': 'united kingdom', 'city': 'cambridge'},
    'raised_usd': 5.0}

item3 =  {
    'project': 'C',
    'raised_usd': 10.0}

data = [item1, item2, item3]

df = pd.DataFrame(data)
df


I'd like to create an extra column, 'project_country', which contains just the country information, if available. I've tried the following:

def get_country(location):
    try:
        return location['country']
    except Exception:
        return 'n/a'

df['project_country'] = get_country(df['location'])
df

But this doesn't work: every row of 'project_country' ends up as 'n/a'.

How should I go about extracting this field?

ninjaPixel asked Oct 26 '25 06:10


1 Answer

Use apply on the column and pass your function to it; apply then calls the function once per element:

In [62]:

def get_country(location):
    try:
        return location['country']
    except Exception:
        return 'n/a'

df['project_country'] = df['location'].apply(get_country)
df
Out[62]:
                                            location project  raised_usd  \
0   {'country': 'united states', 'city': 'new york'}       A           1   
1  {'country': 'united kingdom', 'city': 'cambrid...       B           5   
2                                                NaN       C          10   

  project_country  
0   united states  
1  united kingdom  
2             n/a 
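
As a variation on the above, here is a minimal sketch that checks for a dict explicitly instead of catching every Exception, so that unrelated errors are not silently turned into 'n/a'. The name get_country_safe and its default parameter are hypothetical and not part of the original code; the data is the same as in the question.

```python
import pandas as pd

# Same sample data as in the question; project C has no 'location'.
data = [
    {'project': 'A',
     'location': {'country': 'united states', 'city': 'new york'},
     'raised_usd': 1.0},
    {'project': 'B',
     'location': {'country': 'united kingdom', 'city': 'cambridge'},
     'raised_usd': 5.0},
    {'project': 'C', 'raised_usd': 10.0},
]
df = pd.DataFrame(data)


def get_country_safe(location, default='n/a'):
    """Return location['country'] if location is a dict, else the default.

    Hypothetical helper: a missing location shows up as NaN (a float),
    which is not a dict, so it falls through to the default.
    """
    if isinstance(location, dict):
        return location.get('country', default)
    return default


# apply calls get_country_safe once per element of the column.
df['project_country'] = df['location'].apply(get_country_safe)
print(df[['project', 'project_country']])
```

Whether the narrower check is preferable depends on how messy the real data is; the try/except version is more forgiving of unexpected values.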

Your original code failed because get_country was called once with the entire column (a pandas Series), not once per row:

In [64]:

def get_country(location):
    print(location)
    try:
        print(location['country'])
    except Exception:
        print('n/a')

get_country(df['location'])
0     {'country': 'united states', 'city': 'new york'}
1    {'country': 'united kingdom', 'city': 'cambrid...
2                                                  NaN
Name: location, dtype: object
n/a

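Looking up the key 'country' on the entire Series therefore raises a KeyError, which the except clause catches, so the function returns the single string 'n/a' and that one value is assigned to every row.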
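
The difference between indexing the Series and indexing one of its elements can be seen in isolation. This is only an illustrative sketch: the variable names outcome and first_row_country are made up for the example, and the Series is rebuilt by hand rather than taken from df.

```python
import pandas as pd

# A stand-in for df['location']: two dicts and one missing value.
location = pd.Series(
    [{'country': 'united states', 'city': 'new york'},
     {'country': 'united kingdom', 'city': 'cambridge'},
     float('nan')],
    name='location',
)

# Indexing the Series itself looks 'country' up in the index (0, 1, 2),
# not inside the dicts, so the lookup fails.
try:
    location['country']
    outcome = 'found'
except KeyError:
    outcome = 'KeyError'

# Indexing a single element reaches the dict stored in that row.
first_row_country = location.iloc[0]['country']

print(outcome)
print(first_row_country)
```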

EdChum answered Oct 28 '25 20:10


