Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to fill pandas dataframe columns with random dictionary values

I'm new to Pandas and I would like to play with random text data. I am trying to add 2 new columns to a DataFrame df which would be each filled by a key (newcol1) + value (newcol2) randomly selected from a dictionary.

countries = {'Africa':'Ghana','Europe':'France','Europe':'Greece','Asia':'Vietnam','Europe':'Lithuania'}

My df already has 2 columns and I'd like something like this :

    Year Approved Continent    Country
0   2016      Yes    Africa      Ghana
1   2016      Yes    Europe  Lithuania
2   2017       No    Europe     Greece

I can certainly use a for or while loop to fill df['Continent'] and df['Country'] but I sense .apply() and np.random.choice may provide a simpler more pandorable solution for that.

like image 332
ozaarm Avatar asked Nov 23 '17 23:11

ozaarm


1 Answers

Yep, you're right. You can use np.random.choice with map:

df

    Year Approved
0   2016      Yes
1   2016      Yes
2   2017       No

df['Continent'] = np.random.choice(list(countries), len(df))
df['Country'] = df['Continent'].map(countries)

df

    Year Approved Continent    Country
0   2016      Yes    Africa      Ghana
1   2016      Yes      Asia    Vietnam
2   2017       No    Europe  Lithuania

You choose len(df) number of keys at random from the country key-list, and then use the country dictionary as a mapper to find the country equivalents of the previously picked keys.

like image 74
cs95 Avatar answered Sep 21 '22 15:09

cs95