I'm new to Pandas and I would like to play with random text data. I am trying to add 2 new columns to a DataFrame df which would be each filled by a key (newcol1) + value (newcol2) randomly selected from a dictionary.
countries = {'Africa':'Ghana','Europe':'France','Europe':'Greece','Asia':'Vietnam','Europe':'Lithuania'}
My df already has 2 columns and I'd like something like this :
Year Approved Continent Country
0 2016 Yes Africa Ghana
1 2016 Yes Europe Lithuania
2 2017 No Europe Greece
I can certainly use a for or while loop to fill df['Continent'] and df['Country'] but I sense .apply() and np.random.choice may provide a simpler more pandorable solution for that.
Yep, you're right. You can use np.random.choice
with map
:
df
Year Approved
0 2016 Yes
1 2016 Yes
2 2017 No
df['Continent'] = np.random.choice(list(countries), len(df))
df['Country'] = df['Continent'].map(countries)
df
Year Approved Continent Country
0 2016 Yes Africa Ghana
1 2016 Yes Asia Vietnam
2 2017 No Europe Lithuania
You choose len(df)
number of keys at random from the country
key-list, and then use the country
dictionary as a mapper to find the country equivalents of the previously picked keys.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With