
Create a Dataframe from a Series and a String

Suppose I have a dataframe whose columns contain strings, list-like series and integers, and I would like to build a new dataframe in which each row's string and integer are paired with every entry of that row's series. How could I go about it?

Given this example:

import pandas as pd

# 'source' holds one list of countries per fruit
data = {'fruits': ['banana', 'apple', 'pear'],
        'source': (['brazil', 'algeria', 'nigera'], ['brazil', 'morocco', 'iran', 'france'], ['china', 'india', 'mexico']),
        'prices': [2, 3, 7]}
df = pd.DataFrame(data, columns=['fruits', 'source', 'prices'])

I would like to get a dataframe with 10 rows and 3 columns, where the columns hold:

['banana', 'banana', 'banana', 'apple', 'apple', 'apple', 'apple', 'pear', 'pear', 'pear'],
['brazil', 'algeria', 'nigera', 'brazil', 'morocco', 'iran', 'france', 'china', 'india', 'mexico'],
['2', '2', '2', '3', '3', '3', '3', '7', '7', '7'],

I guess it shouldn't be too complex, but I can't find a neat solution.

mkcz asked Dec 13 '17 17:12


1 Answer

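Use an explode() helper function (called here as a free function with a lst_cols argument; its definition is not included in this post):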

In [30]: explode(df, lst_cols='source')
Out[30]:
   fruits   source  prices
0  banana   brazil       2
1  banana  algeria       2
2  banana   nigera       2
3   apple   brazil       3
4   apple  morocco       3
5   apple     iran       3
6   apple   france       3
7    pear    china       7
8    pear    india       7
9    pear   mexico       7
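
For what it's worth, newer pandas versions (0.25 and later) ship a built-in DataFrame.explode method, which should make the helper unnecessary for this case. A minimal sketch, assuming such a version is installed and reusing the data from the question (the name result is just illustrative):

```python
import pandas as pd

data = {
    'fruits': ['banana', 'apple', 'pear'],
    'source': [['brazil', 'algeria', 'nigera'],
               ['brazil', 'morocco', 'iran', 'france'],
               ['china', 'india', 'mexico']],
    'prices': [2, 3, 7],
}
df = pd.DataFrame(data, columns=['fruits', 'source', 'prices'])

# DataFrame.explode (assumes pandas >= 0.25) puts each list element on its
# own row and repeats the other columns alongside it. The original row
# labels are repeated too, so reset_index(drop=True) renumbers them 0..9.
result = df.explode('source').reset_index(drop=True)
print(result)
```

This should give the same ten rows as the output above. Note that prices stays an integer column; if you need strings as in the question, something like result['prices'].astype(str) would do it.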
MaxU - stop WAR against UA answered Oct 02 '22 02:10
