Convert pandas Series to DataFrame

I have a Pandas series sf:

    email
    [email protected]    [1.0, 0.0, 0.0]
    [email protected]    [2.0, 0.0, 0.0]
    [email protected]    [1.0, 0.0, 0.0]
    [email protected]    [4.0, 0.0, 0.0]
    [email protected]    [1.0, 0.0, 3.0]
    [email protected]    [1.0, 5.0, 0.0]

And I would like to transform it to the following DataFrame:

    index | email             | list
    _____________________________________________
    0     | [email protected]  | [1.0, 0.0, 0.0]
    1     | [email protected]  | [2.0, 0.0, 0.0]
    2     | [email protected]  | [1.0, 0.0, 0.0]
    3     | [email protected]  | [4.0, 0.0, 0.0]
    4     | [email protected]  | [1.0, 0.0, 3.0]
    5     | [email protected]  | [1.0, 5.0, 0.0]

I found a way to do it, but I doubt it's the most efficient one:

    df1 = pd.DataFrame(data=sf.index, columns=['email'])
    df2 = pd.DataFrame(data=sf.values, columns=['list'])
    df = pd.merge(df1, df2, left_index=True, right_index=True)
woshitom asked Sep 29 '14 10:09



2 Answers

Rather than creating two temporary DataFrames, you can pass the index and the values as entries of a dict to the DataFrame constructor:

    pd.DataFrame({'email': sf.index, 'list': sf.values})

There are many ways to construct a DataFrame; see the docs.
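A minimal, runnable sketch of the dict-based constructor follows. It assumes a small Series whose index is named `email`; the addresses and values are placeholders chosen for illustration, not the asker's data.

```python
import pandas as pd

# Hypothetical stand-in for the asker's Series `sf`: placeholder addresses
# as the index (named 'email') and one list of floats per entry.
sf = pd.Series(
    [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 5.0, 0.0]],
    index=pd.Index(
        ["a@example.com", "b@example.com", "c@example.com"], name="email"
    ),
)

# Build the DataFrame in one step: the index becomes the 'email' column,
# the values become the 'list' column, and a fresh 0..n-1 index is assigned.
df = pd.DataFrame({"email": sf.index, "list": sf.values})
print(df)
```

Because both columns come from the same Series, they are already aligned row by row, so no merge is needed.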

EdChum answered Sep 27 '22 18:09


to_frame():

Starting with the following Series, df:

    email
    [email protected]    A
    [email protected]    B
    [email protected]    C
    dtype: object

I use to_frame to convert the Series to a DataFrame, then reset_index to turn the email index into a regular column:

    df = df.to_frame().reset_index()

        email               0
    0   [email protected]    A
    1   [email protected]    B
    2   [email protected]    C

Now all you need to do is rename the value column and name the index:

    df = df.rename(columns={0: 'list'})
    df.index.name = 'index'

Your DataFrame is ready for further analysis.
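The steps above can be put together as a self-contained sketch. The data is a placeholder (an unnamed Series with an index named `email`), and the final line shows a shorter variant that is not part of the original answer: `Series.reset_index` accepts a `name` argument for the values column.

```python
import pandas as pd

# Hypothetical unnamed Series with an index named 'email' (placeholder data).
s = pd.Series(
    ["A", "B", "C"],
    index=pd.Index(
        ["a@example.com", "b@example.com", "c@example.com"], name="email"
    ),
)

# Step by step: the unnamed Series becomes column 0, reset_index() moves
# 'email' into a regular column, and column 0 is then renamed to 'list'.
df = s.to_frame().reset_index().rename(columns={0: "list"})
df.index.name = "index"
print(df)

# Shorter variant (an addition, not from the answer above): name the
# values column directly while resetting the index.
df_short = s.reset_index(name="list")
```

Both results have the columns `email` and `list`; only the first has its index named `index`.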

Update: I just came across this link where the answers are surprisingly similar to mine here.

Shoresh answered Sep 27 '22 19:09