I need to find the best way to create a new DataFrame from an existing DataFrame.
See this link for the full code: jdoodle.com/a/xKP
I have this kind of DataFrame:
import pandas as pd
df = pd.DataFrame({'length': [112, 214, 52, 88], 'views': [10000, 50000, 25000, 5000], 'click': [55, 64, 85, 9]},
                  index=['id1', 'id2', 'id3', 'id4'])
     click  length  views
id1     55     112  10000
id2     64     214  50000
id3     85      52  25000
id4      9      88   5000
And I need to get this result:
    type_stat   stat
id1     click     55
id2     click     64
id3     click     85
id4     click      9
id1    length    112
id2    length    214
id3    length     52
id4    length     88
id1     views  10000
id2     views  50000
id3     views  25000
id4     views   5000
Currently, I have a function that returns a DataFrame for a single stat:
def df_by_stat(current_df, stat):
    current_df['type_stat'] = stat
    current_df['stat'] = current_df[stat].astype(int)
    return current_df[['type_stat', 'stat']]
Then I chain .append calls on the function's results, like this:
def final():
    return df_by_stat(df, 'click').append(
        df_by_stat(df, 'length')).append(
        df_by_stat(df, 'views'))
print(final())
This approach works, but its cost grows with both the number of rows and the number of columns, which is too expensive. That's why I need your help to find a better method.
Using pandas.melt after elevating your index to a series:
res = pd.melt(df.assign(index=df.index), id_vars='index',
              value_name='stat', var_name='type_stat')\
        .set_index('index')
print(res)
      type_stat   stat
index
id1       click     55
id2       click     64
id3       click     85
id4       click      9
id1      length    112
id2      length    214
id3      length     52
id4      length     88
id1       views  10000
id2       views  50000
id3       views  25000
id4       views   5000
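As a possible variant of the melt approach above, here is a minimal sketch that skips the helper column. It assumes pandas 1.1 or later, where melt accepts ignore_index=False and keeps the original index on every row. The explicit column sort is my own addition: it only reproduces the click/length/views order shown in the question, since newer pandas versions keep the dict insertion order instead.

```python
import pandas as pd

df = pd.DataFrame(
    {'length': [112, 214, 52, 88],
     'views': [10000, 50000, 25000, 5000],
     'click': [55, 64, 85, 9]},
    index=['id1', 'id2', 'id3', 'id4'],
)

# Assumption: sort the columns alphabetically so the melted rows come out
# in the click / length / views order shown in the question.
df = df[sorted(df.columns)]

# ignore_index=False (pandas >= 1.1) repeats the original index labels
# for each melted column, so no assign/set_index round trip is needed.
res = df.melt(var_name='type_stat', value_name='stat', ignore_index=False)
print(res)
```

Unlike the function in the question, this does not add type_stat and stat columns to the original df, and the reshaping is done in a single call however many stat columns there are.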