I am trying to merge time-course data from different participants. I am iteratively extracting a dataframe per participant and concatenating them at the end of the loop. Before I concatenate, I would like to add the ID of my participants to an additional index.
This seems REALLY straightforward, but I was unable to find anything on this issue :(
I would like to turn this:
     col
0    1
1    1.1
2    NaN
Into:
        col
ID 0    1
   1    1.1
   2    NaN
I know I could make a new index like:
multiindex = [np.array([ID] * len(data)), np.arange(len(data))]
But that's inelegant without end, and - seeing as I am measuring with high frequency over half an hour - would even get kind of slow :/
I would like to mention that I recently found my question to be a duplicate of this other question. However, mine apparently has more upvotes and better answers. “Prepend” doesn't seem to draw as many hits.
Maybe you can use the keys argument of concat:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(3, 2))
df2 = pd.DataFrame(np.random.rand(4, 2))
df3 = pd.DataFrame(np.random.rand(5, 2))
print(pd.concat([df1, df2, df3], keys=["A", "B", "C"]))
output:
            0         1
A 0  0.863774  0.794880
  1  0.578503  0.418619
  2  0.215317  0.146167
B 0  0.655829  0.116917
  1  0.862316  0.812847
  2  0.500126  0.689218
  3  0.653439  0.270427
C 0  0.825213  0.882963
  1  0.579436  0.332047
  2  0.456948  0.718893
  3  0.795074  0.826773
  4  0.049676  0.697471
If you want to append other dataframes later:
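df = pd.concat([df1, df2, df3], keys=["A", "B", "C"])  # the combined frame from above
df4 = pd.DataFrame(np.random.rand(6, 2))
df = pd.concat([df, pd.concat([df4], keys=["D"])])  # wrap df4 so it gets its own outer key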
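For the per-participant loop described in the question, something along these lines might work. This is only a sketch: the participant IDs, the load_participant helper, and the level names "ID" and "sample" are made up for illustration and stand in for the real extraction step. It relies on concat accepting a dict (the dict keys become the outer index level) and on the names argument to label the levels.

```python
import numpy as np
import pandas as pd

# Hypothetical participant IDs; replace with the real ones.
participant_ids = ["P01", "P02", "P03"]


def load_participant(pid, n_samples=4):
    # Hypothetical placeholder for the per-participant extraction step.
    # It returns fake time-course data with a default integer index.
    return pd.DataFrame({"col": np.arange(n_samples, dtype=float)})


# Collect one frame per participant, keyed by participant ID.
frames = {pid: load_participant(pid) for pid in participant_ids}

# The dict keys become the outer index level; names labels both levels.
merged = pd.concat(frames, names=["ID", "sample"])

# Select all rows belonging to one participant.
print(merged.loc["P02"])
```

Collecting the frames first and calling concat once after the loop avoids re-copying the growing result on every iteration, which should matter for long, high-frequency recordings.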