I'm trying to repeat the rows of a dataframe. Here's my original data:
import pandas as pd

df = pd.DataFrame([
    {'col1': 1, 'col2': 11, 'col3': [1, 2]},
    {'col1': 2, 'col2': 22, 'col3': [1, 2, 3]},
    {'col1': 3, 'col2': 33, 'col3': [1]},
    {'col1': 4, 'col2': 44, 'col3': [1, 2, 3, 4]},
])
which gives me
   col1  col2          col3
0     1    11        [1, 2]
1     2    22     [1, 2, 3]
2     3    33           [1]
3     4    44  [1, 2, 3, 4]
I'd like to repeat each row as many times as there are elements in the list in col3, i.e. I'd like to get a dataframe like this one:
   col1  col2
0     1    11
1     1    11
2     2    22
3     2    22
4     2    22
5     3    33
6     4    44
7     4    44
8     4    44
9     4    44
What's a good way of accomplishing this?
In Python, if you want to repeat the elements of a NumPy array, you can use the numpy.repeat() function. It returns a new array in which each element is repeated the requested number of times, optionally along a given axis such as 0 or 1.
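As a rough illustration of how numpy.repeat() could be applied to the data in the question, here is a minimal sketch. It assumes the dataframe from the question is available as df, and the names counts and result are only illustrative. Passing an array of per-element counts makes each value repeat a different number of times.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame([
    {'col1': 1, 'col2': 11, 'col3': [1, 2]},
    {'col1': 2, 'col2': 22, 'col3': [1, 2, 3]},
    {'col1': 3, 'col2': 33, 'col3': [1]},
    {'col1': 4, 'col2': 44, 'col3': [1, 2, 3, 4]},
])

# Number of repeats per row: the length of each list in col3.
counts = df['col3'].str.len().to_numpy()

# Repeat each column's values by the matching count and rebuild the frame.
result = pd.DataFrame({
    'col1': np.repeat(df['col1'].to_numpy(), counts),
    'col2': np.repeat(df['col2'].to_numpy(), counts),
})
print(result)
```

This builds a fresh default index, so no index reset is needed afterwards.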
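You can also use reindex together with index.repeat:
# Repeat each index label once per element of col3, then reindex on the repeated labels
df = df.reindex(df.index.repeat(df.col3.apply(len)))
# Reset the index and drop col3
df = df.reset_index(drop=True).drop("col3", axis=1)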
# Output:
   col1  col2
0     1    11
1     1    11
2     2    22
3     2    22
4     2    22
5     3    33
6     4    44
7     4    44
8     4    44
9     4    44
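For comparison, a different technique that should give the same result on this data is DataFrame.explode. This is a sketch rather than part of the answer above. It assumes a pandas version that provides explode (0.25 or later), and the name result is only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame([
    {'col1': 1, 'col2': 11, 'col3': [1, 2]},
    {'col1': 2, 'col2': 22, 'col3': [1, 2, 3]},
    {'col1': 3, 'col2': 33, 'col3': [1]},
    {'col1': 4, 'col2': 44, 'col3': [1, 2, 3, 4]},
])

# explode() emits one row per list element, copying the other columns.
# Dropping col3 and resetting the index leaves only the repeated rows.
result = (
    df.explode('col3')
      .drop(columns='col3')
      .reset_index(drop=True)
)
print(result)
```

One difference to keep in mind: a row with an empty list in col3 is kept once by explode (with NaN in col3), whereas the index.repeat approach drops it because its repeat count is 0.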