
repeating the rows of a data frame

I'm trying to repeat the rows of a dataframe. Here's my original data:

import pandas as pd

df = pd.DataFrame([
    {'col1': 1, 'col2': 11, 'col3': [1, 2]},
    {'col1': 2, 'col2': 22, 'col3': [1, 2, 3]},
    {'col1': 3, 'col2': 33, 'col3': [1]},
    {'col1': 4, 'col2': 44, 'col3': [1, 2, 3, 4]},
])

which gives me

   col1  col2          col3
0     1    11        [1, 2]
1     2    22     [1, 2, 3]
2     3    33           [1]
3     4    44  [1, 2, 3, 4]

I'd like to repeat each row as many times as there are elements in its col3 list, i.e. I'd like to get a dataframe like this one:

   col1  col2
0     1    11
1     1    11
2     2    22
3     2    22
4     2    22
5     3    33
6     4    44
7     4    44
8     4    44
9     4    44

What's a good way of accomplishing this?

zinyosrim asked Sep 16 '18 08:09



1 Answer

You can use reindex together with index.repeat:

# Repeat each index label once per element of that row's col3 list
df = df.reindex(df.index.repeat(df.col3.apply(len)))

# Reset the index and drop col3
df = df.reset_index(drop=True).drop("col3", axis=1)

Output:

   col1  col2
0     1    11
1     1    11
2     2    22
3     2    22
4     2    22
5     3    33
6     4    44
7     4    44
8     4    44
9     4    44
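
For completeness, here is a self-contained sketch of the same idea that can be pasted and run as-is. It rebuilds the dataframe from the question and selects the repeated labels with df.loc instead of reindex, which is a swapped-in variant rather than the exact code above; the names counts and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame([
    {"col1": 1, "col2": 11, "col3": [1, 2]},
    {"col1": 2, "col2": 22, "col3": [1, 2, 3]},
    {"col1": 3, "col2": 33, "col3": [1]},
    {"col1": 4, "col2": 44, "col3": [1, 2, 3, 4]},
])

# How many times each row should appear: the length of its col3 list.
counts = df["col3"].map(len)

# Repeat every index label `counts` times, select those rows (keeping only
# col1 and col2), then renumber the index from 0.
result = (
    df.loc[df.index.repeat(counts), ["col1", "col2"]]
      .reset_index(drop=True)
)

print(result)
```

Both variants rely on index.repeat to build the list of repeated labels, so they assume the index of the original dataframe has no duplicate labels, as is the case for the default RangeIndex here.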
Abhi answered Oct 06 '22 00:10