
Duplicate row based on value in different column


I have a dataframe of transactions. Each row represents a transaction of two items (think of it as a sale of two event tickets). I want to duplicate each row according to the quantity sold.

Here's example code:

    import pandas as pd

    # dictionary of transactions
    d = {
        '1': ['20',  'NYC', '2'],
        '2': ['30',  'NYC', '2'],
        '3': ['5',   'NYC', '2'],
        '4': ['300', 'LA',  '2'],
        '5': ['30',  'LA',  '2'],
        '6': ['100', 'LA',  '2']
    }

    columns = ['Price', 'City', 'Quantity']

    # create dataframe and rename columns
    df = pd.DataFrame.from_dict(
        data=d, orient='index'
    )
    df.columns = columns

This produces a dataframe that looks like this:

    Price   City    Quantity
    20      NYC     2
    30      NYC     2
    5       NYC     2
    300     LA      2
    30      LA      2
    100     LA      2

So in the case above, each row should become two identical rows. If the value in the 'Quantity' column were 3, that row should become three identical rows.

MRA asked Sep 26 '15 00:09



1 Answer

You can do this with repeat on the index:

    df.loc[df.index.repeat(df.Quantity)]
    Out[448]:
      Price City Quantity
    1    20  NYC        2
    1    20  NYC        2
    2    30  NYC        2
    2    30  NYC        2
    3     5  NYC        2
    3     5  NYC        2
    4   300   LA        2
    4   300   LA        2
    5    30   LA        2
    5    30   LA        2
    6   100   LA        2
    6   100   LA        2
BENY answered Sep 25 '22 21:09
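For reference, here is a self-contained sketch of the repeat approach run against the question's own data. Because the question builds every column from strings, the Quantity values may need to be cast to integers before they can serve as repeat counts; the cast and the final reset_index step are additions for illustration, not part of the answer above. The names `counts` and `expanded` are hypothetical.

```python
import pandas as pd

# Same data as in the question; note that every value is a string.
d = {
    '1': ['20',  'NYC', '2'],
    '2': ['30',  'NYC', '2'],
    '3': ['5',   'NYC', '2'],
    '4': ['300', 'LA',  '2'],
    '5': ['30',  'LA',  '2'],
    '6': ['100', 'LA',  '2'],
}
df = pd.DataFrame.from_dict(d, orient='index',
                            columns=['Price', 'City', 'Quantity'])

# Assumption: repeat counts should be integers, so cast the string column.
counts = df['Quantity'].astype(int)

# Repeat each index label by its count, then select those labels with .loc.
expanded = df.loc[df.index.repeat(counts)]

# Optional: swap the duplicated labels for a fresh 0..n-1 index.
expanded = expanded.reset_index(drop=True)

print(expanded)
```

Dropping the reset_index call keeps the original labels, which makes it easy to see which source row each duplicate came from.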
