How can I split each row into multiple rows based on the count in a column in R?

Tags: split, dataframe, r

For example, suppose you have the following dataframe:

ID <- c("11", "12", "13", "14", "14")
Date <- c("2020-01-01", "2020-02-01", "2020-03-15", "2020-04-10", "2020-06-01")
Item <- c("Item1", "Item1", "Item2", "Item2", "Item2")
ItemPrice <- c(5, 5, 7, 7, 7)
Quantity <- c(1, 2, -2, 2, 3)
Cost <- c(5, 10, -14, 14, 21)
df <- data.frame(ID, Date, Item, ItemPrice, Quantity, Cost)
df

  ID       Date  Item ItemPrice Quantity Cost
1 11 2020-01-01 Item1         5        1    5
2 12 2020-02-01 Item1         5        2   10
3 13 2020-03-15 Item2         7       -2  -14
4 14 2020-04-10 Item2         7        2   14
5 14 2020-06-01 Item2         7        3   21

Now suppose you want to split each row by Quantity, so that every resulting row represents an individual sale (or return, for negative quantities), like the following:

   ID       Date  Item ItemPrice Quantity Cost
1  11 2020-01-01 Item1         5        1    5
2  12 2020-02-01 Item1         5        1    5
3  12 2020-02-01 Item1         5        1    5
4  13 2020-03-15 Item2         7       -1   -7
5  13 2020-03-15 Item2         7       -1   -7
6  14 2020-04-10 Item2         7        1    7
7  14 2020-04-10 Item2         7        1    7
8  14 2020-06-01 Item2         7        1    7
9  14 2020-06-01 Item2         7        1    7
10 14 2020-06-01 Item2         7        1    7

How could this be achieved?

GM01 asked Apr 02 '21 21:04



1 Answer

Create a count column 'cnt' holding the absolute value of 'Quantity', replace 'Quantity' with its sign (1 or -1), divide 'Cost' by 'cnt' to get the per-unit cost, and then replicate each row 'cnt' times with uncount():

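library(dplyr)
library(tidyr)

df %>%
  # cnt: copies per row; Quantity becomes +1/-1; Cost becomes the per-unit cost
  mutate(cnt = abs(Quantity),
         Quantity = sign(Quantity),
         Cost = Cost / cnt) %>%
  # repeat each row cnt times; uncount() drops the cnt column afterwards
  uncount(cnt) %>%
  as_tibble()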

Output:

# A tibble: 10 x 6
#   ID    Date       Item  ItemPrice Quantity  Cost
#   <chr> <chr>      <chr>     <dbl>    <dbl> <dbl>
# 1 11    2020-01-01 Item1         5        1     5
# 2 12    2020-02-01 Item1         5        1     5
# 3 12    2020-02-01 Item1         5        1     5
# 4 13    2020-03-15 Item2         7       -1    -7
# 5 13    2020-03-15 Item2         7       -1    -7
# 6 14    2020-04-10 Item2         7        1     7
# 7 14    2020-04-10 Item2         7        1     7
# 8 14    2020-06-01 Item2         7        1     7
# 9 14    2020-06-01 Item2         7        1     7
#10 14    2020-06-01 Item2         7        1     7
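
For comparison, the same steps can be sketched in base R without dplyr or tidyr. This is an illustrative variant, not the code above: the names `cnt` and `expanded` are chosen here for the example, and it assumes 'Quantity' always holds whole numbers. Rows with a 'Quantity' of 0 would simply be dropped.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID = c("11", "12", "13", "14", "14"),
  Date = c("2020-01-01", "2020-02-01", "2020-03-15", "2020-04-10", "2020-06-01"),
  Item = c("Item1", "Item1", "Item2", "Item2", "Item2"),
  ItemPrice = c(5, 5, 7, 7, 7),
  Quantity = c(1, 2, -2, 2, 3),
  Cost = c(5, 10, -14, 14, 21)
)

# Number of copies wanted for each row
cnt <- abs(df$Quantity)

# Repeat row index i cnt[i] times, then subset the data frame with those indices
expanded <- df[rep(seq_len(nrow(df)), times = cnt), ]

# Per-unit cost first (it needs the original Quantity), then reduce Quantity to its sign
expanded$Cost <- expanded$Cost / abs(expanded$Quantity)
expanded$Quantity <- sign(expanded$Quantity)

# Reset row names so they run 1..n instead of 1, 2, 2.1, ...
rownames(expanded) <- NULL
expanded
```

A quick sanity check on either approach is that the per-row costs still add up to the original total, i.e. sum(expanded$Cost) equals sum(df$Cost).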
akrun answered Sep 18 '22 14:09