
Cumulative Column Sum To Inform Another Column Value in R

Tags:

r

I have a data frame (pricedata) that contains transaction data for customers. I arrange() the data frame by TNP, from highest to lowest. I want to move down from the highest TNP, accumulating TNP until I reach 60% of the total TNP for the entire column, and give all of those customers class = "A". The subsequent customers, whose cumulative TNP falls between 60% and 90% of the column total, should get class = "B", and the remainder class = "C".

This is the data:

   TNP      Class
 11847.47     C
 11845.76     C
 11840.06     C
 11814.44     C
 11775.24     C
 11766.90     C

This is the function I have built to achieve my goal:

FUN.sthp.1 <- function(TNP) {
  # pricedata (a global) is expected to be sorted by TNP, highest first
  # Cutoffs are shares of the column total
  total_tnp <- sum(pricedata$TNP)
  Acut <- 0.6 * total_tnp
  Bcut <- 0.9 * total_tnp

  # Running total: sum of every TNP at least as large as this one
  Summary.DF <- pricedata[pricedata$TNP >= TNP, ]
  total <- sum(Summary.DF$TNP)

  # "A" up to 60%, "B" from 60% to 90%, "C" for the rest
  Class <- ifelse(total <= Acut, "A",
                  ifelse(total <= Bcut, "B", "C"))
  return(Class)
}

I then call this function using mapply(), passing it the TNP column.

Is there a function, or a feature within a function, that will achieve the same thing without all the typing?

Sean D Philippe asked Dec 06 '25 03:12


1 Answer

You can use cumsum() to get the cumulative sum, then divide it by the total sum to get the cumulative share. From there you can assign the classes using cut(). cut() has options that control what happens at the margins (at exactly 60%, for example); see the help file with ?cut.

"TNP      Class
11847.47     C
11845.76     C
11840.06     C
11814.44     C
11775.24     C
11766.90     C" -> myDat
out <- read.table(text = myDat, stringsAsFactors = FALSE, header = TRUE)
cut(cumsum(out$TNP)/sum(out$TNP), breaks =c(0,60,90,100)/100, labels = LETTERS[1:3])
[1] A A A B B C
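
To write the result back into the data frame, something along these lines should work. This is a minimal base R sketch, not a definitive implementation: the pricedata built here is a stand-in holding only the six TNP values from the question, and cum_share is a hypothetical helper name. The last break is set to Inf rather than 1 as a precaution, on the assumption that floating-point rounding could push the final share marginally above 1, in which case cut() would return NA.

```r
# Stand-in for the real pricedata: only the TNP values shown in the question,
# deliberately unsorted to show the ordering step.
pricedata <- data.frame(
  TNP = c(11775.24, 11847.47, 11766.90, 11840.06, 11845.76, 11814.44)
)

# Sort from highest to lowest TNP so the running total starts at the top customer.
pricedata <- pricedata[order(pricedata$TNP, decreasing = TRUE), , drop = FALSE]

# Share of the column total reached at each row (hypothetical helper name).
cum_share <- cumsum(pricedata$TNP) / sum(pricedata$TNP)

# With the default right = TRUE the intervals are (0, 0.6], (0.6, 0.9], (0.9, Inf),
# so a customer landing exactly on 60% is still classed "A".
pricedata$Class <- cut(cum_share,
                       breaks = c(0, 0.6, 0.9, Inf),
                       labels = c("A", "B", "C"))

print(pricedata)
```

Passing right = FALSE to cut() would instead close the intervals on the left, so a customer landing exactly on 60% would fall into "B".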
jdharrison answered Dec 08 '25 16:12