How to get the mode of a group in summarize in R

Tags: r, dplyr, mode, statistics

I want to compare costs of CPT codes from two different claims payers. Both have par and non-par priced providers. I am using dplyr and modeest::mlv, but it's not working out as anticipated. Here's some sample data:

source CPTCode ParNonPar Key         net_paid  PaidFreq seq
ABC   100       Y      ABC100Y  -341.00     6   1
ABC   100       Y      ABC100Y     0.00     2   2
ABC   100       Y      ABC100Y   341.00     6   3
XYZ   103       Y      XYZ103Y   740.28     1   1
XYZ   104       N      XYZ104N     0.00     2   1
XYZ   104       N      XYZ104N   401.82     1   2
XYZ   104       N      XYZ104N   726.18     1   3
XYZ   104       N      XYZ104N   893.00     1   4
XYZ   104       N      XYZ104N   928.20     2   5
XYZ   104       N      XYZ104N   940.00     2   6

and the code

str(data)
View(data)

## Expand frequency count to individual observations
## (the frequency column in the sample data is PaidFreq)
n.times <- data$PaidFreq
dataObs <- data[rep(seq_len(nrow(data)), n.times),]

## Calculate mean for each CPTCode (for mode use modeest library)
library(dplyr)
library(modeest)
dataSummary <- dataObs %>%
  group_by(ParNonPar, CPTCode) %>%
  summarise(mean = mean(net_paid),
            median=median(net_paid),
            mode = mlv(net_paid, method=mfv),
            total = sum(net_paid))
str(dataSummary)

I thought I could call mlv inside summarise alongside the mean and median, but this formulation fails with:

Error in as.character(x) : cannot coerce type 'closure' to vector of type 'character'

Without mlv I get a data frame like the one below, but what I want is all the stats for a payer and CPT code on one line. Once I have what I need on one row, I plan to graph it as boxplots, limiting the x and y segments.

The inadequate result is this (I forgot to include the payer name!):

ParNonPar  CPTCode  mean        median(net_paid)  total
N          0513F      0.000000    0.000              0.00
N          0518F      0.000000    0.000              0.00
N          10022      0.000000    0.000              0.00
N          10060     73.660000   90.120            294.64
N          10061    324.575000  340.500           1298.30
N          10081    312.000000  312.000            312.00

Thanks very much for your time and effort.


asked May 21 '15 22:05

drew

2 Answers

I use this approach:

df <- data.frame(groups = c("A", "A", "A", "B", "B", "C", "C", "C", "D"),
                 nums = c(1, 2, 1, 2, 3, 4, 5, 5, 1))

which looks like:

 groups nums
  A    1
  A    2
  A    1
  B    2
  B    3
  C    4
  C    5
  C    5
  D    1

Then I define:

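# Most frequent value. tabulate() only counts positive integers (or factor
# codes), so negative, zero, or non-integer amounts are not counted correctly.
# Note that this definition masks base::mode().
mode <- function(codes){
  which.max(tabulate(codes))
}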

and do the following:

mds <- df %>%
  group_by(groups) %>%
  summarise(mode = mode(nums))

giving:

  groups  mode
 A          1
 B          2
 C          5
 D          1
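Since the question's net_paid column contains zeros, negatives, and decimals, here is a minimal sketch of a more general helper. It is not part of the approach above: the name stat_mode is made up for illustration, and the tie rule (return the value that appears first) is an assumption you may want to change.

```r
# Hypothetical helper: most frequent value of a vector.
# Works for numeric, character, and factor input because it counts
# positions in unique(x) rather than the values themselves.
stat_mode <- function(x) {
  x <- x[!is.na(x)]                       # ignore missing values
  if (length(x) == 0) return(NA)          # nothing left to count
  ux <- unique(x)                         # distinct values, in order of appearance
  counts <- tabulate(match(x, ux))        # how often each distinct value occurs
  ux[which.max(counts)]                   # first value with the highest count
}

stat_mode(c(1, 2, 1))
stat_mode(c(-341, 0, 341, 341))
stat_mode(c("b", "a", "b"))
```

It can be dropped into summarise() the same way, for example summarise(mode = stat_mode(net_paid)).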


answered Sep 29 '22 12:09

orrymr

You need to make a couple of changes to your code for mlv to work.

1. The method (mfv) has to be within quotes ('mfv'). Unquoted, mfv refers to the function of that name rather than a string, which is what causes your error.
2. After you do that, since mlv returns a list, you have to feed a single value to summarise(). Assuming that you want the mode ('M'), pick that element from the list.
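The first point can be reproduced without modeest at all. This is only an illustrative sketch: base R's mean stands in for any function object (such as mfv), and the assumption is that mlv tries to convert its method argument to a character string.

```r
# Converting a function object (a "closure") to character fails.
# mean is used here only as a stand-in for any function, such as mfv.
msg <- tryCatch(
  as.character(mean),
  error = function(e) conditionMessage(e)
)
print(msg)

# A quoted name is an ordinary string, so the same conversion succeeds.
as.character("mfv")
```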

Try:

dataSummary <- dataObs %>%
  group_by(ParNonPar, CPTCode) %>%
  summarise(mean = mean(net_paid),
            median = median(net_paid),
            mode = mlv(net_paid, method='mfv')[['M']],
            total = sum(net_paid))

to get:

> dataSummary
Source: local data frame [3 x 6]
Groups: ParNonPar

  ParNonPar CPTCode     mean median     mode   total
1         N     104 639.7111 893.00 622.7333 5757.40
2         Y     100   0.0000   0.00   0.0000    0.00
3         Y     103 740.2800 740.28 740.2800  740.28
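For anyone who wants to run this end to end, here is a self-contained sketch built from the sample rows in the question. A few things in it are assumptions rather than part of the answer: the names claims, claims_obs, claims_summary, and first_mode are made up, the mode is computed with a small base R helper instead of mlv so the example does not depend on a particular modeest version, and source is added to group_by() so the payer name shows up on each row. The helper breaks ties by taking the first value seen, so its mode column will not match the output above wherever several values are tied.

```r
library(dplyr)

# Sample rows copied from the question (Key and seq are omitted).
claims <- data.frame(
  source    = c(rep("ABC", 3), rep("XYZ", 7)),
  CPTCode   = c(100, 100, 100, 103, rep(104, 6)),
  ParNonPar = c("Y", "Y", "Y", "Y", rep("N", 6)),
  net_paid  = c(-341, 0, 341, 740.28, 0, 401.82, 726.18, 893, 928.20, 940),
  PaidFreq  = c(6, 2, 6, 1, 2, 1, 1, 1, 2, 2)
)

# Repeat each row PaidFreq times so every payment is one observation.
claims_obs <- claims[rep(seq_len(nrow(claims)), claims$PaidFreq), ]

# Hypothetical helper: most frequent value; ties go to the first value seen.
first_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# One row per payer / par status / CPT code with all four statistics.
claims_summary <- claims_obs %>%
  group_by(source, ParNonPar, CPTCode) %>%
  summarise(
    mean   = mean(net_paid),
    median = median(net_paid),
    mode   = first_mode(net_paid),
    total  = sum(net_paid),
    .groups = "drop"
  )

print(claims_summary)
```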

Hope that helps you move forward.

answered Sep 29 '22 10:09

Ram Narasimhan
