I have a conceptual problem that I can't get my head around.
Below is an example of survey data with a time column that indicates how long a respondent needed to answer a given question.
I clean cases by cutting at a time threshold, and I'm interested in how the amount of cleaning would change with that threshold, i.e. what would happen if I increased or decreased it.
So my idea was to create a ROC curve (or other model metrics) to get a visual cue about a suitable threshold. The problem is that I don't have a machine-learning-like model that would give me class probabilities, so I was wondering whether there's any way to create a ROC curve with this type of data nonetheless. I thought about looping through my data at maybe 100 different thresholds, calculating the false and true positive rates at each threshold and then drawing a simple line plot, but I was hoping for a more elegant solution that doesn't require a loop.
Any ideas?
Example data:
The time column indicates the time needed per case.
The truth column indicates my current decision, which I want to compare against.
The predicted column indicates the cleaning decision if I cut at a time threshold of 2.5s. This is what I need to change/loop through.
library(dplyr)  # for %>%, mutate() and if_else()
set.seed(3)
df <- data.frame(time = c(2.5 + rnorm(5), 3.5 + rnorm(5)),
                 truth = rep(c("cleaned", "final"), each = 5)) %>%
  mutate(predicted = if_else(time < 2.5, "cleaned", "final"))
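You can use ROCR for this. No model is needed: the raw time values serve as the score.
library(ROCR)
library(dplyr)  # for %>%, mutate() and if_else()
set.seed(3)
df <- data.frame(time = c(2.5 + rnorm(5), 3.5 + rnorm(5)),
                 truth = rep(c("cleaned", "final"), each = 5)) %>%
  mutate(predicted = if_else(time < 2.5, "cleaned", "final"))
# Pair each time (the score) with its true label
pred <- prediction(df$time, df$truth)
# True and false positive rates across the observed cutoffs
perf <- performance(pred, "tpr", "fpr")
plot(perf, colorize = TRUE)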

You can also check the AUC value:
auc <- performance(pred, measure = "auc")
auc@y.values[[1]]
[1] 0.92
Cross-checking the AUC value with pROC:
library(pROC)
roc(df$truth, df$time)
Call:
roc.default(response = df$truth, predictor = df$time)
Data: df$time in 5 controls (df$truth cleaned) < 5 cases (df$truth final).
Area under the curve: 0.92
Both packages give the same AUC.
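If you would rather see what happens at each cutoff without a package and without an explicit loop, the rates can be computed for all cutoffs at once with cumulative sums over the sorted times. The following is only a minimal base R sketch under a few assumptions: "cleaned" is treated as the positive class, there are no tied times, and the names `is_cleaned`, `share_cleaned`, `flag_up_to` and `roc_tab` are made up for illustration. On the example data the area it computes should agree with the 0.92 reported above.

```r
# Same toy data as in the question, rebuilt in base R so the sketch is self-contained.
set.seed(3)
time  <- c(2.5 + rnorm(5), 3.5 + rnorm(5))
truth <- rep(c("cleaned", "final"), each = 5)

# Sort cases from fastest to slowest. Flagging every case up to the k-th
# fastest time marks exactly the first k sorted cases as "cleaned".
# Assumption: no tied times.
ord        <- order(time)
is_cleaned <- truth[ord] == "cleaned"

# Cumulative counts give the rates for every possible cutoff at once.
# The leading 0 is the cutoff below all times (nothing flagged).
tpr <- c(0, cumsum(is_cleaned)  / sum(is_cleaned))   # share of cleaned cases flagged
fpr <- c(0, cumsum(!is_cleaned) / sum(!is_cleaned))  # share of final cases flagged

# Share of all cases that would be removed at each cutoff.
share_cleaned <- seq(0, length(time)) / length(time)

# Area under the step curve via the trapezoid rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# One row per cutoff: all cases with time <= flag_up_to are flagged.
roc_tab <- data.frame(flag_up_to = c(-Inf, time[ord]), share_cleaned, fpr, tpr)
print(roc_tab)

plot(fpr, tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
```

The `share_cleaned` column speaks to the original question of how much data a given threshold removes, while `fpr` and `tpr` show how that removal compares with the current decision in truth.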