Binning data according to a threshold?

Tags:

r

Let's say I have a response variable that rises and falls over time. Each time the response variable rises above a threshold, we have a new "Trial." That is, if I add a column Threshold that is TRUE whenever the response is above a certain value, each consecutive block of data points where Threshold is TRUE constitutes a new trial.

Time <- seq(1, 10, by = 0.5)
Response <- abs(sin(Time))
Threshold <- Response > 0.6
data <- data.frame(Time, Response, Threshold)

Given Time, Response, and Threshold, how could I go about adding a Trial factor that has a new value for each group of TRUE thresholds? Something like this:

   Time   Response Threshold Trial
1   1.0 0.84147098      TRUE A
2   1.5 0.99749499      TRUE A
3   2.0 0.90929743      TRUE A
4   2.5 0.59847214     FALSE NA
5   3.0 0.14112001     FALSE NA
6   3.5 0.35078323     FALSE NA
7   4.0 0.75680250      TRUE B
8   4.5 0.97753012      TRUE B
9   5.0 0.95892427      TRUE B
10  5.5 0.70554033      TRUE B
11  6.0 0.27941550     FALSE NA
12  6.5 0.21511999     FALSE NA
13  7.0 0.65698660      TRUE C
14  7.5 0.93799998      TRUE C
15  8.0 0.98935825      TRUE C
16  8.5 0.79848711      TRUE C
17  9.0 0.41211849     FALSE NA
18  9.5 0.07515112     FALSE NA
19 10.0 0.54402111     FALSE NA

asked Jan 24 '14 04:01

sudo make install

2 Answers

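# cumsum(!Threshold) only increases on FALSE rows, so it is constant within
# each run of TRUEs and acts as a run id; FALSE rows are set to NA.
data$Trial <- factor(
  ifelse(data$Threshold, cumsum(!data$Threshold), NA), labels = c("A", "B", "C")
)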

##    Time   Response Threshold Trial
## 1   1.0 0.84147098      TRUE     A
## 2   1.5 0.99749499      TRUE     A
## 3   2.0 0.90929743      TRUE     A
## 4   2.5 0.59847214     FALSE  <NA>
## 5   3.0 0.14112001     FALSE  <NA>
## 6   3.5 0.35078323     FALSE  <NA>
## 7   4.0 0.75680250      TRUE     B
## 8   4.5 0.97753012      TRUE     B
## 9   5.0 0.95892427      TRUE     B
## 10  5.5 0.70554033      TRUE     B
## 11  6.0 0.27941550     FALSE  <NA>
## 12  6.5 0.21511999     FALSE  <NA>
## 13  7.0 0.65698660      TRUE     C
## 14  7.5 0.93799998      TRUE     C
## 15  8.0 0.98935825      TRUE     C
## 16  8.5 0.79848711      TRUE     C
## 17  9.0 0.41211849     FALSE  <NA>
## 18  9.5 0.07515112     FALSE  <NA>
## 19 10.0 0.54402111     FALSE  <NA>
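
The call above hard-codes three labels, so it only fits data with exactly three trials. A minimal sketch of a variant that derives the labels from the data, assuming single capital letters are acceptable labels and there are at most 26 trials (`run_id` and `n_trials` are names introduced here for illustration, not part of the answer):

```r
# Rebuild the example data from the question
Time <- seq(1, 10, by = 0.5)
Response <- abs(sin(Time))
Threshold <- Response > 0.6
data <- data.frame(Time, Response, Threshold)

# run_id (hypothetical name): constant within each run of TRUEs, NA elsewhere
run_id <- ifelse(data$Threshold, cumsum(!data$Threshold), NA)

# n_trials (hypothetical name): number of distinct runs above the threshold
n_trials <- length(unique(na.omit(run_id)))

# factor() sorts the distinct ids, so labels are assigned in time order.
# Assumption: n_trials <= 26, because LETTERS has only 26 entries.
data$Trial <- factor(run_id, labels = LETTERS[seq_len(n_trials)])
```

On the example data this should give the same `Trial` column as the hard-coded version, since the three run ids are relabelled in increasing order.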

answered Sep 29 '22 05:09

Jake Burkhead

Another possibility using rle:

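# Run-length encode Threshold: r$values flags each run, r$lengths gives its size
r <- with(data, rle(Threshold))
len <- with(r, lengths[values])  # lengths of the TRUE runs only
n <- length(len)                 # number of trials

# One letter per trial, repeated once for each row in that run
trial <- rep(x = LETTERS[seq_len(n)], times = len)

# Fill only the rows above the threshold; in a new column the rest are NA
data$Trial[data$Threshold] <- trial

data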
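
When `Trial` does not already exist, the assignment above creates a character column rather than the factor the question asks for. A minimal sketch of the same `rle` idea wrapped in a function that returns a factor, assuming the input has no `NA` values and at most 26 trials (`label_trials`, `above`, `run_no`, and `id` are hypothetical names used only for this illustration):

```r
# label_trials (hypothetical helper): map a logical vector to a factor of trials
label_trials <- function(above) {
  r <- rle(above)
  # Number the TRUE runs 1, 2, ...; FALSE runs get NA
  run_no <- ifelse(r$values, cumsum(r$values), NA)
  # Expand the run numbers back to one entry per observation
  id <- rep(run_no, times = r$lengths)
  # Assumption: at most 26 trials, since LETTERS has 26 entries
  factor(LETTERS[id], levels = LETTERS[seq_len(sum(r$values))])
}

# Usage with the threshold vector from the question
Threshold <- abs(sin(seq(1, 10, by = 0.5))) > 0.6
Trial <- label_trials(Threshold)
```

Setting `levels` explicitly keeps the trial order fixed even if a later subset of the data drops one of the trials.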

answered Sep 29 '22 05:09

Henrik
