I am looking to create a new data.table that contains all rows with at least one negative value.
Here is a simple reproducible datatable:
dt <- data.table(
ID = c(42, 43, 44),
Stage_1 = c(-6, 7, 4),
Stage_2 = c(-15, 4, -8),
Stage_3 = c(-20, 2, -5)
)
# ID Stage_1 Stage_2 Stage_3
# 1: 42 -6 -15 -20 # <~~ row to be selected (> 0 negative values)
# 2: 43 7 4 2
# 3: 44 4 -8 -5 # <~~ row to be selected (> 0 negative values)
My desired output would be:
dt2 <- data.table(
ID = c(42, 44),
Stage_1 = c(-6, 4),
Stage_2 = c(-15, -8),
Stage_3 = c(-20, -5)
)
# ID Stage_1 Stage_2 Stage_3
# 1: 42 -6 -15 -20
# 2: 44 4 -8 -5
ID 44 for example, has two negative values but I would like to include all of their rows from the main datatable. Basically all rows with a negative value in any of their columns I would like to add to a new datatable that contains all their information.
The actual dataset I'm working with has ~50 stage columns, so the most efficient solution is what I'm after.
dt <- data.table::data.table(
ID = c(42, 43, 44),
Stage_1 = c(-6, 7, 4),
Stage_2 = c(-15, 4, -8),
Stage_3 = c(-20, 2, -5)
)
dt
#> ID Stage_1 Stage_2 Stage_3
#> 1: 42 -6 -15 -20
#> 2: 43 7 4 2
#> 3: 44 4 -8 -5
apply() function:dt[apply(dt[, -'ID'], 1, min) < 0, ]
#> ID Stage_1 Stage_2 Stage_3
#> 1: 42 -6 -15 -20
#> 2: 44 4 -8 -5
rowMeans() function based on the fact that average of a boolean vector with at least one true value is always greater than zero (thanks to @utubun):dt[rowMeans(dt[, -'ID'] < 0) > 0, ]
#> ID Stage_1 Stage_2 Stage_3
#> 1: 42 -6 -15 -20
#> 2: 44 4 -8 -5
rowMins() function of the fBasics package:dt[fBasics::rowMins(dt[, -'ID']) < 0, ]
#> ID Stage_1 Stage_2 Stage_3
#> 1: 42 -6 -15 -20
#> 2: 44 4 -8 -5
# Created on 2021-02-19 by the reprex package (v0.3.0.9001)
(Related to Equivalent to rowMeans() for min())
Regards,
dt[Stage_1 < 0 | Stage_2 < 0 | Stage_3 < 0]
# ID Stage_1 Stage_2 Stage_3
# 1: 42 -6 -15 -20
# 2: 44 4 -8 -5
Edit after OP's clarification:
With many columns:
# Find all the rows with at least one negative in each column
rowsDT <- dt[, lapply(.SD, function(x) which(x < 0)), .SDcols = -'ID']
# Reduce to a vector by applying union
rows <- Reduce(union, rowsDT)
# Extract from the main data.table
dt[rows]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With