I have a problem with the cut function. I have this situation:
codice
1 11GP2-0016
2 11GP2-0016
3 11GP2-0016
4 11OL2-074
5 11OL2-074
and I would like to have a new variable "campione", split by the variable "codice", like this:
codice campione
1 11GP2-0016 [1,3]
2 11GP2-0016 [1,3]
3 11GP2-0016 [1,3]
4 11OL2-074 (4,5]
5 11OL2-074 (4,5]
How can I use the cut function to split "codice", creating a variable that shows that rows 1 to 3 have the same code, rows 4 to 5 have the same code, and so on?
I also have a related question. For the same data I would like to obtain:
codice campione
1 11GP2-0016 [11GP2-0016,11GP2-0016,11GP2-0016]
2 11GP2-0016 [11GP2-0016,11GP2-0016,11GP2-0016]
3 11GP2-0016 [11GP2-0016,11GP2-0016,11GP2-0016]
4 11OL2-074 (11OL2-074,11OL2-074]
5 11OL2-074 (11OL2-074,11OL2-074]
Is there any solution to do this?
This will do it. You can add brackets/parens, if you want.
dat <- read.table(text='codice
1 11GP2-0016
2 11GP2-0016
3 11GP2-0016
4 11OL2-074
5 11OL2-074', header=TRUE)
within(dat,
campione <- with(rle(as.character(codice)), {
starts <- cumsum(c(1, head(lengths, -1)))  # first row of each run; also works if a code reappears later
ends <- starts + lengths - 1
inverse.rle(list(values=paste(starts, ends, sep=','), lengths=lengths))
})
)
# codice campione
# 1 11GP2-0016 1,3
# 2 11GP2-0016 1,3
# 3 11GP2-0016 1,3
# 4 11OL2-074 4,5
# 5 11OL2-074 4,5
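If you do want the brackets, they can be added in the paste step. Here is a minimal sketch using only base R; run_ranges is a hypothetical helper name, not something from the code above, and it assumes that rows with the same code are meant to be grouped only when they are adjacent:

```r
# Hypothetical helper: label each element with the [start,end] row range
# of the run of identical values it belongs to.
run_ranges <- function(x) {
  r <- rle(as.character(x))
  ends <- cumsum(r$lengths)        # last row of each run
  starts <- ends - r$lengths + 1   # first row of each run
  rep(paste0("[", starts, ",", ends, "]"), times = r$lengths)
}

dat <- data.frame(codice = c("11GP2-0016", "11GP2-0016", "11GP2-0016",
                             "11OL2-074", "11OL2-074"))
dat$campione <- run_ranges(dat$codice)
dat
```

Taking the starts and ends from the cumulative sum of the run lengths keeps the result correct for any number of runs.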
Using your data:
d <- read.table(text = "1 11GP2-0016
2 11GP2-0016
3 11GP2-0016
4 11OL2-074
5 11OL2-074", row.names = 1, stringsAsFactors = FALSE)
names(d) <- "codice"
Here is a slightly convoluted example using rle():
drle <- with(d, rle(codice))
This gives us the run lengths of codice:
> drle
Run Length Encoding
lengths: int [1:2] 3 2
values : chr [1:2] "11GP2-0016" "11OL2-074"
and it is the $lengths component that I manipulate to create two indices, the start (ind1) and the end (ind2) location of each run:
# start of each run = 1 + total length of all earlier runs
ind1 <- with(drle, rep(cumsum(c(1, head(lengths, -1))), times = lengths))
# end of each run = start + run length - 1
ind2 <- ind1 + with(drle, rep(lengths - 1, times = lengths))
Then I just paste these together:
d <- transform(d, campione = paste0("[", ind1, ",", ind2, "]"))
Giving
> head(d)
codice campione
1 11GP2-0016 [1,3]
2 11GP2-0016 [1,3]
3 11GP2-0016 [1,3]
4 11OL2-074 [4,5]
5 11OL2-074 [4,5]
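The second request in the question (repeating the code itself inside the brackets) is not covered above. One possible sketch, again based on rle(); code_runs is a hypothetical name, and square brackets are used for every run rather than the mixed "[" and "(" shown in the question:

```r
# Hypothetical helper: for each run of identical values, build a label that
# repeats the value once per row of the run, e.g. "[a,a,a]".
code_runs <- function(x) {
  r <- rle(as.character(x))
  labels <- mapply(
    function(v, n) paste0("[", paste(rep(v, n), collapse = ","), "]"),
    r$values, r$lengths, USE.NAMES = FALSE
  )
  rep(labels, times = r$lengths)   # one label per original row
}

d2 <- data.frame(codice = c("11GP2-0016", "11GP2-0016", "11GP2-0016",
                            "11OL2-074", "11OL2-074"),
                 stringsAsFactors = FALSE)
d2$campione <- code_runs(d2$codice)
d2
```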