Is there a way to find the indices at which the value of a factor column changes in R? For example:
x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")
would return 3, 5, 6, the position of the last element before each change.
You could try to compare shifted vectors, e.g.
which(x[-1] != x[-length(x)])
## [1] 3 5 6
This works on both character vectors and factors.
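As a minimal sketch of why the shifted comparison works, the two shifted copies can be written out explicitly. The names head_part, tail_part, and f are illustrative only and do not appear in the answer above.

```r
x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")

head_part <- x[-length(x)]  # elements 1 .. n-1
tail_part <- x[-1]          # elements 2 .. n

# Position i is reported when element i differs from element i + 1.
changes_chr <- which(tail_part != head_part)

# The same comparison on a factor; both subsets keep the same levels.
f <- factor(x)
changes_fct <- which(f[-1] != f[-length(f)])

print(changes_chr)
print(changes_fct)
```

Both calls should report the same positions, since the comparison only asks whether neighbouring elements are equal.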
which(!!diff(as.numeric(x)))
[1] 3 5 6
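This assumes that x really is a factor; for a character vector such as the one in the question, as.numeric(x) returns NA, so convert it with factor(x) first. Factors are stored internally as integer codes, so taking the difference gives a non-zero value at every change and zero elsewhere. The double negation then coerces the numbers to logical: zeroes become FALSE and all other numbers become TRUE. Finally, which locates the TRUE values, i.e. the non-zero differences.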
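The steps can be traced on the question's data in a short sketch. It assumes the input may be a character vector and therefore converts it with factor() first; the names codes, steps, idx, and y are illustrative only.

```r
x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")

codes <- as.integer(factor(x))  # internal integer codes: 1 1 1 2 2 3 4
steps <- diff(codes)            # 0 0 1 0 1 1
idx <- which(steps != 0)        # equivalent to which(!!steps)

# Levels are sorted by default, so a change is not always a step of +1.
y <- c("b", "b", "a", "c")
steps_y <- diff(as.integer(factor(y)))  # 0 -1 2
idx_y <- which(steps_y != 0)

print(idx)
print(idx_y)
```

Testing for a non-zero difference, rather than for a difference of exactly one, is what keeps the approach correct when the levels do not appear in sorted order.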
rle can be used for this. It expects an atomic vector, so for a factor pass as.character(x) instead of x:
head(cumsum(rle(x)$lengths), -1)
[1] 3 5 6
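A possible way to package the rle approach so that it accepts either a character vector or a factor is sketched below. The helper name change_indices is hypothetical and not part of base R.

```r
# change_indices is a hypothetical helper name, not a base R function.
change_indices <- function(v) {
  runs <- rle(as.character(v))    # as.character() makes factors acceptable to rle()
  head(cumsum(runs$lengths), -1)  # run end positions, minus the final one (end of vector)
}

x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")

res_chr <- change_indices(x)
res_fct <- change_indices(factor(x))

print(res_chr)
print(res_fct)
```

The cumulative sum of the run lengths gives the last position of each run, and dropping the final entry removes the end of the vector, which is not a change.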