
How to count and flag unique values in an R data frame

Tags:

r

I have the following dataframe:

data <- data.frame(week = c(rep("2014-01-06", 3), rep("2014-01-13", 3), rep("2014-01-20", 3)), values = c(1, 2, 3))

         week values
1 2014-01-06      1
2 2014-01-06      2
3 2014-01-06      3
4 2014-01-13      1
5 2014-01-13      2
6 2014-01-13      3
7 2014-01-20      1
8 2014-01-20      2
9 2014-01-20      3

I want to create a column in data that assigns each unique week a sequential number, so that the data frame looks like this:

         week values seq_value
1 2014-01-06      1         1
2 2014-01-06      2         1
3 2014-01-06      3         1
4 2014-01-13      1         2
5 2014-01-13      2         2
6 2014-01-13      3         2
7 2014-01-20      1         3
8 2014-01-20      2         3
9 2014-01-20      3         3
gh0strider18 asked Dec 14 '25 07:12


2 Answers

I guess the idiomatic way would be to calculate the actual week of the year from the date provided (which also gives you the real week number in case your weeks do not start from the first week of the year).

as.integer(format(as.Date(data$week), "%W"))
## [1] 1 1 1 2 2 2 3 3 3

Another base R solution would be to convert the dates with as.POSIXlt and use the yday component of the result (the zero-based day of the year):

as.POSIXlt(data$week)$yday %/% 7 + 1
## [1] 1 1 1 2 2 2 3 3 3

If you want a shorter syntax, the data.table package (among many others; see @Kshashaas comment) offers a quick wrapper:

library(data.table)
week(data$week)
## [1] 1 1 1 2 2 2 3 3 3

The nicest thing about this package is that you can create columns by reference (similar to @akrun's last solution, but probably more efficient because it doesn't require the by argument):

setDT(data)[, seq_value := week(week)]
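
Note that the week-of-year approaches above return calendar week numbers, which match the requested sequence here only because the sample data begin in the first week of 2014. As a minimal sketch, assuming hypothetical dates that start in March (the vector wk below is not from the question), the result could be rebased so that the earliest week becomes 1:

```r
# Hypothetical dates (not from the question) that start in March 2014
wk <- c("2014-03-03", "2014-03-03", "2014-03-10", "2014-03-17")

# Calendar week of the year, with Monday as the first day of the week
woy <- as.integer(format(as.Date(wk), "%W"))
woy
## [1]  9  9 10 11

# Rebase so that the earliest week in the data becomes 1
seq_value <- woy - min(woy) + 1L
seq_value
## [1] 1 1 2 3
```

This rebasing keeps any gaps between weeks and assumes that all dates fall within a single calendar year.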
David Arenburg answered Dec 16 '25 21:12


You could use base R by converting the "week" column to a factor and specifying the levels as the unique values of "week". Converting the factor to numeric then returns the numeric index of each level.

data$seq_value <- with(data, as.numeric(factor(week, levels = unique(week))))
data$seq_value
#[1] 1 1 1 2 2 2 3 3 3

Or match the "week" column against its own unique values to get the same numeric index.

with(data, match(week, unique(week)))
#[1] 1 1 1 2 2 2 3 3 3
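
Both base R variants number the weeks in order of first appearance, which is why the levels are set to unique(week) rather than left at the default (sorted) levels. A small sketch of the difference, assuming a hypothetical vector wk whose weeks are not in chronological order (it is not part of the question's data):

```r
# Hypothetical weeks listed out of chronological order
wk <- c("2014-01-20", "2014-01-20", "2014-01-06", "2014-01-13")

# Default factor levels are sorted, so the index follows sort order
by_sorted <- as.numeric(factor(wk))
by_sorted
#[1] 3 3 1 2

# Levels taken in order of first appearance
by_appearance <- as.numeric(factor(wk, levels = unique(wk)))
by_appearance
#[1] 1 1 2 3

# match() against the unique values gives the same index
by_match <- match(wk, unique(wk))
by_match
#[1] 1 1 2 3
```

If chronological numbering is wanted regardless of row order, the default sorted levels may be the better choice.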

With data.table, first convert the data.frame to a data.table by reference (setDT), then take the group index (.GRP) of the grouping variable 'week' and assign it to the new column seq_value.

library(data.table)
setDT(data)[, seq_value := .GRP, by = week][]
akrun answered Dec 16 '25 23:12


