
data.table not accepting 'by' and 'format' (for date) at the same time

Tags: r, data.table

I'm using data.table to find the mean date of a "session", but I'm having trouble trying to format it the way I want, and I'm confused as to what the problem is:

library( data.table )
data <- data.table( session = c( 1,1,1,1,2,2,2,2,2,2,3,3,3,3 ),
                    date = as.Date( c( "2016-01-01", "2016-01-02", "2016-01-03", "2016-01-03",
                                       "2016-04-30", "2016-04-30", "2016-05-03", "2016-05-03", "2016-05-03", "2016-05-03",
                                       "2016-08-28", "2016-08-28", "2016-08-28", "2016-08-28" ) )
)

What I want is to give each session a label based on when it took place. I've decided to label each session with the month in which it occurred (formatted as "%b-%Y"), but since sessions sometimes span two months, I want to take the mean date of the session and use that to decide on the label.

I can find the mean date of each session, using the by parameter:

output <- copy( data )[ , Month := mean( date ), by = session ]

I can also reformat a mean date the way I want within data.table:

output <- copy( data )[ , Month := format( mean( date ), "%b-%Y" ) ]

But I can't do both:

output <- copy( data )[ , Month := format( mean( date ), "%b-%Y" ), by = session ]

The above returns an error:

Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L,  : 
invalid 'trim' argument
In addition: Warning message:
In mean(date) : argument is not numeric or logical: returning NA

What am I doing wrong here? The code looks right to me, and each part works fine, so why isn't this working?

Note that I can do what I need in two steps (below), and that works fine, but I'd like to know what I'm missing. Something is wrong in the above code; I just can't see what it is:

output <- copy( data )[ , Month := mean( date ), by = session 
                        ][ , Month := format( Month, "%b-%Y" ) ]
rosscova asked Oct 19 '22 00:10


1 Answer

It works if you use mean.Date instead of mean:

output <- copy( data )[ , Month := format( mean.Date( date ), format="%b-%Y" ), by = session ]

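That way the result is still a Date, so format() dispatches to format.Date. The warning in your output shows that the plain mean() call ended up in mean.default, which returned NA; format() then used format.default, where "%b-%Y" was matched to the trim argument, hence the "invalid 'trim' argument" error.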
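If you would rather not call an S3 method by name, another option is to average the underlying day counts and convert back to a Date before formatting. This is only a sketch under a few assumptions: the helper name month_label and the explicit origin are my own choices and not part of the question, and the month abbreviations produced by "%b" depend on your locale.

```r
library(data.table)

# Same example data as in the question
data <- data.table(
  session = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-03",
                   "2016-04-30", "2016-04-30", "2016-05-03", "2016-05-03",
                   "2016-05-03", "2016-05-03",
                   "2016-08-28", "2016-08-28", "2016-08-28", "2016-08-28"))
)

# Hypothetical helper (not from the question): average the dates as plain
# day counts, turn the result back into a Date, then format it.
month_label <- function(d, fmt = "%b-%Y") {
  mean_days <- mean(as.numeric(d))                 # days since 1970-01-01
  format(as.Date(mean_days, origin = "1970-01-01"), fmt)
}

# One label per session, assigned by reference to every row of that session
output <- copy(data)[, Month := month_label(date), by = session]
```

Because the averaging is done on plain numbers, it does not matter which mean method ends up being called, and format() always receives a proper Date.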

HubertL answered Oct 30 '22 16:10