
melt.data.frame() changes how POSIXct columns are printed

Melting the dataframe t.wide changes how the column "time" (class POSIXct) is printed.

t.wide <- data.frame(product=letters[1:5], 
                     result=c(2, 4, 0, 0, 1), 
                     t1=as.POSIXct("2014-05-26") + seq(0, 10800, length.out=5),
                     t2=as.POSIXct("2014-05-27") + seq(0, 10800, length.out=5),
                     t3=as.POSIXct("2014-05-28") + seq(0, 10800, length.out=5))

library(reshape2)     
t.long <- melt(t.wide, measure.vars=c("t1", "t2", "t3"), value.name="time")
t.long$time
 [1] 1401055200 1401057900 1401060600 1401063300 1401066000 1401141600 1401144300
 [8] 1401147000 1401149700 1401152400 1401228000 1401230700 1401233400 1401236100
[15] 1401238800
attr(,"class")
[1] "POSIXct" "POSIXt" 

Strangely, if print() is called explicitly, the object is printed as expected (timestamps, not their numeric representation).

print(t.long$time)
 [1] "2014-05-26 00:00:00 CEST" "2014-05-26 00:45:00 CEST" "2014-05-26 01:30:00 CEST"
 [4] "2014-05-26 02:15:00 CEST" "2014-05-26 03:00:00 CEST" "2014-05-27 00:00:00 CEST"
 [7] "2014-05-27 00:45:00 CEST" "2014-05-27 01:30:00 CEST" "2014-05-27 02:15:00 CEST"
[10] "2014-05-27 03:00:00 CEST" "2014-05-28 00:00:00 CEST" "2014-05-28 00:45:00 CEST"
[13] "2014-05-28 01:30:00 CEST" "2014-05-28 02:15:00 CEST" "2014-05-28 03:00:00 CEST"

Setting the attributes to the same value as before magically changes how the object is printed.

attributes(t.long$time) <- attributes(t.long$time)
t.long$time
 [1] "2014-05-26 00:00:00 CEST" "2014-05-26 00:45:00 CEST" "2014-05-26 01:30:00 CEST"
 [4] "2014-05-26 02:15:00 CEST" "2014-05-26 03:00:00 CEST" "2014-05-27 00:00:00 CEST"
 [7] "2014-05-27 00:45:00 CEST" "2014-05-27 01:30:00 CEST" "2014-05-27 02:15:00 CEST"
[10] "2014-05-27 03:00:00 CEST" "2014-05-28 00:00:00 CEST" "2014-05-28 00:45:00 CEST"
[13] "2014-05-28 01:30:00 CEST" "2014-05-28 02:15:00 CEST" "2014-05-28 03:00:00 CEST"

Can anyone explain this behavior?

Tobias asked Jun 05 '14 11:06




1 Answer

UPDATE:

I opened this as Issue #50 on the git repo hadley/reshape2.


UPDATE: FIXED

This issue has been fixed in the development version of reshape2.

Thanks @kevin-ushey!


I believe the reason is that, after the reshaping, R does not treat t.long$time as a classed object. The OBJECT flag in the SEXP header of your vector (which indicates that the vector has a "class" attribute) is not being set, even though the attribute itself is present. When you copy the attributes back onto the vector, the OBJECT flag gets set and the correct print method is dispatched:

# No "OBJ" in SEXP header (the '[NAM(2),ATT]' part below)
 .Internal(inspect( t.long$time ) )
#@10359e548 14 REALSXP g0c6 [NAM(2),ATT] (len=15, tl=0) 1.40106e+09,...

# Now "OBJ" appears in the SEXP header, indicating the object bit is set
# (the vector has a class attribute), so the print method for POSIXct gets dispatched
attributes(t.long$time) <- attributes(t.long$time)
 .Internal(inspect( t.long$time ) )
#@1118d7f50 14 REALSXP g0c6 [OBJ,NAM(2),ATT] (len=15, tl=0) 1.40106e+09,...
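
The object bit can also be checked from R code with is.object(), without reading the inspect() output. The following is a minimal sketch that assumes nothing about reshape2; it only shows how the bit follows the "class" attribute on an ordinary POSIXct vector. The variable names x and y are made up for illustration.

```r
# A freshly created POSIXct vector has a "class" attribute,
# so its object bit is set.
x <- as.POSIXct("2014-05-26", tz = "UTC") + c(0, 2700, 5400)
is.object(x)

# unclass() drops the "class" attribute and clears the object bit;
# other attributes such as "tzone" are kept.
y <- unclass(x)
is.object(y)
names(attributes(y))

# Assigning the class again sets the object bit, so S3 dispatch
# (and therefore autoprinting via print.POSIXct) works once more.
class(y) <- c("POSIXct", "POSIXt")
is.object(y)
```

In the melted data frame from the question, is.object(t.long$time) would presumably return FALSE before the attributes are reassigned and TRUE afterwards, matching the missing and present "OBJ" flag above.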

From the R Internals document...

The actual autoprinting is done by PrintValueEnv in file print.c. If the object to be printed has the S4 bit set and S4 methods dispatch is on, show is called to print the object. Otherwise, if the object bit is set (so the object has a "class" attribute), print is called to dispatch methods: for objects without a class the internal code of print.default is called.

Compare the output of the two print methods:

print.default(t.long$time)
# [1] 1401058800 1401061500 1401064200 1401066900 1401069600 1401145200 1401147900 1401150600 1401153300 1401156000 1401231600 1401234300
#[13] 1401237000 1401239700 1401242400
#attr(,"class")
#[1] "POSIXct" "POSIXt" 
print.POSIXct(t.long$time)
# [1] "2014-05-26 00:00:00 BST" "2014-05-26 00:45:00 BST" "2014-05-26 01:30:00 BST" "2014-05-26 02:15:00 BST" "2014-05-26 03:00:00 BST"
# [6] "2014-05-27 00:00:00 BST" "2014-05-27 00:45:00 BST" "2014-05-27 01:30:00 BST" "2014-05-27 02:15:00 BST" "2014-05-27 03:00:00 BST"
#[11] "2014-05-28 00:00:00 BST" "2014-05-28 00:45:00 BST" "2014-05-28 01:30:00 BST" "2014-05-28 02:15:00 BST" "2014-05-28 03:00:00 BST"

I can only speculate, but perhaps this is caused by some internal code in reshape2 and is related to this warning from R Internals:

One thing to watch is that if you copy attributes from one object to another you may (un)set the "class" attribute and so need to copy the object and S4 bits as well. There is a macro/function DUPLICATE_ATTRIB to automate this.
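
Until a fixed version of reshape2 is installed, one possible workaround is to reassign each column's attributes after melting, which is the same trick used in the question. The sketch below is an assumption-laden illustration: restore_object_bit() is a hypothetical helper, not part of reshape2 or base R, and the small data frame stands in for the result of melt().

```r
# Hypothetical helper: reassign each column's attributes to itself so that
# R sets the object bit again for columns that carry a "class" attribute.
restore_object_bit <- function(df) {
  for (nm in names(df)) {
    col <- df[[nm]]
    attributes(col) <- attributes(col)  # no-op for the values, resets the flag
    df[[nm]] <- col
  }
  df
}

# Stand-in for a melted data frame (built directly, without reshape2).
df <- data.frame(
  product = letters[1:3],
  time    = as.POSIXct("2014-05-26", tz = "UTC") + c(0, 2700, 5400)
)

fixed <- restore_object_bit(df)
is.object(fixed$time)
```

On a data frame that is already well formed, as here, the helper changes nothing; it only matters for columns whose object bit was lost.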

Simon O'Hanlon answered Oct 07 '22 16:10