When and why does "print" need two attempts to print a "data.table"? [duplicate]

Tags: r, data.table

Sometimes print needs two attempts to print a data.table:

> library(data.table)
> 
> rm(list=ls())
> 
> Tbl <- fread( input = "Nr; Value
+                        Nr 1;46.73
+                        Nr 2;49.02
+                        Nr 3;50.62
+                        Nr 4;49.80
+                        Nr 5;50.15",
+               sep    = ";",
+               header = TRUE,
+               colClasses = c("character","numeric") )
> print(Tbl)
     Nr Value
1: Nr 1 46.73
2: Nr 2 49.02
3: Nr 3 50.62
4: Nr 4 49.80
5: Nr 5 50.15
> Tbl <- Tbl[, Nr := as.numeric( gsub( "Nr ", "", Tbl$Nr ))]
> print(Tbl)
> print(Tbl)
   Nr Value
1:  1 46.73
2:  2 49.02
3:  3 50.62
4:  4 49.80
5:  5 50.15
> 

Not so a data.frame:

> rm(list=ls())
> 
> DF <- read.table( text = "Nr; Value
+                           Nr 1;46.73
+                           Nr 2;49.02
+                           Nr 3;50.62
+                           Nr 4;49.80
+                           Nr 5;50.15",
+                   sep    = ";",
+                   header = TRUE,
+                   colClasses = c("character","numeric"))
> 
> DF$Nr <- as.numeric( gsub( "Nr ", "", DF$Nr ))
> print(DF)
  Nr Value
1  1 46.73
2  2 49.02
3  3 50.62
4  4 49.80
5  5 50.15
> 

If the code is contained in a script file, the data.table is printed immediately:

> source(path_to_Script_1,echo=TRUE,prompt.echo="(script)   ",max.deparse.length=500)

(script)   library(data.table)

(script)   rm(list=ls())

(script)   Tbl <- fread( input = "Nr; Value
+                        Nr 1;46.73
+                        Nr 2;49.02
+                        Nr 3;50.62
+                        Nr 4;49.80
+                        Nr 5;50.15",
+               sep    = ";",
+               header = TRUE,
+               colClasses = c("character","numeric") )

(script)   Tbl <- Tbl[, Nr := as.numeric( gsub( "Nr ", "", Tbl$Nr ))]

(script)   print(Tbl)
   Nr Value
1:  1 46.73
2:  2 49.02
3:  3 50.62
4:  4 49.80
5:  5 50.15
> 

But if print(Tbl) is omitted from the script file, print on the console again needs two attempts:

> source(path_to_Script_2,echo=TRUE,prompt.echo="(script)   ",max.deparse.length=500)

(script)   library(data.table)

(script)   rm(list=ls())

(script)   Tbl <- fread( input = "Nr; Value
+                        Nr 1;46.73
+                        Nr 2;49.02
+                        Nr 3;50.62
+                        Nr 4;49.80
+                        Nr 5;50.15",
+               sep    = ";",
+               header = TRUE,
+               colClasses = c("character","numeric") )

(script)   Tbl <- Tbl[, Nr := as.numeric( gsub( "Nr ", "", Tbl$Nr ))]
> print(Tbl)
> print(Tbl)
   Nr Value
1:  1 46.73
2:  2 49.02
3:  3 50.62
4:  4 49.80
5:  5 50.15
> 

Can anybody tell me when and why print needs two attempts? I'm using R version 3.2.2:

> R.version
               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          2.2                         
year           2015                        
month          08                          
day            14                          
svn rev        69053                       
language       R                           
version.string R version 3.2.2 (2015-08-14)
nickname       Fire Safety   

Operating system is Windows 7.

mra68 asked Dec 14 '15 15:12


1 Answer

Quoting the NEWS (1. bug fix in version 1.9.6):

if (TRUE) DT[,LHS:=RHS] no longer prints, #869 and #1122. Tests added. To get this to work we've had to live with one downside: if a := is used inside a function with no DT[] before the end of the function, then the next time DT or print(DT) is typed at the prompt, nothing will be printed. A repeated DT or print(DT) will print. To avoid this: include a DT[] after the last := in your function. If that is not possible (e.g., it's not a function you can change) then DT[] at the prompt is guaranteed to print.
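The function case from the NEWS item can be sketched as follows. This is only an illustration: add_double and the Double column are hypothetical names, not part of the question or of data.table. The only thing taken from the NEWS text is the advice to put a DT[] after the last := in the function.

```r
library(data.table)

# Hypothetical helper: adds a column by reference inside a function.
add_double <- function(DT) {
  DT[, Double := Value * 2]  # := modifies DT in place
  DT[]                       # the DT[] that the NEWS item recommends after the last :=
}

DT <- data.table(Value = c(46.73, 49.02, 50.62))
add_double(DT)

# DT was modified by reference, so the new column is visible here.
print(DT)
```

If the function cannot be changed, the NEWS item says that typing DT[] at the prompt is guaranteed to print.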

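<- is a function, so the := here is evaluated inside a function call with no DT[] after it, which is the case the NEWS item describes. Of course, you shouldn't use <- there at all. It creates an unnecessary copy (since := modifies the data.table column in place) and is therefore inefficient.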
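A minimal sketch of the question's update without the reassignment, assuming the same five rows as in the question (built here with data.table() instead of fread() to keep the example short):

```r
library(data.table)

# Same data as in the question, constructed directly.
Tbl <- data.table(
  Nr    = paste("Nr", 1:5),
  Value = c(46.73, 49.02, 50.62, 49.80, 50.15)
)

# := updates the column by reference, so no "Tbl <- ..." is needed.
Tbl[, Nr := as.numeric(gsub("Nr ", "", Nr))]

# Per the quoted NEWS item, Tbl[] at the prompt is guaranteed to print.
Tbl[]
```

Inside the brackets the column can be referred to as Nr directly, so Tbl$Nr is not needed either.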

Roland answered Oct 07 '22 11:10