Add a row by reference at the end of a data.table object

Tags: r, data.table

In this question the data.table package creator explains why rows cannot yet be inserted (or removed) by reference in the middle of a data.table. He also points out that such operations could be possible at the end of the table. Could you show code to perform this action? It would be the "by reference" version of:

    > a <- data.table(id=letters[1:2], var=1:2)
    > a
       id var
    1:  a   1
    2:  b   2
    > rbind(a, data.table(id="c", var=3))
       id var
    1:  a   1
    2:  b   2
    3:  c   3

Thanks.

EDIT:

Since a proper solution is not possible yet, which of the following is better in terms of speed and memory usage (assuming they differ internally at all, which I am not sure about)?

    rbind(a, data.table(id="c", var=3))
    rbindlist(list(a, data.table(id="c", var=3)))

Are there any other (better) methods?

asked May 28 '13 12:05

Michele

1 Answer

To answer your edit, just run a benchmark:

    library(data.table)
    library(microbenchmark)

    a = data.table(id=letters[1:2], var=1:2)
    b = copy(a)
    c = copy(b)
    # let's also just try modifying same value in place
    # to see how well changing existing values does
    microbenchmark(a <- rbind(a, data.table(id="c", var=3)),
                   b <- rbindlist(list(b, data.table(id="c", var=3))),
                   c[1, var := 3L],
                   set(c, 1L, 2L, 3L))
    #Unit: microseconds
    #                                                  expr     min        lq    median        uq      max neval
    #          a <- rbind(a, data.table(id = "c", var = 3)) 865.460 1141.2585 1357.1230 1539.4300 6814.492   100
    #b <- rbindlist(list(b, data.table(id = "c", var = 3))) 260.440  325.3835  445.4190  522.8825 1143.930   100
    #                                   c[1, `:=`(var, 3L)] 482.147  626.5570  778.3135  904.3595 1109.539   100
    #                                    set(c, 1L, 2L, 3L)   2.339    5.677    7.5140    9.5170   19.033   100

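rbindlist is clearly better than rbind: its median time here is roughly a third of rbind's (about 445 vs. 1357 microseconds). Thanks to Matthew Dowle pointing out the problems with using [ in a loop, I added another benchmark with set, which is about two orders of magnitude faster than := inside [ for this single-value update (about 7.5 vs. 778 microseconds).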

From the above, your best options are either using rbindlist, or sizing the data.table to begin with and then just populating the values. If you don't know the size of the data in advance, you can use a strategy similar to std::vector in C++: double the size every time you run out of space, and once you're done filling the table in, delete the extra rows.
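The doubling strategy could be sketched roughly as follows. This is only an illustration under stated assumptions, not a data.table feature: new_appender, append_row and result are hypothetical names made up for this example, and the columns id and var are taken from the question. Only the occasional doubling step and the final trim copy the table; each individual row is written by reference with set.

```r
library(data.table)

# Hypothetical helper (not part of data.table): a table that over-allocates
# rows and doubles its capacity whenever it runs out of space.
new_appender <- function(capacity = 4L) {
  dt <- data.table(id  = rep(NA_character_, capacity),
                   var = rep(NA_integer_, capacity))
  n <- 0L  # number of rows actually filled so far

  append_row <- function(id, var) {
    if (n == nrow(dt)) {
      # Out of space: double the capacity. This step copies the table,
      # but it happens only rarely as the table grows.
      extra <- data.table(id  = rep(NA_character_, nrow(dt)),
                          var = rep(NA_integer_, nrow(dt)))
      dt <<- rbindlist(list(dt, extra))
    }
    n <<- n + 1L
    # set() assigns by reference, without copying the table
    set(dt, i = n, j = "id",  value = id)
    set(dt, i = n, j = "var", value = as.integer(var))
    invisible(NULL)
  }

  # Drop the unused rows once filling is done (one final copy)
  result <- function() head(dt, n)

  list(append_row = append_row, result = result)
}

app <- new_appender(capacity = 2L)
for (k in 1:5) app$append_row(letters[k], k)
res <- app$result()
print(res)
```

If the final number of rows is known up front, the same idea reduces to creating the table at full size once and filling it with set in a loop, with no doubling step needed.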

answered Sep 29 '22 10:09

eddi
