I am looking for an efficient method (in terms of both computer resources and learning/implementation effort) to merge two large data frames (size > 1 million / 300 KB RData file).
"merge" in base R and "join" in plyr appear to use up all my memory, effectively crashing my system.
Example: load the test data frame and try
test.merged<-merge(test, test)
or
test.merged<-join(test, test, type="full")
The following post provides a list of merge functions and alternatives:
How to join (merge) data frames (inner, outer, left, right)?
The following allows object size inspection:
https://heuristically.wordpress.com/2010/01/04/r-memory-usage-statistics-variable/
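As a rough illustration of object size inspection using only base R, the sketch below measures one data frame and then lists every object in the global environment by size. The data frame `test` here is a synthetic stand-in, not the data from the question.

```r
# Synthetic stand-in for the data frame in the question (assumption)
n <- 1e5
test <- data.frame(id = seq_len(n), value = rnorm(n))

# Size of a single object, in bytes and in a human-readable unit
size_bytes <- object.size(test)
print(size_bytes)
print(format(size_bytes, units = "MB"))

# Sizes (in bytes) of all objects in the global environment, largest first
obj_names <- ls(envir = globalenv())
sizes <- vapply(
  obj_names,
  function(nm) as.numeric(object.size(get(nm, envir = globalenv()))),
  numeric(1)
)
print(sort(sizes, decreasing = TRUE))

# gc() runs a garbage collection and reports the memory R is using
print(gc())
```

Checking sizes before and after a merge gives a first idea of how much memory the result itself needs, although it does not show the temporary copies made during the merge.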
Data produced by anonym
For large tables, the dplyr join functions are much faster than merge().
The merge function in R allows you to combine two data frames, much like the join operation used in SQL to combine tables. merge(), however, does not allow more than two data frames to be joined at once, so joining multiple data frames requires several calls.
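A minimal sketch of the dplyr joins, assuming the dplyr package is installed. The data frames and column names are invented for illustration, and the timings will vary by machine and package version, so none are quoted here.

```r
library(dplyr)

set.seed(1)
n <- 1e5

# Two illustrative data frames sharing an integer key column "id"
left  <- data.frame(id = seq_len(n),
                    factor1 = sample(c("A", "B", "C"), n, replace = TRUE))
right <- data.frame(id = sample(n), value1 = rnorm(n))

# inner_join keeps ids present in both inputs; full_join keeps all ids
inner <- inner_join(left, right, by = "id")
full  <- full_join(left, right, by = "id")

# Compare elapsed time against base merge() on the same inputs
print(system.time(merge(left, right, by = "id")))
print(system.time(inner_join(left, right, by = "id")))
```

Passing `by` explicitly avoids joining on every shared column name, which is what happens by default when a data frame is merged with itself.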
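One common way to work around the two-at-a-time limit is to fold merge() over a list of data frames with base R's Reduce(). The sketch below uses three small made-up data frames; the names `a`, `b` and `c3` are illustrative only.

```r
# Three small illustrative data frames sharing the key column "id"
a  <- data.frame(id = 1:4, x = c(10, 20, 30, 40))
b  <- data.frame(id = 2:5, y = c("p", "q", "r", "s"))
c3 <- data.frame(id = c(1, 2, 3, 6), z = c(TRUE, FALSE, TRUE, FALSE))

frames <- list(a, b, c3)

# Inner join across all three: keeps only ids present in every data frame
merged_inner <- Reduce(function(l, r) merge(l, r, by = "id"), frames)

# Full outer join across all three: keeps every id, filling gaps with NA
merged_outer <- Reduce(function(l, r) merge(l, r, by = "id", all = TRUE), frames)

print(merged_inner)
print(merged_outer)
```

Each step still merges only two data frames, so memory use is driven by the largest intermediate result.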
In base R, the merge() function combines two data frames; the dplyr package provides a comparable family of join functions (for example inner_join() and left_join()). The most important condition for joining two data frames is that the columns on which the merge happens have the same type in both. merge() works much like a join in a DBMS.
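To illustrate the point about column types, here is a small sketch that checks the key column's class in both data frames and coerces one side before merging. The `orders` and `customers` data frames are hypothetical; stricter join functions may refuse mismatched key types outright, so making the conversion explicit is the safer habit.

```r
# Hypothetical data: the key is integer on one side and character on the other
orders    <- data.frame(id = c(1L, 2L, 3L), amount = c(9.5, 3.0, 7.25))
customers <- data.frame(id = c("1", "2", "4"),
                        name = c("Ann", "Bob", "Dee"),
                        stringsAsFactors = FALSE)

# Inspect the key column types before joining
key_types <- c(orders = class(orders$id), customers = class(customers$id))
print(key_types)

# Coerce the integer key to character so both sides agree
orders$id <- as.character(orders$id)
stopifnot(identical(class(orders$id), class(customers$id)))

# Inner join on the now-consistent key
merged <- merge(orders, customers, by = "id")
print(merged)
```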
Here are some timings for the data.table vs. data.frame methods.
data.table is much faster. Regarding memory, I can informally report that the two methods are very similar (within 20%) in RAM use.
library(data.table)
set.seed(1234)
n = 1e6
data_frame_1 = data.frame(id=paste("id_", 1:n, sep=""), factor1=sample(c("A", "B", "C"), n, replace=TRUE))
data_frame_2 = data.frame(id=sample(data_frame_1$id), value1=rnorm(n))
data_table_1 = data.table(data_frame_1, key="id")
data_table_2 = data.table(data_frame_2, key="id")
system.time(df.merged <- merge(data_frame_1, data_frame_2))
#   user  system elapsed
# 17.983   0.189  18.063
system.time(dt.merged <- merge(data_table_1, data_table_2))
#   user  system elapsed
#  0.729   0.099   0.821
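Besides merge(), data.table also supports joins through its bracket syntax. The following is a sketch only, assuming the data.table package is installed; the names `dt1` and `dt2` are illustrative, and a smaller `n` is used so it runs quickly.

```r
library(data.table)

set.seed(1234)
n <- 1e5

dt1 <- data.table(id = paste0("id_", 1:n),
                  factor1 = sample(c("A", "B", "C"), n, replace = TRUE))
dt2 <- data.table(id = sample(dt1$id), value1 = rnorm(n))

# Ad hoc join without keys: for each row of dt2, look up the match in dt1
joined_on <- dt1[dt2, on = "id"]

# setkey() sorts each table by reference, without copying it
setkey(dt1, id)
setkey(dt2, id)

# Keyed join: same lookup, using the keys set above
joined_keyed <- dt1[dt2]

print(head(joined_keyed))
```

Because setkey() modifies the table in place, it avoids one of the copies that a data.frame workflow would make, which may matter when memory is the bottleneck.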