I have the following data.table that I want to perform a multiple column melt upon:
# Load library
library(data.table)
# Dummy data set
dt <- data.table(a = 1:10,
t_alpha = runif(10),
t_beta = runif(10),
t_gamma = runif(10),
s_alpha = runif(10),
s_beta = runif(10),
s_gamma = runif(10))
# Look at data table
dt
#> a t_alpha t_beta t_gamma s_alpha s_beta s_gamma
#> 1: 1 0.73315322 0.7777962 0.13258401 0.12841353 0.20865995 0.8967482
#> 2: 2 0.10161042 0.9691244 0.23506277 0.35130324 0.36672933 0.9247026
#> 3: 3 0.95234142 0.7002526 0.07104051 0.85558314 0.39467530 0.8982107
#> 4: 4 0.48967202 0.3951288 0.72376459 0.58978334 0.47201394 0.8508710
#> 5: 5 0.22997537 0.3487354 0.58831312 0.63917860 0.12685815 0.5616682
#> 6: 6 0.01138277 0.0449675 0.79384717 0.28983485 0.98793013 0.6574457
#> 7: 7 0.32324941 0.3446745 0.66394161 0.16357679 0.14828451 0.2019661
#> 8: 8 0.78644407 0.4460397 0.76992857 0.78703793 0.46623055 0.3081812
#> 9: 9 0.90938120 0.4526063 0.47948163 0.03182248 0.63600496 0.3598600
#> 10: 10 0.11345349 0.1732653 0.59240218 0.95541944 0.06640378 0.9475107
So, I go ahead and melt, like so:
# Groups of columns to melt
m <- list(names(dt)[grepl("t_", names(dt))],
names(dt)[grepl("s_", names(dt))])
# Perform melt
dt2 <- melt(dt, measure = m, value.name = c("t", "s"))
which gives,
# Examine results
dt2
#> a variable t s
#> 1: 1 1 0.73315322 0.12841353
#> 2: 2 1 0.10161042 0.35130324
#> 3: 3 1 0.95234142 0.85558314
#> 4: 4 1 0.48967202 0.58978334
#> 5: 5 1 0.22997537 0.63917860
#> 6: 6 1 0.01138277 0.28983485
#> 7: 7 1 0.32324941 0.16357679
#> 8: 8 1 0.78644407 0.78703793
#> 9: 9 1 0.90938120 0.03182248
#> 10: 10 1 0.11345349 0.95541944
#> 11: 1 2 0.77779618 0.20865995
#> 12: 2 2 0.96912444 0.36672933
#> 13: 3 2 0.70025264 0.39467530
#> 14: 4 2 0.39512877 0.47201394
#> 15: 5 2 0.34873538 0.12685815
#> 16: 6 2 0.04496750 0.98793013
#> 17: 7 2 0.34467455 0.14828451
#> 18: 8 2 0.44603969 0.46623055
#> 19: 9 2 0.45260634 0.63600496
#> 20: 10 2 0.17326532 0.06640378
#> 21: 1 3 0.13258401 0.89674825
#> 22: 2 3 0.23506277 0.92470265
#> 23: 3 3 0.07104051 0.89821074
#> 24: 4 3 0.72376459 0.85087097
#> 25: 5 3 0.58831312 0.56166816
#> 26: 6 3 0.79384717 0.65744574
#> 27: 7 3 0.66394161 0.20196612
#> 28: 8 3 0.76992857 0.30818121
#> 29: 9 3 0.47948163 0.35986003
#> 30: 10 3 0.59240218 0.94751075
#> a variable t s
That's great!
Now, notice an extra column, variable
. That corresponds to each of my columns for t
and s
. Specifically, 1 = alpha, 2 = beta, and 3 = gamma. I can fix this variable as follows:
# Fix variable names
v_names <- c("alpha", "beta", "gamma")
dt2$variable <- v_names[dt2$variable]
producing,
# Examine results
dt2
#> a variable t s
#> 1: 1 alpha 0.73315322 0.12841353
#> 2: 2 alpha 0.10161042 0.35130324
#> 3: 3 alpha 0.95234142 0.85558314
#> 4: 4 alpha 0.48967202 0.58978334
#> 5: 5 alpha 0.22997537 0.63917860
#> 6: 6 alpha 0.01138277 0.28983485
#> 7: 7 alpha 0.32324941 0.16357679
#> 8: 8 alpha 0.78644407 0.78703793
#> 9: 9 alpha 0.90938120 0.03182248
#> 10: 10 alpha 0.11345349 0.95541944
#> 11: 1 beta 0.77779618 0.20865995
#> 12: 2 beta 0.96912444 0.36672933
#> 13: 3 beta 0.70025264 0.39467530
#> 14: 4 beta 0.39512877 0.47201394
#> 15: 5 beta 0.34873538 0.12685815
#> 16: 6 beta 0.04496750 0.98793013
#> 17: 7 beta 0.34467455 0.14828451
#> 18: 8 beta 0.44603969 0.46623055
#> 19: 9 beta 0.45260634 0.63600496
#> 20: 10 beta 0.17326532 0.06640378
#> 21: 1 gamma 0.13258401 0.89674825
#> 22: 2 gamma 0.23506277 0.92470265
#> 23: 3 gamma 0.07104051 0.89821074
#> 24: 4 gamma 0.72376459 0.85087097
#> 25: 5 gamma 0.58831312 0.56166816
#> 26: 6 gamma 0.79384717 0.65744574
#> 27: 7 gamma 0.66394161 0.20196612
#> 28: 8 gamma 0.76992857 0.30818121
#> 29: 9 gamma 0.47948163 0.35986003
#> 30: 10 gamma 0.59240218 0.94751075
#> a variable t s
Created on 2019-08-09 by the reprex package (v0.3.0)
This is my desired result. My question is, can I perform this operation (i.e., renaming of the values in the variable
column) during the melt or do I just have to patch it up afterwards?
DataTable variables represent a type of variable that can store big pieces of information, and act as a database or a simple spreadsheet with rows and columns. They can be found in the Browse and Select a . Net Type window, under the System. Data namespace (System.
dcast: Convert data between wide and long forms.
Melting in R It is performed using melt() function which takes dataset and column values that has to be kept constant. Using melt(), dataframe is converted into long format and stretches the data frame.
Reshape2 is a package that allows us to easily transform our data into whatever structure we may need. Many of us are used to seeing our data structured so that corresponds to a single participant and each column corresponds to a variable. This type of data structure is known as wide format.
Here is an option in pivot_longer
from the dev version of tidyr
library(tidyr)# ‘0.8.3.9000’
library(dplyr)
dt %>%
pivot_longer(-a, names_to= c(".value", "variable"), names_sep = "_")
# A tibble: 30 x 4
# a variable t s
# <int> <chr> <dbl> <dbl>
# 1 1 alpha 0.592 0.0634
# 2 1 beta 0.316 0.339
# 3 1 gamma 0.760 0.751
# 4 2 alpha 0.936 0.445
# 5 2 beta 0.232 0.00250
# 6 2 gamma 0.0111 0.406
# 7 3 alpha 0.0950 0.102
# 8 3 beta 0.426 0.775
# 9 3 gamma 0.321 0.811
#10 4 alpha 0.884 0.0526
# … with 20 more rows
NOTE: The values are different as there was no set.seed
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With