I have a data.table
(dt
) which looks like this
# id value
# 1: 1 {1:3.2,2:14.2}
# 2: 2 {1:14.0,2:3.0}
# 3: 3 {1:3.4,2:3.9,3:0.1}
# 4: 4 {1:2.1}
The value
column comprises lists of values, where each one is introduced by its position in the list (1,2,3...) followed by a :
, then the value itself. list values are separated by commas, and the whole list is enclosed in braces {...}
.
I want to convert this to a long format (dt.all
) like
# id N value
# 1: 1 1 3.2
# 2: 1 2 14.2
# 3: 2 1 14.0
# 4: 2 2 3.0
# 5: 3 1 3.4
# 6: 3 2 3.9
# 7: 3 3 0.1
# 8: 4 1 2.1
Various failed attempts include:
stingr::str_extract(string = dt$value, pattern = "(?<=:).*(?=,)")
gsub(".*: *(.*?) *,.*", "\\1", dt$value)
gsubfn:strapplyc(dt$value, ":(.*?),", simplify = c)
Any suggestions on a neat way to do this? data.table
solutions preferred, but failing that will accept tidyverse or other methods.
The data:
dt = structure(list(id = 1:4, value = c("{1:3.2,2:14.2}", "{1:14.0,2:3.0}",
"{1:3.4,2:3.9,3:0.1}", "{1:2.1}")), row.names = c(NA, -4L), class = c("data.table",
"data.frame"))
dt.all = structure(list(id = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L), N = c(1L,
2L, 1L, 2L, 1L, 2L, 3L, 1L), value = c(3.2, 14.2, 14, 3, 3.4,
3.9, 0.1, 2.1)), row.names = c(NA, -8L), class = c("data.table",
"data.frame"))
I found that if I use gsub
to wrap the item numbers in quotes, then I can parse these as json
expressions using library(jsonlite)
.
dt[, .(value = {v1 <- unlist(fromJSON(gsub("([0-9]+):", '"\\1":', value)))},
N = names(v1)),by=id]
# id value N
# 1: 1 3.2 1
# 2: 1 14.2 2
# 3: 2 14.0 1
# 4: 2 3.0 2
# 5: 3 3.4 1
# 6: 3 3.9 2
# 7: 3 0.1 3
# 8: 4 2.1 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With