I have a column that contains JSON data as in the following example,
library(data.table)
test <- data.table(a = list(1,2,3),
info = list("{'duration': '10', 'country': 'US'}",
"{'duration': '20', 'country': 'US'}",
"{'duration': '30', 'country': 'GB', 'width': '20'}"))
I want to convert the last column to equivalent R storage, which would look similar to,
res <- data.table(a = list(1, 2, 3),
duration = list(10, 20, 30),
country = list('US', 'US', 'GB'),
width = list(NA, NA, 20))
Since I have 500K rows with different contents I would look for a quick way to do this.
Yes, ImportJSON is a really easy tool to use for taking information from JSON and putting it into a table or spreadsheet. Including if you want to parse your JSON directly from Google Sheets!
Convert JSON into a dataframe We simply use the fromJSON() function to read data from the data. json file and pass loaded data to the as. data. frame() method to convert into a data frame.
jsonlite: A Simple and Robust JSON Parser and Generator for R. A reasonably fast JSON parser and generator, optimized for statistical data and the web. Offers simple, flexible tools for working with JSON in R, and is particularly powerful for building pipelines and interacting with a web API.
A variation without the need to separate out the JSON string
library(data.table)
library(jsonlite)
test[, info := gsub("'", "\"", info)]
test[, rbindlist(lapply(info, fromJSON), use.names = TRUE, fill = TRUE)]
# duration country width
# 1: 10 US NA
# 2: 20 US NA
# 3: 30 GB 20
Parse the JSON first, then build the data.frame (or data.table):
json_string <- paste(c("[{'duration': '10', 'country': 'US'}",
"{'duration': '20', 'country': 'US'}",
"{'duration': '30', 'country': 'GB'}",
"{'width': '20'}]"), collapse=", ")
# JSON standard requires double quotes
json_string <- gsub("'", "\"", json_string)
library("jsonlite")
fromJSON(json_string)
# duration country width
# 1 10 US <NA>
# 2 20 US <NA>
# 3 30 GB <NA>
# 4 <NA> <NA> 20
This isn't exactly what you asked for as your JSON doesn't associate 'width' with the previous record, you might need to do some manipulation first:
json_string <- paste(c("[{'duration': '10', 'country': 'US'}",
"{'duration': '20', 'country': 'US'}",
"{'duration': '30', 'country': 'GB', 'width': '20'}]"),
collapse=", ")
json_string <- gsub("'", "\"", json_string)
df <- jsonlite::fromJSON(json_string)
data.table::as.data.table(df)
# duration country width
# 1: 10 US NA
# 2: 20 US NA
# 3: 30 GB 20
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With