
Convert R data table column from JSON to data table

Tags: json, r, data.table

I have a column that contains JSON data, as in the following example:

library(data.table)
test <- data.table(a = list(1,2,3), 
           info = list("{'duration': '10', 'country': 'US'}", 
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB', 'width': '20'}"))

I want to convert the last column to equivalent R storage, which would look something like this:

res <- data.table(a = list(1, 2, 3),
                  duration = list(10, 20, 30),
                  country = list('US', 'US', 'GB'),
                  width = list(NA, NA, 20))

I have 500K rows with varying contents, so I am looking for a fast way to do this.

Stereo asked Oct 24 '16 18:10



2 Answers

A variation that parses each row's JSON string separately, without first pasting the rows together into a single JSON string:

library(data.table)
library(jsonlite)

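# The JSON standard requires double quotes, so replace the single quotes
test[, info := gsub("'", "\"", info)]
# Parse each row, then stack the results; fill = TRUE pads missing keys with NA
test[, rbindlist(lapply(info, fromJSON), use.names = TRUE, fill = TRUE)]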

#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20
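
The result above contains only the parsed fields, and every value is still a character string. As a minimal sketch, assuming you also want to keep column a and want duration and width as numbers, the parsed table can be bound back onto the remaining columns. The names parsed and res are only illustrative, and the quote replacement assumes that no value contains an apostrophe.

```r
library(data.table)
library(jsonlite)

test <- data.table(a = list(1, 2, 3),
                   info = list("{'duration': '10', 'country': 'US'}",
                               "{'duration': '20', 'country': 'US'}",
                               "{'duration': '30', 'country': 'GB', 'width': '20'}"))

# Flatten the list column to a character vector and fix the quotes
# (assumption: no value contains an apostrophe)
test[, info := gsub("'", "\"", unlist(info))]

# Parse each row, stack the results, and pad missing keys with NA
parsed <- rbindlist(lapply(test$info, fromJSON), use.names = TRUE, fill = TRUE)

# Bind the parsed fields next to the other columns, dropping the raw JSON
res <- cbind(test[, !"info", with = FALSE], parsed)

# The JSON values are strings, so convert the numeric fields explicitly
num_cols <- c("duration", "width")
res[, (num_cols) := lapply(.SD, as.numeric), .SDcols = num_cols]
```

Because rbindlist keeps the rows in their original order, cbind lines the parsed fields up with the rows they came from.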
SymbolixAU answered Oct 14 '22 02:10


Parse the JSON first, then build the data.frame (or data.table):

json_string <- paste(c("[{'duration': '10', 'country': 'US'}",
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB'}",
                       "{'width': '20'}]"), collapse = ", ")

# JSON standard requires double quotes
json_string <- gsub("'", "\"", json_string)

library("jsonlite")
fromJSON(json_string)

#  duration country width
# 1       10      US  <NA>
# 2       20      US  <NA>
# 3       30      GB  <NA>
# 4     <NA>    <NA>    20

This isn't exactly what you asked for, because here 'width' is a separate record rather than part of the previous one. If it belongs to the third record, include it in that object:

json_string <- paste(c("[{'duration': '10', 'country': 'US'}",
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB', 'width': '20'}]"),
                     collapse = ", ")

json_string <- gsub("'", "\"", json_string)
df <- jsonlite::fromJSON(json_string)
data.table::as.data.table(df)

#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20
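
The same idea can be applied directly to the column from the question rather than to a hand-written string. This is a rough sketch, assuming every row holds exactly one JSON object and no value contains an apostrophe; json_array and parsed are illustrative names. It calls fromJSON once for the whole column, which may be quicker than one call per row on 500K rows, but that is worth benchmarking on your own data.

```r
library(data.table)
library(jsonlite)

test <- data.table(a = list(1, 2, 3),
                   info = list("{'duration': '10', 'country': 'US'}",
                               "{'duration': '20', 'country': 'US'}",
                               "{'duration': '30', 'country': 'GB', 'width': '20'}"))

# Fix the quotes, then join all rows into one JSON array string
rows <- gsub("'", "\"", unlist(test$info))
json_array <- paste0("[", paste(rows, collapse = ","), "]")

# A single parse; jsonlite fills keys missing from an object with NA
parsed <- as.data.table(fromJSON(json_array))
```

All values in parsed are character strings, as in the output above, so numeric fields still need an explicit as.numeric() if you want numbers.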
blmoore answered Oct 14 '22 02:10