library(rjson)
filenames <- list.files(pattern = "\\.json$") # character vector with one entry per file name; `pattern` is a regular expression, not a glob
Now I want to import all the JSON files into R as one single data frame. How do I do that?
I first tried
myJSON <- lapply(filenames, function(x) fromJSON(file=x)) # should return a list in which each element is one of the JSON files
but the above code takes a long time to finish, since I have 15,000 files, and I know it won't return a single data frame. Is there a faster way to do this?
Sample JSON file:
{"Reviews": [{"Ratings": {"Service": "4", "Cleanliness": "5"}, "AuthorLocation": "Boston", "Title": "\u201cExcellent Hotel & Location\u201d", "Author": "gowharr32", "ReviewID": "UR126946257", "Content": "We enjoyed the Best Western Pioneer Square....", "Date": "March 29, 2012"}, {"Ratings": {"Overall": "5"},"AuthorLocation": "Chicago",....},{...},....}]}
For anyone looking for a purrr / tidyverse solution coming here:
library(tidyverse) # attaches purrr (map_df) and the %>% pipe
library(jsonlite)
path <- "./your_path"
files <- dir(path, pattern = "\\.json$") # `pattern` is a regular expression, not a glob
data <- files %>%
  map_df(~ fromJSON(file.path(path, .), flatten = TRUE))
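For files shaped like the sample in the question, `jsonlite::fromJSON()` returns a list whose `Reviews` element holds the actual table, so it may help to pull that element out before binding. The following is only a sketch under that assumption: the two temporary files, the helper `read_reviews`, and the `source_file` column are made up for illustration and are not part of the original answer.

```r
library(jsonlite)
library(dplyr)

# Two tiny stand-in files (hypothetical data shaped like the question's sample).
tmp <- file.path(tempdir(), "json_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines(
  '{"Reviews": [{"Ratings": {"Service": "4", "Cleanliness": "5"}, "AuthorLocation": "Boston", "ReviewID": "UR1"}]}',
  file.path(tmp, "a.json")
)
writeLines(
  '{"Reviews": [{"Ratings": {"Overall": "5"}, "AuthorLocation": "Chicago", "ReviewID": "UR2"}, {"Ratings": {"Overall": "3"}, "AuthorLocation": "Denver", "ReviewID": "UR3"}]}',
  file.path(tmp, "b.json")
)

json_files <- list.files(tmp, pattern = "\\.json$", full.names = TRUE)

# Hypothetical helper: read one file and return its reviews as a flat data frame.
read_reviews <- function(f) {
  # flatten = TRUE turns the nested Ratings object into columns such as Ratings.Service
  reviews <- fromJSON(f, flatten = TRUE)$Reviews
  reviews$source_file <- basename(f) # remember which file each row came from
  reviews
}

# bind_rows() fills columns that are missing in some files with NA.
all_reviews <- bind_rows(lapply(json_files, read_reviews))
```

Because the rating categories differ between reviews, `bind_rows()` is used here rather than `rbind()`, which requires identical columns.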
Go parallel via:
library(parallel)
cl <- makeCluster(detectCores() - 1)
json_files <- list.files(path = "your/json/path", pattern = "\\.json$", full.names = TRUE) # `pattern` is a regular expression, not a glob
json_list <- parLapply(cl, json_files, function(x) rjson::fromJSON(file = x, method = "R")) # one list element per file
stopCluster(cl)
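The parallel read still leaves a list rather than the single data frame the question asks for. One possible way to finish the job is sketched below, assuming each element of `json_list` is a nested list with a `Reviews` entry, as in the sample file. The stand-in `json_list` and the helper `review_to_row` are hypothetical and only there to make the example runnable on its own.

```r
library(dplyr)

# Stand-in for the `json_list` produced above (hypothetical data, assumed shape).
json_list <- list(
  list(Reviews = list(
    list(Ratings = list(Service = "4", Cleanliness = "5"),
         AuthorLocation = "Boston", ReviewID = "UR1")
  )),
  list(Reviews = list(
    list(Ratings = list(Overall = "5"), AuthorLocation = "Chicago", ReviewID = "UR2"),
    list(Ratings = list(Overall = "3"), AuthorLocation = "Denver", ReviewID = "UR3")
  ))
)

# Hypothetical helper: turn one review (a nested list) into a one-row data frame.
review_to_row <- function(review) {
  flat <- unlist(review) # named character vector, e.g. "Ratings.Service"
  as.data.frame(as.list(flat), stringsAsFactors = FALSE)
}

# Bind the reviews within each file, then bind the files together.
# bind_rows() fills rating columns that a review lacks with NA.
reviews_df <- bind_rows(
  lapply(json_list, function(doc) bind_rows(lapply(doc$Reviews, review_to_row)))
)
```

With 15,000 files, building many one-row data frames may itself become a bottleneck, so it is worth timing this step on a subset first.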