
Successfully coercing paginated JSON object to R dataframe

Tags: r, httr, jsonlite

I am trying to convert JSON pulled from an API into a data frame in R, so that I can use and analyze the data.

# Load the needed packages
library(RJSONIO)
library(httr)

# Request a list of companies currently fundraising using httr
r <- GET("https://api.angel.co/1/startups?filter=raising")
# Convert the response body to a text object using httr
raise <- content(r, as = "text")
# Convert to a list using RJSONIO
new <- fromJSON(raise)

Once I have this object, new, I am having a really difficult time parsing the list into a data frame. The JSON has this structure:

{
  "startups": [
    {
      "id": 6702,
      "name": "AngelList",
      "quality": 10,
      "...": "...",
      "fundraising": {
        "round_opened_at": "2013-07-30",
        "raising_amount": 1000000,
        "pre_money_valuation": 2000000,
        "discount": null,
        "equity_basis": "equity",
        "updated_at": "2013-07-30T08:14:40Z",
        "raised_amount": 0.0
      }
    }
  ],
  "total": 4268,
  "per_page": 50,
  "page": 1,
  "last_page": 86
}

I've tried looking at individual elements within new using code like:

 new$startups[[1]]$fundraising$raised_amount

This pulls the raised_amount for the first element listed. However, I don't know how to apply this to the whole list of 4268 startups. In particular, I can't figure out how to deal with the pagination: I only ever seem to get one page of startups (i.e. 50 of them) at most.

I tried using a for loop to go through the list of startups and put each value into a row of a data frame one by one. The example below shows this for just one column, but of course I could do it for all of them by expanding the for loop. However, I can't get any content from any of the other pages.

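# One row per startup on the current page
df1 <- data.frame(index = seq_along(new$startups))
df1$raiseamnt <- 0

# Copy the raised amount of each startup into its row
for (i in seq_along(new$startups)) {
  df1$raiseamnt[i] <- new$startups[[i]]$fundraising$raised_amount
}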

Edit: Thank you for the mention of pagination. I will look through the documentation more carefully and see if I can figure out how to correctly structure the API calls to get different pages. I will update this post if/when I figure that out!

asked May 02 '15 01:05 by verybadatthis


1 Answer

You may find the jsonlite package useful. Below is a quick example.

library(jsonlite)
library(httr)
#request a list of companies currently fundraising using httr
r <- GET("https://api.angel.co/1/startups?filter=raising")
#convert to text object using httr
raise <- content(r, as="text")
#parse JSON
new <- fromJSON(raise)

head(new$startups$id)
[1] 229734 296470 237516 305916 184460 147385
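
The reason `new$startups$id` works above is that jsonlite simplifies an array of records into a data frame by default. A minimal offline sketch, using an inline sample that mirrors the structure in the question (the second startup and its values are made up for illustration), shows how the nested `fundraising` object can be expanded into ordinary columns with `flatten()`:

```r
library(jsonlite)

# Inline sample mirroring the structure shown in the question.
# "Example Co" and its numbers are invented for illustration only.
sample_json <- '{
  "startups": [
    {"id": 6702, "name": "AngelList", "quality": 10,
     "fundraising": {"raising_amount": 1000000, "discount": null,
                     "raised_amount": 0.0}},
    {"id": 1, "name": "Example Co", "quality": 5,
     "fundraising": {"raising_amount": 500000, "discount": null,
                     "raised_amount": 250000}}
  ],
  "total": 2, "per_page": 50, "page": 1, "last_page": 1
}'

parsed <- fromJSON(sample_json)

# With the default simplification, the array of records is a data frame,
# and the nested "fundraising" object is a data-frame column inside it.
startups <- parsed$startups

# flatten() expands nested data-frame columns into regular columns
# with names such as "fundraising.raised_amount".
flat <- flatten(startups)
names(flat)
flat$fundraising.raised_amount
```

After flattening, each startup is one row and each nested field is a plain column, so no explicit loop over the records is needed.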

Note, however, that while this package or the one in the question can help parse the JSON string, the target structure still has to be set up appropriately so that each element of the string can be added without a problem, and that part is up to the developer.
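
If you stay with a plain nested list (as in the question), one way to set up that structure is to extract each field with `vapply()` and substitute `NA` wherever a record lacks the field. The sketch below uses base R only; the helper `pluck_or_na()` and the hand-built `startups` list are hypothetical stand-ins for illustration, not part of any package or of the real API response:

```r
# Hypothetical helper: walk down a nested list by name and return the value,
# or NA when any level is missing or NULL.
pluck_or_na <- function(x, ...) {
  for (key in c(...)) {
    if (!is.list(x) || is.null(x[[key]])) return(NA)
    x <- x[[key]]
  }
  x
}

# Hand-built stand-in for a parsed list of startups (values are illustrative).
startups <- list(
  list(id = 6702, name = "AngelList",
       fundraising = list(raising_amount = 1000000, raised_amount = 0)),
  list(id = 1, name = "Example Co")  # no fundraising block at all
)

# Build one column at a time; vapply() enforces one value of the
# expected type per startup.
df <- data.frame(
  id = vapply(startups, function(s) as.numeric(pluck_or_na(s, "id")),
              numeric(1)),
  name = vapply(startups, function(s) as.character(pluck_or_na(s, "name")),
                character(1)),
  raised_amount = vapply(
    startups,
    function(s) as.numeric(pluck_or_na(s, "fundraising", "raised_amount")),
    numeric(1)
  ),
  stringsAsFactors = FALSE
)
df
```

Compared with filling rows in a for loop, this fails loudly if a field has an unexpected type, and a missing `fundraising` block yields `NA` instead of an error.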

For pagination, the API seems to be a REST API, so conditions such as filters are normally added to the URL as query parameters (e.g. https://api.angel.co/1/startups?filter=raising&variable=value). I would guess the parameter for selecting a page is described somewhere in the API docs.
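
As a rough sketch of how the pages could be collected once the right parameter is known: the code below assumes the API accepts a `page` query parameter, which is a guess based on the `page` and `last_page` fields in the response and should be checked against the API docs. The function names `fetch_page()` and `collect_pages()` are hypothetical. The fetcher is passed in as an argument so the loop can be tried without network access.

```r
library(httr)
library(jsonlite)

# Fetch and parse a single page.
# ASSUMPTION: the API takes a `page` query parameter; verify in the API docs.
fetch_page <- function(page, base_url = "https://api.angel.co/1/startups") {
  r <- GET(base_url, query = list(filter = "raising", page = page))
  stop_for_status(r)  # raise an R error on HTTP errors
  fromJSON(content(r, as = "text", encoding = "UTF-8"))
}

# Request page 1, read `last_page` from the response, then request the
# remaining pages and combine everything into one data frame.
collect_pages <- function(fetch = fetch_page) {
  first <- fetch(1)
  pages <- vector("list", first$last_page)
  pages[[1]] <- flatten(first$startups)
  for (p in seq_len(first$last_page)[-1]) {
    pages[[p]] <- flatten(fetch(p)$startups)
  }
  # rbind_pages() combines the per-page data frames,
  # filling columns that are absent on some pages with NA.
  rbind_pages(pages)
}

# Offline stand-in for the API, for trying the loop without network access.
fake_fetch <- function(page) {
  list(startups = data.frame(id = page * 10 + 1:2), last_page = 3)
}
all_startups <- collect_pages(fake_fetch)
nrow(all_startups)
```

With the real API you would call `collect_pages()` with no arguments; adding a short `Sys.sleep()` between requests may be prudent if the API is rate limited.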

answered Oct 07 '22 01:10 by Jaehyeon Kim