I have a JSON data source providing a list of hashes:
[
{ "a": "foo",
"b": "sdfshk"
},
{ "a": "foo",
"b": "ihlkyhul"
}
]
I use fromJSON()
in the rjson
package to convert that to an R data structure. It returns:
list(
structure(list(a = "foo", b = "sdfshk"), .Names = c("a", "b")),
structure(list(a = "foo", b = "ihlkyhul"), .Names = c("a", "b"))
)
I need to get this into an R data frame, but data.frame()
turns that into a single-row data frame with four columns instead of a 2x2 data frame as expected. I lack the R-fu to do the transform from one to the other, though it looks like it should be straightforward.
Bonus points:
The actual problem is a bit more complex, because the JSON data source isn't as regular as I show above. The objects it returns vary in type. That is, the field set in each can be one of a few different types:
[
{ "a": "foo",
"b": "asdfhalsdhfla"
},
{ "a": "bar",
"c": "akjdhflakjhsdlfkah",
"d": "jfhglskhfglskd",
},
{ "a": "foo",
"b": "dfhlkhldsfg"
}
]
As you can see, the "a" field in each object is a type tag, indicating which other fields the object will have.
I'm not too particular how the solution copes with this.
It wouldn't be horrible if the two object types were just mooshed together, so you get columns a, b, c, and d, and the rows simply have N/A
or NULL
values where the JSON source object doesn't have a value for a given field. I believe I can clean the resulting data frame with subset(df, a == "foo")
. I'll end up with some empty columns that way, but it won't matter to my program.
It would be better if the solution provides a way to select which JSON source rows go into the data frame and which get rejected, so the result has only the columns and rows actually required.
If you have a jagged list you want converted to a data.frame, you could use Hadley's plyr's rbind.fill
. Saved my neck on a couple of occasions. Let me know if this is what you're looking for. Notice that I modified your first example to include only "b" in the third element to make it jagged.
> x <- list(
+ structure(list(a = "foo", b = "sdfshk"), .Names = c("a", "b")),
+ structure(list(a = "foo", b = "ihlkyhul"), .Names = c("a", "b")),
+ structure(list(b = "asdf"), .Names = "b")
+ )
>
> library(plyr)
> do.call("rbind.fill", lapply(x, as.data.frame))
a b
1 foo sdfshk
2 foo ihlkyhul
3 <NA> asdf
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With