I have the following data passed back from an API and I cannot change it's structure. I would like to convert the following JSON into a tibble.
data <- '{ "ids":{
"00000012664":{
"state":"Indiana",
"version":"10",
"external_ids":[
{
"db":"POL",
"db_id":"18935"
},
{
"db":"CIT",
"db_id":"1100882"
}
],
"id":"00000012520",
"name":"Joe Smith",
"aliases":[
"John Smith",
"Bill Smith"
]
},
"00000103162":{
"state":"Kentucky",
"external_ids":[
{
"db":"POL",
"db_id":"69131"
},
{
"db":"CIT",
"db_id":"1098802"
}
],
"id":"00000003119",
"name":"Sue Smith",
"WIP":98203059
} ,
"0000019223":{
"state":"Ohio",
"external_ids":[
{
"db":"POL",
"db_id":"69134"
},
{
"db":"JT",
"db_id":"615234"
}
],
"id":"0000019223",
"name":"Larry Smith",
"WIP":76532172,
"aliases":[
"Test 1",
"Test 2",
"Test 3",
"Test 4"
],
"insured":1
} } }'
Please Note: This is a small subset of the data and could have thousands of "ids".
I've tried jsonlite and tidyjson with a combination of purrr.
The following gives me a tibble, but I cannot figure out how to get aliases back.
obj <- jsonlite::fromJSON(data, simplifyDataFrame = T, flatten = F)
obj$ids %>% {
data_frame(id=purrr::map_chr(., 'id'),
state=purrr::map_chr(., 'state', ''),
WIP=purrr::map_chr(., 'WIP', .default=''),
#aliases=purrr::map(purrr::map_chr(., 'aliases', .default=''), my_fun)
)
}
I cannot figure out with tidyjson either:
data %>% enter_object(ids) %>% gather_object %>% spread_all
What I would like back is a tibble with the following fields (regardless if they are in the JSON or not.
id
name
state
version
aliases -> as a string comma separated
WIP
BONUS: ;-)
Can I get external_ids as a string as well?
Instead of extracting each element with multiple calls with map, an option is to convert to tibble with (as_tibble) and select the columns of interest, grouped by 'id' collapse the 'aliases' into a single string and get the distinct rows by 'id'
library(tibble)
library(purrr)
library(stringr)
map_dfr(obj$ids, ~ as_tibble(.x) %>%
select(id, one_of("name", "state", "version", "aliases", "WIP"))) %>%
group_by(id) %>%
mutate(aliases = toString(unique(aliases))) %>%
distinct(id, .keep_all = TRUE)
# A tibble: 2 x 6
# Groups: id [2]
# id name state version aliases WIP
# <chr> <chr> <chr> <chr> <chr> <int>
#1 00000012520 Joe Smith Indiana 10 John Smith, Bill Smith NA
#2 00000003119 Sue Smith Kentucky <NA> NA 98203059
If we also need the 'external_ids' (which is a data.frame)
map_dfr(obj$ids, ~ as_tibble(.x) %>%
mutate(external_ids = reduce(external_ids, str_c, sep = " "))) %>%
group_by(id) %>%
mutate_at(vars(aliases, external_ids), ~ toString(unique(.))) %>%
ungroup %>%
distinct(id, .keep_all= TRUE)
# A tibble: 2 x 7
# state version external_ids id name aliases WIP
# <chr> <chr> <chr> <chr> <chr> <chr> <int>
#1 Indiana 10 POL 18935, CIT 1100882 00000012520 Joe Smith John Smith, Bill Smith NA
#2 Kentucky <NA> POL 69131, CIT 1098802 00000003119 Sue Smith NA 98203059
For the new data, we can use
obj$ids %>%
map_dfr(~ map_df(.x, reduce, str_c, collapse = ", ", sep= " ") )
# A tibble: 3 x 8
# state version external_ids id name aliases WIP insured
# <chr> <chr> <chr> <chr> <chr> <chr> <int> <int>
#1 Indiana 10 POL 18935, CIT 1100882 00000012520 Joe Smith John Smith Bill Smith NA NA
#2 Kentucky <NA> POL 69131, CIT 1098802 00000003119 Sue Smith <NA> 98203059 NA
#3 Ohio <NA> POL 69134, JT 615234 0000019223 Larry Smith Test 1 Test 2 Test 3 Test 4 76532172 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With