
Scraping a table from OECD

I'm trying to scrape a table from https://data.oecd.org/unemp/unemployment-rate.htm, specifically my own table at https://data.oecd.org/chart/66NJ. I want to scrape the months at the top and all the values in the rows 'OECD - Total' and 'The Netherlands'.

After trying many different approaches and searching this and other forums, I still can't figure out how to scrape this table. I have tried many different selectors found via SelectorGadget or by inspecting elements in my browser, but I keep getting 'list of 0' or 'character empty'.

Any help would be appreciated.

library(tidyverse)
library(rvest)
library(XML)
library(magrittr)

#Get element data from one page
url<-"https://stats.oecd.org/sdmx-json/data/DP_LIVE/.HUR.TOT.PC_LF.M/OECD?json-lang=en&dimensionAtObservation=allDimensions&startPeriod=2016-08&endPeriod=2020-07"
  
#scrape all elements
content <- read_html(url)

#trying to load in a table (gives list of 0)
inladentable <- readHTMLTable(url)

#gather all months (gives character 'empty')
months <- content %>% 
  html_nodes(".table-chart-sort-link") %>%
  html_table()
  
#collect all values for the row 'OECD - Total'
wwpercentage<- content %>% 
  html_nodes(".table-chart-has-status-e") %>%
  html_text()
  
# Combine into a tibble
wwtable <- tibble(months=months,wwpercentage=wwpercentage)
asked Sep 13 '25 01:09 by Mitchziie


1 Answer

This URL returns JSON, not HTML, so there are no HTML nodes or tables for html_nodes() or readHTMLTable() to find.
You can query it using httr and jsonlite:

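library(httr)
# Request the SDMX-JSON endpoint directly
res <- GET("https://stats.oecd.org/sdmx-json/data/DP_LIVE/.HUR.TOT.PC_LF.M/OECD?json-lang=en&dimensionAtObservation=allDimensions&startPeriod=2016-08&endPeriod=2020-07")
# Read the body as text and parse the JSON into nested R lists and data frames
res <- jsonlite::fromJSON(content(res, as = "text", encoding = "UTF-8"))
res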

#> $header
#> $header$id
#> [1] "98b762f3-47aa-4e28-978a-a4a6f6b3995a"
#> 
#> $header$test
#> [1] FALSE
#> 
#> $header$prepared
#> [1] "2020-09-30T21:58:10.5763805Z"
#> 
#> $header$sender
#> $header$sender$id
#> [1] "OECD"
#> 
#> $header$sender$name
#> [1] "Organisation for Economic Co-operation and Development"
#> 
#> 
#> $header$links
#>                                                                                                                                                              href
#> 1 https://stats.oecd.org:443/sdmx-json/data/DP_LIVE/.HUR.TOT.PC_LF.M/OECD?json-lang=en&dimensionAtObservation=allDimensions&startPeriod=2016-08&endPeriod=2020-07
#>       rel
#> 1 request
#> 
#> 
#> $dataSets
#>        action observations.0:0:0:0:0:0 observations.0:0:0:0:0:1
#> 1 Information   5.600849, 0.000000, NA   5.645914, 0.000000, NA
...
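
The observation names in the output above (such as 0:0:0:0:0:1) look like colon-separated, zero-based positions into the dimension lists that SDMX-JSON usually keeps under structure$dimensions$observation. Below is a minimal sketch of decoding them into a long table, assuming that layout. To stay runnable offline it parses a small hand-written sample instead of the live response: sample_json, its dimension values and all of its numbers are placeholders for illustration, not OECD figures, and decode_obs is a hypothetical helper name.

```r
library(jsonlite)

# Hand-written stand-in for the API response (assumed SDMX-JSON layout with
# dimensionAtObservation=allDimensions). All numbers are placeholders.
sample_json <- '{
  "dataSets": [{
    "observations": {
      "0:0:0:0:0:0": [6.1, 0, null],
      "0:0:0:0:0:1": [6.0, 0, null],
      "1:0:0:0:0:0": [5.5, 0, null],
      "1:0:0:0:0:1": [5.4, 0, null]
    }
  }],
  "structure": {
    "dimensions": {
      "observation": [
        {"id": "LOCATION", "values": [{"id": "OECD"}, {"id": "NLD"}]},
        {"id": "INDICATOR", "values": [{"id": "HUR"}]},
        {"id": "SUBJECT", "values": [{"id": "TOT"}]},
        {"id": "MEASURE", "values": [{"id": "PC_LF"}]},
        {"id": "FREQUENCY", "values": [{"id": "M"}]},
        {"id": "TIME_PERIOD", "values": [{"id": "2016-08"}, {"id": "2016-09"}]}
      ]
    }
  }
}'

# Keep plain nested lists so the structure is easy to walk by position
parsed <- fromJSON(sample_json, simplifyVector = FALSE)

dims    <- parsed$structure$dimensions$observation
dim_ids <- vapply(dims, function(d) d$id, character(1))
obs     <- parsed$dataSets[[1]]$observations

# Turn one key such as "1:0:0:0:0:1" plus its value array into a one-row data frame
decode_obs <- function(key, value) {
  # Key positions are zero-based, R lists are one-based
  idx    <- as.integer(strsplit(key, ":", fixed = TRUE)[[1]]) + 1L
  labels <- mapply(function(d, i) d$values[[i]]$id, dims, idx)
  row    <- as.data.frame(as.list(setNames(labels, dim_ids)),
                          stringsAsFactors = FALSE)
  row$value <- value[[1]]  # first element of the array is the observation value
  row
}

rows <- unname(Map(decode_obs, names(obs), obs))
long <- do.call(rbind, rows)

# Keep only the two rows of interest and the columns needed for the table
wanted <- long[long$LOCATION %in% c("OECD", "NLD"),
               c("LOCATION", "TIME_PERIOD", "value")]
print(wanted)
```

With the real response, the same steps should apply after replacing parsed with jsonlite::fromJSON(content(res, as = "text"), simplifyVector = FALSE), provided the dimension ids and location codes match what the API actually returns.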
answered Sep 14 '25 14:09 by Waldi