I have a package that scrapes data from the internet and displays parts of it depending on which function is called. Recently I got a message from CRAN that the data becomes stale when a binary build is installed, because the scraping code sits in utils.R and therefore ran once, while the package was being built.
For the past few days I've tried using <<-, without success: it generates the CRAN note below, and several answers I read advise against that approach anyway.
Note: no visible binding for global variable
These are the current package files: https://github.com/amrrs/tiobeindexr/tree/master/R
Tried solution, in the zzz.r file:
.onLoad <- function (libname, pkgname)
{
  assign("newEnv", new.env(hash = TRUE, parent = parent.frame()))
  newEnv$.all_tablesx789 <- rvest::html_table(xml2::read_html('https://www.tiobe.com/tiobe-index/'))
}
One of the functions in the core code:
hall_of_fame <- function() {
  #check_data()
  #.GlobalEnv$.all_tablesx789 <- check_data()
  newEnv$.all_tablesx789[[4]]
}
The package builds fine, but the object is not found. Error below:
Error in hall_of_fame() : object 'newEnv' not found
I have only a couple of days to save my package on CRAN, and I hope I've provided enough detail to keep this question from being downvoted.
Thanks!
Consider adding memoise as a dependency: it gives you in-session caching for free with a minimal dependency chain. Then combine it with a package environment and (just for fun) an active binding.
Create a new 📦 env (you can stick this in, say, aaa.R):
.pkgenv <- new.env(parent=emptyenv())
Now (say, in zzz.R) set up one function that does the table grabbing:
.get_tiboe_tables <- function(url) {
  message("Delete this since it's just to show caching works") # delete this
  content <- xml2::read_html(url)
  rvest::html_table(content)
}
And "memoise" it (again, in zzz.R):
get_tiboe_tables <- memoise::memoise(.get_tiboe_tables)
Now, create an active binding which will let us access the tables like a variable (i.e. w/o the ()). It's more "fun" than necessary (again, in zzz.R):
makeActiveBinding(
  sym = "all_tables",
  fun = function() get_tiboe_tables('https://www.tiobe.com/tiobe-index/'),
  env = .pkgenv
)
Now, get the value like this (notice we get the "loading" message as it "primes" the cache):
str(.pkgenv$all_tables, 1)
## Delete this since it's just to show caching works ** the loading msg
## List of 4
## $ :'data.frame': 20 obs. of 6 variables:
## $ :'data.frame': 30 obs. of 3 variables:
## $ :'data.frame': 15 obs. of 8 variables:
## $ :'data.frame': 15 obs. of 2 variables:
On subsequent calls there is no loading message since it's retrieving the cached value:
str(.pkgenv$all_tables, 1)
## List of 4
## $ :'data.frame': 20 obs. of 6 variables:
## $ :'data.frame': 30 obs. of 3 variables:
## $ :'data.frame': 15 obs. of 8 variables:
## $ :'data.frame': 15 obs. of 2 variables:
In the next R session the tables are fetched afresh, so users get fresh data without the package abusing the site. You can also control the file load order with the Collate field in DESCRIPTION instead of relying on sorted file names such as aaa.R and zzz.R.
Note that you can export the active binding as well and your 📦 users can then use it like a variable instead of calling it like a function.
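To see the caching behaviour without touching the network, here is a minimal base-R sketch. Everything in it is a stand-in: `fake_fetch()` replaces the real scraper, `cache_once()` is a hand-rolled substitute for `memoise::memoise()` (not its actual implementation), and `.demo_env` plays the role of `.pkgenv`.

```r
# Hypothetical environment standing in for the package environment.
.demo_env <- new.env(parent = emptyenv())
.demo_env$fetch_count <- 0L

# Stand-in for the scraper: counts how often it really runs and returns
# placeholder data instead of downloading anything.
fake_fetch <- function() {
  .demo_env$fetch_count <- .demo_env$fetch_count + 1L
  list(data.frame(language = c("placeholder A", "placeholder B"), rank = 1:2))
}

# Cache-once wrapper: calls f() on first use, then reuses the stored value
# for the rest of the session.
cache_once <- function(f) {
  cache <- new.env(parent = emptyenv())
  function() {
    if (!exists("value", envir = cache, inherits = FALSE)) {
      cache$value <- f()
    }
    cache$value
  }
}

cached_fetch <- cache_once(fake_fetch)

# Active binding: reading .demo_env$all_tables calls cached_fetch().
makeActiveBinding("all_tables", cached_fetch, .demo_env)

first  <- .demo_env$all_tables   # runs fake_fetch()
second <- .demo_env$all_tables   # served from the cache
.demo_env$fetch_count            # number of real fetches so far
```

Because the cache lives only in memory, a new R session starts empty again, which matches the refresh behaviour described above.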
Actually, I took a slightly different approach from the answer above. This follows Thomas' comment: I didn't want to add memoise as a dependency, so I tried an alternative.
aaa.R:
.pkgenv <- new.env(parent=emptyenv())
.onAttach() in zzz.R:
.onAttach <- function(libname, pkgname) {
  packageStartupMessage("Downloading TIOBE Index Data using your Internet...")
  tryCatch({
    .pkgenv$.get_tiboe_tables <- rvest::html_table(xml2::read_html("https://www.tiobe.com/tiobe-index/"))
  },
  error = function(e){
    packageStartupMessage("Downloading TIOBE Index data failed!")
    packageStartupMessage("Error Message:")
    packageStartupMessage(e)
    return(NA)
  })
}
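When the download fails, the tryCatch above leaves .pkgenv$.get_tiboe_tables unset, so functions that read it probably want a guard. Below is a rough sketch of how hall_of_fame() from the question might look under this approach. The error wording and the placeholder data are my own assumptions, and the [[4]] index is simply carried over from the question.

```r
# Same package environment as in aaa.R.
.pkgenv <- new.env(parent = emptyenv())

# Hypothetical accessor: reads the tables stored by .onAttach() and fails
# with a clear message if they were never downloaded.
hall_of_fame <- function() {
  tables <- .pkgenv$.get_tiboe_tables   # NULL if the download failed
  if (is.null(tables)) {
    stop("TIOBE data is not available; the download at attach time failed.",
         call. = FALSE)
  }
  tables[[4]]
}

# Simulate a failed download: nothing stored yet, so we get the error text.
failure_msg <- tryCatch(hall_of_fame(), error = function(e) conditionMessage(e))

# Simulate a successful download with placeholder tables (not real data).
.pkgenv$.get_tiboe_tables <- list(
  data.frame(), data.frame(), data.frame(),
  data.frame(Year = 2000L, Winner = "placeholder")
)
result <- hall_of_fame()
```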
My earlier mistake, it seems, was trying to create the new environment inside .onLoad() itself.
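A small sketch of why that fails, using made-up names (`broken_setup`, `working_setup`, `.pkgenv_demo`): assign() without an `envir` argument writes into the frame of the function that calls it, so the environment vanishes when that function returns. Creating the environment at the top level of a file in R/ keeps it around, and a hook only has to fill it.

```r
# Mirrors the question's .onLoad(): the new environment is bound to a
# variable that is local to the function.
broken_setup <- function() {
  assign("demo_env_local", new.env())
  exists("demo_env_local", inherits = FALSE)   # visible inside the function
}
visible_inside  <- broken_setup()
visible_outside <- exists("demo_env_local")    # gone once the function returns

# Mirrors the fix: the environment is created outside any function,
# and the setup function only stores data in it.
.pkgenv_demo <- new.env(parent = emptyenv())
working_setup <- function() {
  .pkgenv_demo$tables <- list("placeholder")   # placeholder, not real data
  invisible(NULL)
}
working_setup()
stored <- .pkgenv_demo$tables
```

The `$<-` assignment inside `working_setup()` modifies the shared environment in place, so no `<<-` is needed and the "no visible binding" note does not come up.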