
Directly loading .RData from github

Tags: url, github, r, load

I want to load PakPMICS2018bh.RData from https://github.com/myaseen208/PakPMICS2018Data/. I used the following code, which throws an error:

library(RCurl)
PakPMICS2018bhURL <- "https://github.com/myaseen208/PakPMICS2018Data/raw/master/PakPMICS2018bh.RData"
load(url(PakPMICS2018bhURL))

Error in load(url(PakPMICS2018bhURL)) : the input does not start with a magic number compatible with loading from a connection

I am wondering what is wrong with my code. Any help, please.

MYaseen208 asked Oct 16 '22 14:10


2 Answers

The problem is not in your code; it should work fine. For example, this loads an .rda file from GitHub without error:

load(url("https://github.com/mawp/spict/raw/master/spict/data/pol.rda"))

The problem comes from the file you are trying to open: it was saved with serialization format 3, which was introduced in R 3.5.0, using save(version = 3). The R release notes for that version say:

R has new serialization format (version 3) which supports custom serialization of ALTREP framework objects. These objects can still be serialized in format 2, but less efficiently. Serialization format 3 also records the current native encoding of unflagged strings and converts them when de-serialized in R running under different native encoding. Format 3 comes with new serialization magic numbers (RDA3, RDB3, RDX3). Format 3 can be selected by version = 3 in save(), serialize() and saveRDS(), but format 2 remains the default for all serialization and saving of the workspace. Serialized data in format 3 cannot be read by versions of R prior to version 3.5.0.
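The magic numbers mentioned in the quote can be inspected directly. This is a minimal sketch that works on temporary files only; the object `x` and the helper `rdata_magic` are made up for illustration, and it assumes R >= 3.5.0 so that `version = 3` is accepted.

```r
# Hypothetical helper: return the first 5 bytes of a file as text.
rdata_magic <- function(path) {
  readChar(path, nchars = 5L, useBytes = TRUE)
}

x <- 1:10
f2 <- tempfile(fileext = ".RData")
f3 <- tempfile(fileext = ".RData")

# compress = FALSE so the header is readable without decompressing first
save(x, file = f2, version = 2, compress = FALSE)
save(x, file = f3, version = 3, compress = FALSE)

magic2 <- rdata_magic(f2)  # header of the format 2 file
magic3 <- rdata_magic(f3)  # header of the format 3 file
print(c(magic2, magic3))
```

The two headers should differ only in the version digit (RDX2 versus RDX3, each followed by a newline). With a compressed file the first bytes are the compressor's own signature instead, so the header is only visible after decompression.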

EDIT

After some more research, I think it is a bug (or a feature?). For files saved with the compress argument of save() set to FALSE, TRUE or "gzip", the code works as expected in R >= 3.5. But with compress = "xz", which seems to be your case, it does not work.
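To check which compression a given file uses, one can look at its leading bytes, since gzip, bzip2 and xz each start with a fixed signature. The helper `detect_compression` below is a hypothetical name, and the sketch runs on temporary files rather than on the file from the question (that one would have to be downloaded first).

```r
# Hypothetical helper: guess the compression of a file from its leading bytes.
detect_compression <- function(path) {
  bytes <- readBin(path, what = "raw", n = 6L)
  if (length(bytes) >= 2L && identical(bytes[1:2], as.raw(c(0x1f, 0x8b)))) {
    "gzip"
  } else if (length(bytes) >= 3L && identical(bytes[1:3], charToRaw("BZh"))) {
    "bzip2"
  } else if (length(bytes) >= 6L &&
             identical(bytes, as.raw(c(0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00)))) {
    "xz"
  } else {
    "none or unknown"
  }
}

x <- 1:10
gz_file <- tempfile(fileext = ".RData")
xz_file <- tempfile(fileext = ".RData")
save(x, file = gz_file, compress = "gzip")
save(x, file = xz_file, compress = "xz")

print(detect_compression(gz_file))
print(detect_compression(xz_file))
```

This only identifies the outer compression layer; it says nothing about the serialization format inside.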

There are two options: either save the files with gzip compression, or use the workaround from @user113156's answer.
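The first option could look roughly like this for whoever maintains the repository. It is a sketch, not a tested fix for the file in the question: `resave_as_gzip` is a hypothetical helper name and the paths are placeholders.

```r
# Hypothetical helper: load an .RData file from disk and write the same
# objects back out with gzip compression.
resave_as_gzip <- function(infile, outfile) {
  env <- new.env()
  # load() from a file path (not a url connection) can read xz-compressed files
  objs <- load(infile, envir = env)
  save(list = objs, envir = env, file = outfile, compress = "gzip")
  invisible(objs)
}

# Demonstration on a temporary xz-compressed file
x <- 1:10
xz_in <- tempfile(fileext = ".RData")
gz_out <- tempfile(fileext = ".RData")
save(x, file = xz_in, compress = "xz")
resaved <- resave_as_gzip(xz_in, gz_out)
print(resaved)
```

The tools package also ships resaveRdaFiles(), which takes a compress argument and may be more convenient for re-saving several files at once.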

alko989 answered Oct 19 '22 00:10


You can try this. Just make sure you set your working directory first.

setwd("SET YOUR Working Directory - the file will download here")
destfile <- file.path(getwd(), "PakPMICS2018bh.RData")
# Download only once; mode = "wb" keeps the binary file intact on Windows
if (!file.exists(destfile)) {
  download.file("https://github.com/myaseen208/PakPMICS2018Data/raw/master/PakPMICS2018bh.RData", destfile, mode = "wb")
}
# Load from the local copy, whether or not it was just downloaded
load(destfile)
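If you would rather not change the working directory or keep a copy of the file, a variant of the same idea downloads to a temporary file and loads into a separate environment. This is a sketch under assumptions: `load_rdata_from_url` is a made-up helper name, and the usage lines are commented out because they need network access.

```r
# Hypothetical helper: download an .RData file to a temporary file and load it
# into its own environment, so nothing in the workspace is overwritten.
load_rdata_from_url <- function(url, envir = new.env()) {
  tmp <- tempfile(fileext = ".RData")
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary copy afterwards
  download.file(url, tmp, mode = "wb", quiet = TRUE)
  load(tmp, envir = envir)
  envir
}

# Usage (needs network access):
# pmics <- load_rdata_from_url(
#   "https://github.com/myaseen208/PakPMICS2018Data/raw/master/PakPMICS2018bh.RData"
# )
# ls(pmics)
```

Loading from the downloaded file rather than from url() sidesteps the compression issue, because load() on a file path detects the compression itself.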
user113156 answered Oct 19 '22 01:10