R - Importing ASCII data using a .sas dictionary file and SAScii

Tags: r, ascii

I recently downloaded some data in ASCII format that came with SAS setup files, which I would like to use with R. One such data file is here:

https://dl.dropboxusercontent.com/u/8474088/Data.txt

with corresponding SAS setup file here:

https://dl.dropboxusercontent.com/u/8474088/Setup.sas

I should note that the setup file is designed to work with around 50 different data files all with similar structure (the link above is an example of one of these).

I thought I was in good shape after finding the SAScii package but have been unable to get read.SAScii or parse.SAScii to work with these files. Either command gives an error.

read.SAScii(data.file,setup.file,beginline=581)

Error in if (as.numeric(x[j, "start"]) > as.numeric(x[j - 1, "end"]) +  : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
NAs introduced by coercion 

parse.SAScii(setup.file,beginline=581)

Error in if (as.numeric(x[j, "start"]) > as.numeric(x[j - 1, "end"]) +  : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
NAs introduced by coercion 

The examples given in the SAScii documentation use much simpler setup files so I am wondering if the complexity of the above file is causing the issue (for example the information on VALUE listed in the file prior to the INPUT command).

Any thoughts on how to proceed would be great. Thanks in advance.

johnson-shuffle asked May 23 '13 22:05



1 Answer

As noted in the Details section of the parse.SAScii help, this package cannot read overlapping columns, and your file clearly has 'em. ;) For SAScii to work, you'll have to break the .sas file into four separate .sas files on your hard drive. Here's how:

# load all necessary libraries
library(stringr)
library(SAScii)
library(downloader)

# create two temporary files
tf <- tempfile()
tf2 <- tempfile()

# download the sas import script
download( "https://dl.dropboxusercontent.com/u/8474088/Setup.sas" , tf )

# download the actual data file
download( "https://dl.dropboxusercontent.com/u/8474088/Data.txt" , tf2 )

# read the sas importation instructions into R
z <- readLines( tf )

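# here are the break points: lines whose first non-blank character is '#'
z[ substr( str_trim( z ) , 1 , 1 ) == '#' ]

# store the line numbers of those break points
sas.script.breakpoints <- which( substr( str_trim( z ) , 1 , 1 ) == '#' )

# split the script into four pieces, starting at line 581
# (the same beginline used in the question);
# neighbouring pieces share their break-point line
script.one <- z[ 581:sas.script.breakpoints[1] ]
script.two <- z[ sas.script.breakpoints[1]:sas.script.breakpoints[2] ]
script.three <- z[ sas.script.breakpoints[2]:sas.script.breakpoints[3] ]
script.four <- z[ sas.script.breakpoints[3]:length(z) ]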

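# replace the break-point lines so each piece looks like a recognizable sas script:
# a leading break point becomes an input statement, a trailing one the closing semicolon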
script.one[ length( script.one ) ] <- ";"

script.two[ 1 ] <- "input blank 1-300"
script.two[ length( script.two ) ] <- ";"

script.three[ 1 ] <- "input blank 1-300"
script.three[ length( script.three ) ] <- ";"

script.four[ 1 ] <- "input blank 1-300"

# test then import data set one
writeLines( script.one , tf )
parse.SAScii( tf )
x1 <- read.SAScii( tf2 , tf )

# test then import data set two
writeLines( script.two , tf )
parse.SAScii( tf )
x2 <- read.SAScii( tf2 , tf )

# test then import data set three
writeLines( script.three , tf )
parse.SAScii( tf )
x3 <- read.SAScii( tf2 , tf )

# test then import data set four
writeLines( script.four , tf )
parse.SAScii( tf )
x4 <- read.SAScii( tf2 , tf )
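
The splitting step above can be tried without downloading anything. Here is a minimal sketch on a made-up character vector, using base R only (trimws is swapped in for stringr's str_trim). The toy lines and the names breaks, starts, ends and chunks are invented for illustration and are not taken from the real Setup.sas. The sketch only shows how the lines get cut up and patched; whether each resulting piece parses with parse.SAScii depends on the real file.

```r
# toy stand-in for the lines of a sas setup file (hypothetical content);
# lines starting with '#' mark where one column layout ends and the next begins
z <- c(
  "input" ,
  "  id 1-4" ,
  "  age 5-6" ,
  "  #2" ,
  "  id 1-4" ,
  "  income 5-12" ,
  "  #3" ,
  "  id 1-4" ,
  "  state 5-6" ,
  ";"
)

# line numbers of the break points
breaks <- which( substr( trimws( z ) , 1 , 1 ) == "#" )

# first and last line of every piece; neighbours share their break-point line
starts <- c( 1 , breaks )
ends <- c( breaks , length( z ) )

chunks <- lapply( seq_along( starts ) , function( i ) {
  chunk <- z[ starts[ i ]:ends[ i ] ]
  # every piece after the first begins on a '#' line: swap in an input statement
  if ( i > 1 ) chunk[ 1 ] <- "input blank 1-300"
  # every piece before the last ends on a '#' line: swap in the closing semicolon
  if ( i < length( starts ) ) chunk[ length( chunk ) ] <- ";"
  chunk
} )

# look at the pieces
chunks
```

Looping over the break points like this avoids hard-coding four pieces, which might help if some of the other setup files turn out to have a different number of break points.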
Anthony Damico answered Oct 04 '22 06:10
