
R read.spss error importing SPSS .por file - "Bad character in time"

Tags:

r

I'm trying to import the NYPD stop-and-frisk data into R. The data is in SPSS .por files at http://www.nyc.gov/html/nypd/downloads/zip/analysis_and_planning/YYYY.zip, where YYYY is a year from 2003 to 2012.

Most of the files load fine, but the 2004, 2007, and 2008 files all give me this error:

> library(foreign)
> mydata= read.spss("2004.por", to.data.frame=TRUE)
Error in read.spss("2004.por", to.data.frame = TRUE) : 
  error reading portable-file dictionary
In addition: Warning message:
In read.spss("2004.por", to.data.frame = TRUE) : Bad character in time
Execution halted

Any suggestions on how to debug this? I realize that read.spss does not support the latest SPSS versions, but given that most of the files (7 out of 10) import properly I wonder whether it's something more subtle.

psppire loads all the files without complaint, but the data looks corrupted, with some fields seemingly combined with others, and binary data in some of the fields.

Captain Pedantic asked Dec 20 '13 06:12


1 Answer

I had some success using memisc, as recommended in the question "Read SPSS file into R". Install memisc first:

> install.packages('memisc')

You can read the data rather easily:

> library(memisc)
> data <- as.data.set(spss.portable.file('2004.por'))

While I haven't thoroughly inspected the data, it appears at first glance to be correct.
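Since most of the files load fine with read.spss and only a few fail, one option is to try read.spss first and fall back to memisc only when it errors. The following is a minimal sketch of that idea, not a tested import pipeline: read_with_fallback and por_readers are hypothetical names introduced here, and the file names in the usage comment assume the archives were unzipped to 2003.por through 2012.por in the working directory. Note that the memisc reader returns a data.set rather than a data frame.

```r
# Try each reader in turn and return the first result that succeeds.
# `readers` is a named list of functions that each take a file path.
# (read_with_fallback is a hypothetical helper, not part of any package.)
read_with_fallback <- function(path, readers) {
  for (name in names(readers)) {
    # An error in one reader is swallowed so the next one can be tried.
    result <- tryCatch(readers[[name]](path), error = function(e) NULL)
    if (!is.null(result)) {
      message(path, ": read with ", name)
      return(result)
    }
  }
  stop("no reader could import ", path)
}

# Readers for SPSS portable files: foreign first, memisc as the fallback.
# The packages are only touched when a reader is actually called.
por_readers <- list(
  foreign = function(p) foreign::read.spss(p, to.data.frame = TRUE),
  memisc  = function(p) memisc::as.data.set(memisc::spss.portable.file(p))
)

# Usage (assumed file names):
# files <- paste0(2003:2012, ".por")
# stops <- lapply(files, read_with_fallback, readers = por_readers)
```

The message printed for each file shows which reader handled it, which also gives a quick list of the files that read.spss rejects.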

icktoofay answered Dec 24 '22 03:12