I'm trying to import the NYPD stop-and-frisk data into R. The data is in SPSS .por files at http://www.nyc.gov/html/nypd/downloads/zip/analysis_and_planning/YYYY.zip, where YYYY is a year from 2003 to 2012.
Most of the files load fine, but the 2004, 2007, and 2008 files all give me this error:
> library(foreign)
> mydata= read.spss("2004.por", to.data.frame=TRUE)
Error in read.spss("2004.por", to.data.frame = TRUE) :
error reading portable-file dictionary
In addition: Warning message:
In read.spss("2004.por", to.data.frame = TRUE) : Bad character in time
Execution halted
Any suggestions on how to debug this? I realize that read.spss does not support the latest SPSS versions, but given that most of the files (7 out of 10) import properly I wonder whether it's something more subtle.
psppire loads all the files without complaint, but the data looks corrupted, with some fields seemingly combined with others, and binary data in some of the fields.
I had some success using memisc, as recommended in "Read SPSS file into R". Namely, after installing memisc:
> install.packages('memisc')
You can read the data rather easily:
> library(memisc)
> data <- as.data.set(spss.portable.file('2004.por'))
While I haven't thoroughly inspected the data, it appears at first glance to be right.
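Since only some of the files fail with read.spss, one option is to try foreign first and fall back to memisc only when it errors. The following is a minimal sketch, not something I have run against all ten files: `read_por` and its `primary`/`fallback` arguments are hypothetical names of my own, and I'm assuming that `as.data.frame` on a memisc data set gives a usable data frame for this data.

```r
# Hypothetical helper: try foreign::read.spss first and, if it raises an
# error (as it does for 2004.por), fall back to memisc.
# `primary` and `fallback` are illustrative arguments, not part of either package.
read_por <- function(path,
                     primary = function(p) {
                       foreign::read.spss(p, to.data.frame = TRUE)
                     },
                     fallback = function(p) {
                       # Assumption: converting the memisc data set to a data
                       # frame is acceptable for this data.
                       as.data.frame(memisc::as.data.set(memisc::spss.portable.file(p)))
                     }) {
  tryCatch(
    primary(path),
    error = function(e) {
      message("primary reader failed for ", path, ": ", conditionMessage(e))
      fallback(path)
    }
  )
}

# Example usage, assuming the files have been unzipped to 2003.por ... 2012.por
# in the working directory:
# files <- paste0(2003:2012, ".por")
# all_years <- lapply(files, read_por)
# names(all_years) <- 2003:2012
```

The message printed on fallback tells you which years went through memisc, so you know which data frames to check more carefully.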