Question

I am trying to read an SPSS file into R using read.spss. It is a very large file (the World Values Survey), with about 67k entries.

Here is the code, with the warnings it produces:

> library(foreign)
> wvs = read.spss("C:/wvs2005_v20090901a.sav",to.data.frame=TRUE)
Warning messages:
1: In read.spss("C:/wvs2005_v20090901a.sav", to.data.frame = TRUE) :
C:/wvs2005_v20090901a.sav: Unrecognized record type 7, subtype 8 encountered in system file
2: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
3: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
4: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
5: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
6: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
7: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
8: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore
9: In `levels<-`(`*tmp*`, value = c("Missing; Unknown", "Not asked",  :
duplicated levels will not be allowed in factors anymore

Any insight is much appreciated.


Solution

Have you tried a different function to read the SPSS file? I found two.

From ?read.spss: "A different interface also based on the PSPP codebase is available in package ‘memisc’: see its help for ‘spss.system.file’."

Also, package Hmisc has a function spss.get, which provides "Enhanced Importing of SPSS files".

I recommend trying Hmisc::spss.get first.

OTHER TIPS

Recent versions of R have changed in a way that causes the warning about duplicated levels to be issued.

SPSS Statistics allows more than one value to have the same value label (generally you don't want to do this, but occasionally it is useful). R, when converting variables to factors, may use the value labels to define the factor levels, and duplicate labels then trigger this warning.

If you add use.value.labels=FALSE to your read.spss call, you won't get this message. Of course, you will then need to make the factors yourself, perhaps using levels= instead of labels= in factor().
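As a rough sketch of building a factor by hand: the vector x below is a made-up stand-in for a single column, and it assumes the value labels arrive as a "value.labels" attribute (label names mapped to codes), which is how read.spss is understood to attach them when use.value.labels=FALSE. The labels and codes are invented for illustration. Duplicate labels are made unique by appending the code, so no two codes collapse into one level.

```r
# Hypothetical stand-in for one imported column: numeric codes plus a
# "value.labels" attribute (names are labels, values are codes).
x <- c(1, 2, -1, -2, 2)
attr(x, "value.labels") <- c("Missing; Unknown" = -1,
                             "Missing; Unknown" = -2,
                             "Agree"            = 1,
                             "Disagree"         = 2)

vl   <- attr(x, "value.labels")
labs <- names(vl)

# Flag every label that occurs more than once and append its code,
# so each level name is unique.
dup <- duplicated(labs) | duplicated(labs, fromLast = TRUE)
labs[dup] <- paste0(labs[dup], " (", vl[dup], ")")

# levels= takes the codes, labels= the (now unique) display names.
f <- factor(x, levels = unname(vl), labels = labs)
table(f)
```

If you would rather merge codes that share a label, you could skip the uniqueness step, but check first that your R version accepts duplicated labels in factor().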

You may still get warning messages about unknown record 7 subtypes. R packages don't know how to interpret all the record 7 information, so it will just be lost. In many cases that is harmless, but you should double check your data to be sure.
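One way to see exactly which warnings an import raised, without the repeated ones scrolling past, is to collect them while the call runs. This is only a sketch in base R: collect_warnings is a hypothetical helper name, and the toy expression stands in for a real read.spss call.

```r
# Hypothetical helper: evaluate an expression, capture the warning
# messages it raises, and return them alongside the result.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # keep going after recording
    }
  )
  list(value = value, warnings = unique(msgs))
}

# Toy stand-in for the import: two identical warnings, then a result.
res <- collect_warnings({
  warning("Unrecognized record type 7")
  warning("Unrecognized record type 7")
  1 + 1
})
res$warnings
```

With a real file you would pass the read.spss call as the expression, then look at res$warnings for the distinct messages and at res$value for the imported data.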

SPSS Statistics can run R code, and it provides APIs that transfer data between Statistics and R correctly.

HTH, Jon Peck

I just remembered: often when I read in an SPSS file I get the same warnings, but the object returned by read.spss is still created and everything turns out fine.

I'm guessing you haven't inspected the object you called "wvs".

Again, try what I suggested before, but then call wvs as I have below:

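library(foreign)  # provides read.spss

# Keep the raw codes instead of converting value labels to factor levels
wvs <- read.spss("C:/wvs2005_v20090901a.sav", use.value.labels = FALSE,
                 to.data.frame = TRUE)
head(wvs)  # show only the first rows; the full data set is very large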

I edited from "wvs" to "head(wvs)" because the file is very large.

I had the exact same issue with data from the ESS site (European Social Survey), and solved it following a hint in read.spss help. Using package memisc instead, you can import a portable SPSS file like this:

data <- as.data.set(spss.portable.file("filename.por"))

Similarly, for .sav files:

data <- as.data.set(spss.system.file('filename.sav'))

although in this case some string values seem to go missing, whereas the portable import works seamlessly. The help page for spss.portable.file claims:

The importer mechanism is more flexible and extensible than read.spss and read.dta of package "foreign", as most of the parsing of the file headers is done in R. They are also adapted to load efficiently large data sets. Most importantly, importer objects support the labels, missing.values, and descriptions, provided by this package.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow