Change the encoding of SPSS data files

https://stackoverflow.com/questions/19121954

30-06-2022

Question

I have a mixture of SPSS data files: some use the English (windows-1252) encoding and others use the Turkish (windows-1254) encoding. Is it possible to change the encoding of an SPSS file? For example, I would like to change the encoding of all data files to Turkish (windows-1254).

I know the SET LOCALE command. I can change the locale before opening a file, but it is not possible to change the locale while a data file is open. So I do not see a way to save a data file in a different encoding using SPSS syntax.

* Change SPSS locale to Turkish (windows-1254).
* Only for SPSS 13.0 and above.
new file.
set locale = tr_TR.
show locale.

* Change SPSS locale to English (windows-1252).
* Only for SPSS 13.0 and above.
new file.
set locale = en_US.
show locale.

The files have variable and value labels defined. These must be kept.

Unicode is not an option here. The data files have to stay compatible with old SPSS versions below version 16 (when Unicode support was introduced in SPSS). In any case, I have no idea how I would re-encode all the data files in Unicode.

Solution

You can't have a file with a mixed encoding. If your data happen to contain only characters that sit at the same positions in both code pages, you might be able to get away without Unicode, but it would be risky.
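To see which characters "conflict", it can help to list the byte values that the two code pages interpret differently. The following is a minimal sketch in Python (not SPSS syntax), assuming Python's built-in `cp1252` and `cp1254` codecs match the windows-1252 and windows-1254 encodings mentioned in the question. The function names are made up for this illustration.

```python
def decode_byte(value, codec):
    """Decode a single byte with the given codec; return None if the byte is undefined there."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def codepage_conflicts(codec_a="cp1252", codec_b="cp1254"):
    """Map each byte value that the two codecs interpret differently to (char_a, char_b)."""
    conflicts = {}
    for value in range(256):
        char_a = decode_byte(value, codec_a)
        char_b = decode_byte(value, codec_b)
        if char_a != char_b:
            conflicts[value] = (char_a, char_b)
    return conflicts


if __name__ == "__main__":
    # Print every byte position where the two code pages disagree.
    for value, (char_a, char_b) in sorted(codepage_conflicts().items()):
        print(f"0x{value:02X}: cp1252={char_a!r} cp1254={char_b!r}")
```

Text that uses only bytes outside this list (for example plain ASCII, or letters such as ç, ö and ü) reads the same under either locale. Text that uses the Turkish letters Ğ, İ, Ş, ğ, ı, ş or the Western letters Ð, Ý, Þ, ð, ý, þ does not.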

What you could try is to set Unicode on and your locale to one of the two. Read the file that uses that encoding and save it. It will now be in Unicode. Then close the data, set the other locale, read the second file, and save it as well. Now merge the two files. You have a Unicode file in which all the characters are correct. Open Statistics, set Unicode off, set the locale you want to make primary, and open the merged file. If you are lucky, all the characters will work, but if there are conflicts, Unicode would be your only option.
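Whether you will be "lucky" in the last step depends on whether every character in the merged data exists in the target code page. A rough way to check a piece of text outside SPSS is sketched below in Python, assuming the built-in `cp1254` codec stands in for the Turkish locale. The helper name `unencodable_chars` is hypothetical, and this checks strings only, not SPSS data files.

```python
def unencodable_chars(text, codec="cp1254"):
    """Return the distinct characters in text that the target code page cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # Record each problem character once, in order of first appearance.
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    # Sample labels made up for illustration.
    for label in ["Müşteri", "Þórr ð"]:
        print(label, unencodable_chars(label))
```

An empty result means the text fits the target code page. Any character that is returned would be lost or altered when the Unicode file is opened with Unicode off under that locale.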

Licensed under: CC-BY-SA with attribution
