Question

My knowledge of SAS is nonexistent, and I usually work in R and Stata. Recently I downloaded a dataset that is publicly available from the Brazilian government, and for some reason they made it available in raw format with a SAS script to read it in:

DATA DOM (COMPRESS = YES);
INFILE "...\T_DOMICILIO_S.txt" LRECL = 164 MISSOVER;

INPUT  @001 TIPO_REG    $2.   /* TIPO DE REGISTRO  */
       @003 COD_UF      $2.   /* CÓDIGO DA UF      */
       @005 NUM_SEQ     $3.   /* NÚMERO SEQUENCIAL */
       @008 NUM_DV      $1.   /* DV DO SEQUENCIAL  */                

…Etc etc…

RUN;

Is it possible to "translate" this statement into an equivalent for R? If so, which function should I be looking for?

Solution

There's an app for that! Well, an R package, anyway: SAScii, brought to you by the indomitable Anthony Damico. It has two functions: parse.SAScii and read.SAScii. I've used it with great success on US government (CDC) files.

install.packages("SAScii")
library(SAScii)

> parse.SAScii("test.sas")
   varname width char divisor
1 TIPO_REG     2 TRUE       1
2   COD_UF     2 TRUE       1
3  NUM_SEQ     3 TRUE       1
4   NUM_DV     1 TRUE       1
Warning message:
In readLines(sas_ri) : incomplete final line found on 'test.sas'

You will then need read.SAScii for the second step, which reads the data file itself using that layout, but you did not provide a sample data file to test it with.

The input file, 'test.sas', was:

DATA DOM (COMPRESS = YES);
INFILE "...\T_DOMICILIO_S.txt" LRECL = 164 MISSOVER;

INPUT  @001 TIPO_REG    $2.   /* TIPO DE REGISTRO  */
       @003 COD_UF      $2.   /* CÓDIGO DA UF      */
       @005 NUM_SEQ     $3.   /* NÚMERO SEQUENCIAL */
       @008 NUM_DV      $1.   /* DV DO SEQUENCIAL  */                

RUN;
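
For the second step, something along these lines should work. This is only a sketch: read_dom is a helper name I made up, the data path in the usage comment is a placeholder (the real path is elided in the question), and I am assuming read.SAScii takes the data file first and the SAS script second, mirroring how parse.SAScii takes the script.

```r
# Hypothetical wrapper (not part of SAScii): read a fixed-width data file
# using the column layout described by a SAS INPUT script.
read_dom <- function(data_file, sas_script) {
  # Fail early with a clear message if the package is not installed.
  if (!requireNamespace("SAScii", quietly = TRUE)) {
    stop("Please run install.packages('SAScii') first")
  }
  # read.SAScii parses the INPUT block of sas_script, then reads data_file
  # with the widths and column names it found there.
  SAScii::read.SAScii(data_file, sas_script)
}

# Usage (placeholder path; point it at your own copy of the raw file):
# dom <- read_dom("path/to/T_DOMICILIO_S.txt", "test.sas")
# str(dom)
```

I have not run this against the IBGE file, so check the first few rows against the raw text before trusting it.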

If you watch Anthony Damico's "twotorials" on YouTube or visit his website, you can see why I used the word "indomitable".

OTHER TIPS

the other responses to your question are better because they are more general. but you are asking specifically about ibge's pesquisa de orçamentos familiares, and i have already written code to import all of the 2002-2003 and 2008-2009 data directly into R without further ado. :) just follow the directions at the top of the scripts linked below, run the download script, and everything will be loaded into R correctly.

https://github.com/ajdamico/usgsd/tree/master/Pesquisa%20de%20Orcamentos%20Familiares

http://www.asdfree.com/search/label/pesquisa%20de%20orcamentos%20familiares%20%28pof%29

SAS has many more input options than R, so sometimes it may be difficult to make direct translations; but you might consider looking at the SAScii package to help you create a call to read.fwf
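
As a rough illustration of what such a read.fwf call could look like, here is a base-R sketch built from the four columns shown in the question (widths 2, 2, 3 and 1, taken from the $2. $2. $3. $1. informats). The two records are made up purely for the demo, and the temporary file stands in for the real data file. Reading everything as character keeps leading zeros in the codes.

```r
# Two invented 8-character records, laid out as
# TIPO_REG (2) | COD_UF (2) | NUM_SEQ (3) | NUM_DV (1).
raw_lines <- c("01110015", "01350027")

# Write them to a temporary file standing in for the real data file.
tmp <- tempfile(fileext = ".txt")
writeLines(raw_lines, tmp)

# Widths and names mirror the SAS INPUT statement; colClasses = "character"
# corresponds to the "$" (character) informats and preserves leading zeros.
dom <- read.fwf(
  tmp,
  widths     = c(2, 2, 3, 1),
  col.names  = c("TIPO_REG", "COD_UF", "NUM_SEQ", "NUM_DV"),
  colClasses = "character"
)

print(dom)
```

For the full file you would extend widths and col.names to cover every variable in the INPUT statement, which is exactly the bookkeeping that parse.SAScii does for you.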

Licensed under: CC-BY-SA with attribution