Question

I have a .csv file containing line breaks that I want to import into SAS, but I am running into problems with fields such as CUSTOMER whose values wrap across several lines (wrapped text). How can I work around this? Several other variables have the same issue. If I import the file manually, it works fine. An example is below; see SLN PJ0136 for the problem.

SLN     MOD PM  NE      CUSTOMER
32121   GG  1   1   AVAILABLE UPON REQUEST
71403   EN  1   0   JET SUPPORT SERVICE INC.
305173  EN  1   1   UNKNOWN / COTTONWOOD, LLC / J SUPPORT SERVICE, INC.
PJ0136  PS  0   0   "UNKNOWN / GROUP B-50 INC AA
                    TC0004   anada CSC Europe
                    Inglewood Ava" 
EB0162  RG  0   0   ATR

I used an INFILE statement to import the file:

DATA WORK.test1;
%let _EFIERR_ = 0; 
INFILE 'C:\Users\26631.IELPWC\Downloads\test.csv'
       delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

    INFORMAT
        SLN  $CHAR6. MOD $CHAR2. PM  BEST1.  NE BEST1. CUSTOMER $CHAR82. ;
    FORMAT
        SLN  $CHAR6.  MOD  $CHAR2. PM  BEST1.  NE   BEST1. CUSTOMER  $CHAR82. ;
    INPUT
        SLN $  MOD $  PM NE CUSTOMER $ ;

   if _ERROR_ then call symputx('_EFIERR_',1);
RUN;

This is the incorrect output I get:

32121   GG  1   1   AVAILABLE UPON REQUEST
71403   EN  1   0   JET SUPPORT SERVICE INC.
305173  EN  1   1   UNKNOWN / COTTONWOOD, LLC / J SUPPORT SERVICE, INC.
PJ0136  PS  0   0   "UNKNOWN / GROUP B-50 INC AA
TC0004      .   .   
24719       .   .   
"       .   .   
EB0162  RG  0   0   ATR

Solution

Assuming that your input data is in the following format:

SLN,MOD,PM,NE,CUSTOMER
32121,GG,1,1,AVAILABLE UPON REQUEST
71403,EN,1,0,JET SUPPORT SERVICE INC.
305173,EN,1,1,"UNKNOWN / COTTONWOOD, LLC / J SUPPORT SERVICE, INC."
PJ0136,PS,0,0,"UNKNOWN / GROUP B-50 INC AA
TC0004   anada CSC Europe
Inglewood Ava"
EB0162,RG,0,0,ATR

The following SAS code will produce the required output:

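data TEST (drop=_TMP_:);
  format SLN $6. MOD $2. PM 8. NE 8. CUSTOMER $82. _TMP_STR $100.;
  infile 'input.csv' truncover firstobs=2 dlm=',' dsd lrecl=10000;
  /* Read the fixed fields and the first piece of CUSTOMER; the trailing @ holds the line */
  input SLN MOD PM NE _TMP_STR @;
  _TMP_COUNT=0;
  /* Keep appending pieces until the number of double quotes seen so far is even */
  do until(mod(_TMP_COUNT, 2) = 0);
    /* Join the pieces with a line feed ('0A'x) */
    CUSTOMER=catx('0A'x, CUSTOMER, _TMP_STR);
    _TMP_COUNT=_TMP_COUNT + countc(_TMP_STR, '"');
    /* An odd count means a quoted value is still open, so read another piece */
    if mod(_TMP_COUNT, 2) then do;
      input _TMP_STR;
    end;
  end;
  /* Strip the enclosing double quotes from the assembled value */
  CUSTOMER=dequote(CUSTOMER);
run;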

Please note that the CUSTOMER value for the row where SLN='PJ0136' still contains line breaks (Unix-style line feeds, '0A'x). You can remove them by changing the first argument of catx(...) accordingly.
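
As a minimal sketch of that change, assuming you want the wrapped pieces joined by a single space rather than a line feed, only the separator passed to catx(...) needs to differ. The file name 'input.csv' and the variable names are the same as in the code above; the choice of a space as the separator is an assumption, and any other character could be used instead.

data TEST (drop=_TMP_:);
  format SLN $6. MOD $2. PM 8. NE 8. CUSTOMER $82. _TMP_STR $100.;
  infile 'input.csv' truncover firstobs=2 dlm=',' dsd lrecl=10000;
  input SLN MOD PM NE _TMP_STR @;
  _TMP_COUNT=0;
  do until(mod(_TMP_COUNT, 2) = 0);
    /* Assumption: join the pieces with a space instead of '0A'x */
    CUSTOMER=catx(' ', CUSTOMER, _TMP_STR);
    _TMP_COUNT=_TMP_COUNT + countc(_TMP_STR, '"');
    if mod(_TMP_COUNT, 2) then do;
      input _TMP_STR;
    end;
  end;
  CUSTOMER=dequote(CUSTOMER);
run;

With this variant the CUSTOMER value for SLN='PJ0136' should come out on a single line. Runs of blanks inside a piece (such as those after TC0004) are kept as they are, because catx only trims leading and trailing blanks from each argument.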

Licensed under: CC-BY-SA with attribution