Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SAS: Extract ID's separated by dashes from text string

Tags:

delimiter

sas

I have a dataset that has one concatenated text field. I am trying to break it into three text columns in SAS 9.4.

Obs Var1
1   MAY12-KANSAS-ABCD6194-XY7199-BRULE
2   JAN32-OHIO-BZ5752-GARY

My output for observation 1 should look like this:

Obs   Date   State   ID
1     MAY12  KANSAS  ABCD6194-XY7199-BRULE

Here's what I have, which works for the Date and State. However, I can't get the third part (ID) to ignore the delimiter:

data have;
   input Var1 &$64.;
   cards;
MAY12-KANSAS-ABCD6194-XY7199-BRULE
JAN32-OHIO-BZ5752-GARY
;
run;
data need;
   length id $16;
   set have;
   date = scan(var1,1,'-','o');
   state = scan(var1,2,'-','o');
   id = scan(var1,3,'-','');
run;
like image 494
Ryan Avatar asked Feb 10 '23 21:02

Ryan


1 Answers

Another different approach to get a multi-delimiter-containing word is to use call scan. It will tell you the position (and length, which we ignore) of the nth word (and can be forwards OR backwards, so this gives you some ability to search in the middle of the string).

Implemented for this particular case it's very simple:

data want;
  set have;
  length 
    date $5
    state $20
    id $50
  ;
  date = scan(var1,1);
  state= scan(var1,2);
  call scan(var1,3,position,length);
  id = substr(var1,position);
run;

position and length are variables that call scan populates with values which we can then use. (We could have done date and state that way also, but it's more work than using the function.)

like image 187
Joe Avatar answered Feb 19 '23 23:02

Joe