I have a dataset that has one concatenated text field. I am trying to break it into three text columns in SAS 9.4.
Obs Var1
1 MAY12-KANSAS-ABCD6194-XY7199-BRULE
2 JAN32-OHIO-BZ5752-GARY
My output for observation 1 should look like this:
Obs Date State ID
1 MAY12 KANSAS ABCD6194-XY7199-BRULE
Here's what I have, which works for the Date and State. However, I can't get the third part (ID) to ignore the delimiter:
data have;
input Var1 &$64.;
cards;
MAY12-KANSAS-ABCD6194-XY7199-BRULE
JAN32-OHIO-BZ5752-GARY
;
run;
data need;
length id $16;
set have;
date = scan(var1,1,'-','o');
state = scan(var1,2,'-','o');
id = scan(var1,3,'-','');
run;
Another different approach to get a multi-delimiter-containing word is to use call scan
. It will tell you the position (and length, which we ignore) of the nth word (and can be forwards OR backwards, so this gives you some ability to search in the middle of the string).
Implemented for this particular case it's very simple:
data want;
set have;
length
date $5
state $20
id $50
;
date = scan(var1,1);
state= scan(var1,2);
call scan(var1,3,position,length);
id = substr(var1,position);
run;
position
and length
are variables that call scan
populates with values which we can then use. (We could have done date and state that way also, but it's more work than using the function.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With