I'm just learning SAS. This is a pretty simple question -- I'm probably overthinking it.
I have a data set called people_info and one of the variables is SocialSecurityNum. I have another table called invalid_ssn with a single variable: unique and invalid SocialSecurityNum observations.
I would like to have a DATA step (or PROC SQL step) that outputs to invalid_people_info if the SocialSecurityNum of the person (observation) matches one of the values in the invalid_ssn table. Otherwise, it will output back to people_info.
What's the best way to do this?
Edit: More info, to clarify...
people_info looks like this:
name SocialSecurityNum
joe 123
john 456
mary 876
bob 657
invalid_ssn looks like this:
SocialSecurityNum
456
876
What I want is for people_info to change (in place) and look like this:
name SocialSecurityNum
joe 123
bob 657
and a new table, called invalid_people_info to look like this:
name SocialSecurityNum
john 456
mary 876
The data step shown by Hong Ooi is great, but youou could also do this with proc sql without the need to sort first and also without actually doing a full merge.
proc sql noprint;
create table invalid_people_info as
select *
from people_info
where socialsecuritynum in (select distinct socialsecuritynum from invalid_ssn)
;
create table people_info as
select *
from people_info
where socialsecuritynum not in (select distinct socialsecuritynum from invalid_ssn)
;
quit;
This simply selects all rows where ssn is (not) in the distinct list of invalid ssn's.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With