I need to take production data with real customer info (names, addresses, phone numbers, etc.) and move it into a dev environment, but I'd like to remove any semblance of real customer info.
Some of the answers to this question can help me generate NEW test data, but how do I then replace those columns in my production data while keeping the other relevant columns?
Let's say I had a table with 10000 fake names. Should I do a cross-join with a SQL update? Or do something like this:
UPDATE table SET lastname = (SELECT TOP 1 name FROM samplenames ORDER BY NEWID())
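One thing to watch with the TOP 1 ... ORDER BY NEWID() form: the subquery doesn't reference the outer row, so the engine can evaluate it once and give every row the same name. A way around that is to pick a name per row explicitly. Below is a minimal sketch of that idea in Python, using an in-memory SQLite database as a stand-in for the real server. The `customers` table, its `id` and `plan` columns, and the `replace_lastnames` helper are assumptions made for illustration; only `samplenames.name` and `lastname` come from the question.

```python
import random
import sqlite3


def replace_lastnames(conn, seed=None):
    """Give every customers row its own randomly chosen fake last name.

    Only the lastname column is written; all other columns are left alone.
    Returns the number of rows updated.
    """
    rng = random.Random(seed)
    fake_names = [row[0] for row in conn.execute("SELECT name FROM samplenames")]
    if not fake_names:
        raise ValueError("samplenames is empty")
    ids = [row[0] for row in conn.execute("SELECT id FROM customers")]
    # One independent draw per row, so rows do not all share a single name.
    updates = [(rng.choice(fake_names), customer_id) for customer_id in ids]
    conn.executemany("UPDATE customers SET lastname = ? WHERE id = ?", updates)
    conn.commit()
    return len(updates)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, lastname TEXT, plan TEXT)"
    )
    conn.execute("CREATE TABLE samplenames (name TEXT)")
    conn.executemany(
        "INSERT INTO customers (lastname, plan) VALUES (?, ?)",
        [("Smith", "gold"), ("Jones", "free")],
    )
    conn.executemany(
        "INSERT INTO samplenames VALUES (?)", [("Alpha",), ("Bravo",), ("Charlie",)]
    )
    replace_lastnames(conn)
    for row in conn.execute("SELECT id, lastname, plan FROM customers"):
        print(row)
```

With 10000 fake names, repeats across rows are possible but harmless for test data. The same per-row idea can be expressed in pure SQL by numbering both tables and joining on the row number.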
Data anonymization is done by creating a mirror image of a database and applying alteration strategies such as character shuffling, encryption, and term or character substitution. For example, the characters in a value may be replaced by a symbol such as “*” or “x”. This makes identification or reverse engineering difficult.
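As a rough illustration of the character-substitution strategy, here is a small Python sketch. The `mask_value` function and its `keep_last` option are hypothetical, made up for this example; it replaces letters and digits with a symbol and leaves punctuation in place so the value keeps its shape.

```python
def mask_value(value, symbol="*", keep_last=0):
    """Replace letters and digits with `symbol`, preserving punctuation.

    keep_last leaves that many trailing letters/digits unmasked, which is
    sometimes wanted for things like the last four digits of a phone number.
    """
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep_last, 0)
    masked = 0
    out = []
    for ch in value:
        if ch.isalnum() and masked < to_mask:
            out.append(symbol)
            masked += 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(mask_value("Smith"))
    print(mask_value("555-123-4567", keep_last=4))
```

Masking like this hides the value but still leaks its length and format, so it suits display fields better than columns that tests need to search or join on.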
Data anonymization seeks to protect private or sensitive data by deleting or encrypting personally identifiable information from a database. Data anonymization is done for the purpose of protecting an individual's or company's private activities while maintaining the integrity of the data gathered and shared.
Data anonymization is the process of protecting private or sensitive information by erasing or encrypting identifiers that connect an individual to stored data.
Anonymizing data can be tricky, and if it's not done correctly it can get you into trouble, as happened to AOL when they released search data a while back. I would try to create test data from scratch at all costs before converting existing customer data. Behavioral analysis, combined with data points you might not consider sensitive, can make it possible to figure out who the data belonged to. I would rather be safe than sorry.
This is easier than it sounds if you understand the database. One thing that is necessary is to understand the places where personal info is not normalized. For instance, the customer master file will have a name and address, but the order file will also have a name and address that might be different.
My basic process:
It doesn't look pretty, but it works.
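To illustrate the earlier point about personal info that is not normalized, here is a minimal sketch (not the process referred to above) that replaces names consistently wherever they are copied. It is written in Python against SQLite as a stand-in; the `customers.lastname` and `orders.ship_name` columns, the `name_map` temp table, and the `pseudonymize` helper are all assumptions for the example. Table and column names are interpolated into the SQL, so they must come from trusted code, never from user input.

```python
import random
import sqlite3


def pseudonymize(conn, columns, fake_names, seed=None):
    """Replace each distinct real value with one fake value in every column.

    columns is a list of (table, column) pairs that hold copies of the same
    kind of personal data. Returns the real -> fake mapping that was applied.
    """
    rng = random.Random(seed)
    real_values = set()
    for table, column in columns:
        query = f"SELECT DISTINCT {column} FROM {table} WHERE {column} IS NOT NULL"
        real_values.update(row[0] for row in conn.execute(query))
    mapping = {real: rng.choice(fake_names) for real in sorted(real_values)}

    # Stage the mapping in a temp table so each column is rewritten by a
    # single UPDATE; row-by-row updates could re-map a value twice if a fake
    # name happened to equal another real name.
    conn.execute("DROP TABLE IF EXISTS temp.name_map")
    conn.execute(
        "CREATE TEMP TABLE name_map "
        "(real_value TEXT PRIMARY KEY, fake_value TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO name_map VALUES (?, ?)", list(mapping.items()))
    for table, column in columns:
        conn.execute(
            f"UPDATE {table} SET {column} = "
            f"(SELECT fake_value FROM name_map WHERE real_value = {table}.{column}) "
            f"WHERE {column} IN (SELECT real_value FROM name_map)"
        )
    conn.execute("DROP TABLE temp.name_map")
    conn.commit()
    return mapping
```

A call might look like `pseudonymize(conn, [("customers", "lastname"), ("orders", "ship_name")], fake_names)`. The returned mapping links real values to fake ones, so it should be discarded rather than copied into the dev environment.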