I am using MySQL. Some of the tables contain sensitive data such as user names and email addresses. I want to dump the data, but with these columns either removed or replaced with fake data. Is there an easy way to do this?
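I'm using this approach:
1. Copy the data from each sensitive table into a separate table (for example, person into person_anonymous).
2. Obfuscate the sensitive columns in the copy.
3. Pass --ignore-table arguments to mysqldump.exe to leave the original tables out.
It preserves foreign key constraints, and you can keep columns that are not sensitive.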
The first two actions are contained in a stored procedure that I call before doing the dump. It looks something like this:
BEGIN
truncate table person_anonymous;
insert into person_anonymous select * from person;
update person_anonymous set Title=null, Initials=mid(md5(Initials),1,10), Midname=md5(Midname), Lastname=md5(Lastname), Comment=md5(Comment);
END
As you can see, I'm not clearing the contents of the fields. Instead, I keep a hash. That way, you can still see which rows have the same value, and between exports you can see if something changed or not, without anyone being able to read the actual values.
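To tie the steps of this approach together, here is a minimal sketch of how the dump call might be scripted. The database name (mydb), the procedure name (anonymize_person) and the table list are assumptions made for illustration, not names from the setup above. The commands are only printed, so nothing touches a real server until the echo is removed.

```shell
#!/bin/sh
# Hypothetical names: database "mydb", stored procedure "anonymize_person",
# sensitive table "person". Adjust these to your own schema.
DB="mydb"

# Build one --ignore-table=db.table option per original (sensitive) table.
build_ignore_args() {
  db="$1"; shift
  args=""
  for t in "$@"; do
    args="$args --ignore-table=$db.$t"
  done
  # Strip the leading space before printing.
  echo "${args# }"
}

IGNORE_ARGS=$(build_ignore_args "$DB" person)

# Printed rather than executed, so the sketch is safe to run as-is.
# Remove the "echo" to actually refresh the copy and take the dump.
echo mysql "$DB" -e "CALL anonymize_person()"
echo mysqldump $IGNORE_ARGS "$DB"
```

In real use, the mysqldump output would be redirected to a file, and connection options such as user and host would be added as needed. mysqldump expects each ignored table in the db_name.tbl_name form, with one --ignore-table option per table.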
There is a tool called Jailer that is typically used to export a subset of a database. We use this at work to create a smaller test database from a production backup, with all sensitive data obfuscated.
The GUI is a bit crude, but Jailer is the best alternative I have found so far. You can simply unselect the sensitive tables or columns and get a full copy of the rest. Jailer also supports obfuscating data during export: you could, for instance, MD5-hash all user names or change all email addresses to a single dummy address.
There is a tutorial to get you started.
ProxySQL is another approach.
Here is an article explaining how to obfuscate data with ProxySQL:
https://proxysql.com/blog/obfuscate-data-from-mysqldump