
On the fly anonymisation of a MySQL dump

I am using mysqldump to create DB dumps of the live application to be used by developers.

The dumps contain customer data. I want to anonymize it, i.e. remove customer names and credit card data.

An option would be:

  • create a copy of the database (create dump and import dump)
  • fire SQL queries that anonymize the data
  • dump the new database
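
For reference, the three steps above might look roughly like this in bash. This is only a sketch: the function name, the `customers` table, and its `name` and `credit_card` columns are hypothetical, and connection credentials are assumed to come from `~/.my.cnf`.

```shell
#!/usr/bin/env bash
# Sketch of the copy / anonymize / dump workflow.
# The customers table and its columns are hypothetical placeholders.
# The function body runs in a subshell so the strict-mode flags stay local.
dump_anonymized_copy() (
  set -euo pipefail
  live_db="$1" scratch_db="$2" out_file="$3"

  # 1. Copy the live database into a scratch database.
  mysql -e "DROP DATABASE IF EXISTS \`$scratch_db\`; CREATE DATABASE \`$scratch_db\`"
  mysqldump --single-transaction "$live_db" | mysql "$scratch_db"

  # 2. Overwrite the sensitive columns in the copy.
  mysql "$scratch_db" -e "UPDATE customers
      SET name = CONCAT('customer_', id),
          credit_card = '0000000000000000'"

  # 3. Dump the anonymized copy.
  mysqldump "$scratch_db" > "$out_file"
)
```

A call such as `dump_anonymized_copy live_db scratch_db anon.sql` would run all three steps; the second full copy of the data is where the overhead comes from.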

But this has too much overhead. A better solution would be to do the anonymization during dump creation.

I guess I would end up parsing all the mysqldump output? Are there any smarter solutions?
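
To make the "parse the dump output" idea concrete, here is a deliberately crude stream filter that rewrites anything resembling an email address as the dump passes through a pipe. The function name, the regular expression, and the replacement address are all assumptions; it knows nothing about tables or columns, so it cannot handle names or card numbers.

```shell
# Crude stream filter: replace anything that looks like an email address.
# The regex and the replacement address are assumptions, not a full PII scrubber.
scrub_emails() {
  sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/user@example.invalid/g'
}

# Usage (some_db is a placeholder):
#   mysqldump some_db | scrub_emails | gzip > anon.sql.gz
```

Because every address becomes the same value, a filter like this would break a UNIQUE index on an email column when the dump is imported, which is one reason a column-aware approach is preferable.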

Alex asked Jan 07 '13 15:01


3 Answers

You can try Myanon: https://myanon.io

Anonymization is done on the fly during dump:

mysqldump | myanon -f db.conf | gzip > anon.sql.gz

Pierre POMES answered Sep 21 '22 02:09


Why are you selecting from your tables if you want to randomize the data?

Do a mysqldump of the tables that are safe to dump (configuration tables, etc) with data, and a mysqldump of your sensitive tables with structure only.

Then, in your application, you can construct the INSERT statements for the sensitive tables based on your randomly created data.
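
A possible way to produce the two dumps described above, using mysqldump's `--ignore-table` and `--no-data` options. The function name and the table names in the usage line are placeholders, not something from the original setup.

```shell
#!/usr/bin/env bash
# Dump safe tables with data and sensitive tables as structure only.
# Usage: dump_split DB SENSITIVE_TABLE [SENSITIVE_TABLE...]
dump_split() {
  if [ "$#" -lt 2 ]; then
    echo "usage: dump_split DB SENSITIVE_TABLE..." >&2
    return 1
  fi
  local db="$1"; shift
  local ignore=() t
  for t in "$@"; do
    ignore+=("--ignore-table=${db}.${t}")
  done

  # Everything except the sensitive tables, with data.
  mysqldump "${ignore[@]}" "$db"
  # Sensitive tables: CREATE TABLE statements only, no rows.
  mysqldump --no-data "$db" "$@"
}

# Example (placeholder names):
#   dump_split some_db customers payments > dev_dump.sql
```

The sensitive tables arrive empty on the developer side, so the application or a seeding script still has to fill them with generated rows.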

Colin M answered Sep 20 '22 02:09


I had to develop something similar a few days ago. I couldn't use INTO OUTFILE because the database runs on AWS RDS. I ended up with this approach:

Dump data in tabular text form from some table:

mysql -B -e 'SELECT `address`.`id`, "address1" , "address2", "address3", "town", "00000000000" as `contact_number`, "[email protected]" as `email` FROM `address`' some_db > addresses.txt

And then to import it:

mysql --local-infile=1 -e "LOAD DATA LOCAL INFILE 'addresses.txt' INTO TABLE \`address\` FIELDS TERMINATED BY '\t' ENCLOSED BY '\"' IGNORE 1 LINES" some_db

Only the mysql command-line client is required to do this.

The export is pretty quick (a couple of seconds for ~30,000 rows); the import is a bit slower, but still fine. I had to join a few tables along the way and there were some foreign keys, so it will surely be faster if you don't need that. Disabling foreign key checks while importing will also speed things up.
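
One way to disable foreign key checks for the import is to set the `FOREIGN_KEY_CHECKS` session variable in the same client session that runs `LOAD DATA`. The wrapper function below is a hypothetical sketch; the database, table, and file names come from the example above.

```shell
#!/usr/bin/env bash
# Import a tab-separated file with foreign key checks off for this session only.
# Usage: import_without_fk_checks DB TABLE FILE
import_without_fk_checks() {
  local db="$1" table="$2" file="$3"
  # All three statements run in one session, so the setting applies to the load.
  mysql --local-infile=1 "$db" <<SQL
SET FOREIGN_KEY_CHECKS = 0;
LOAD DATA LOCAL INFILE '$file' INTO TABLE \`$table\`
  FIELDS TERMINATED BY '\t' ENCLOSED BY '"' IGNORE 1 LINES;
SET FOREIGN_KEY_CHECKS = 1;
SQL
}

# Example: import_without_fk_checks some_db address addresses.txt
```

The variable is session-scoped, so other connections keep enforcing foreign keys while the import runs.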

matiangul answered Sep 21 '22 02:09