 

What are some ways to mask a mysqldump?

Does anybody know an efficient way to mask (anonymize) some tables in a mysqldump? I have already finished my parser, but unfortunately it doesn't work well on big dumps (say, a dump of 1GB or more) because the parsing greatly increases the dump time.

What I did was parse the table columns first (which shouldn't take long) and then parse the whole INSERT string for a specific table.
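For reference, the line-by-line approach described above can be sketched roughly as follows. This is only a minimal sketch, not the parser from the question: the table name `users`, the column index, and the replacement literal are hypothetical, and it assumes plain mysqldump output (one `INSERT INTO ... VALUES (...),(...);` statement per line, no spaces between values, no `_binary` prefixes).

```ruby
# Sketch: mask chosen columns of one table, one dump line at a time.
# TABLE and MASKS are hypothetical; adjust them to the real schema.
TABLE = "users"
MASKS = { 1 => "'masked@example.com'" } # zero-based column index => SQL literal

# One SQL value: a quoted string (backslash or doubled-quote escapes)
# or an unquoted token such as a number or NULL.
VALUE = /'(?:[^'\\]|\\.|'')*'|[^,()']+/
# One parenthesised row of values, e.g. (1,'a,(b)',NULL).
ROW = /\((?:#{VALUE}|,)*\)/

# Replace the masked columns inside a single row.
def mask_row(row)
  values = row.scan(VALUE)
  MASKS.each { |index, literal| values[index] = literal if index < values.size }
  "(" + values.join(",") + ")"
end

# Rewrite INSERT lines for TABLE; return every other line unchanged.
def mask_line(line)
  return line unless line.start_with?("INSERT INTO `#{TABLE}` ")
  head, rows = line.split(" VALUES ", 2)
  return line if rows.nil?
  "#{head} VALUES #{rows.gsub(ROW) { |row| mask_row(row) }}"
end

# Stream from any IO to any IO so the whole dump is never held in memory.
def mask_stream(input, output)
  input.each_line { |line| output.write(mask_line(line)) }
end
```

Calling `mask_stream($stdin, $stdout)` at the end of a script would let it sit in a pipe between the uncompress step and the `mysql` client, so masking happens during the load rather than as a separate pass over the file.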

I am using Ruby and would like to keep using it if possible.

I also looked into the idea of importing the dump into MySQL, updating (masking) the data through Ruby code, and then exporting the dump again, although I haven't tried how long this is going to take.

The current workflow is: get the dump from a server, uncompress it, then load it into MySQL.

The new one would be: get the dump from a server, uncompress it, mask the confidential data, then load it into MySQL.

The current workflow takes at most 2 hours for a 1-2GB dump, but I have already spent 4 hours on the new one and the parsing/masking part is still not finished.

I was also advised to improve the code by removing variables and other things that consume more memory, since the Ruby GC is said to be not on a 1:1 ratio. I believe this is optimized in REE (Ruby Enterprise Edition), but I am already using REE.

Has anybody done this and could share their thoughts? Thanks.

poymode asked Dec 17 '22 19:12



2 Answers

Years later, but this might be useful for future searches (like mine). What you can do, if your structure doesn't change all the time, is abuse the custom WHERE option (-w) of mysqldump to inject SQL.

For example:

mysqldump -options -w "0=1 union select field1, 'constant',
anonymize(field3) from table" database table

This will, for a three-column table, produce a dump with the first column untouched, the second set to a constant value, and the third mangled by an arbitrary function.
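Since the question mentions Ruby, here is a rough sketch of building that command from a column list. The helper name `masked_dump_command` is made up for illustration, and `MD5()` merely stands in for the arbitrary `anonymize` function. It assumes every column is listed in table order, because the UNION must return the same number of columns as the dump's own select.

```ruby
# Sketch: build a mysqldump argument list that uses the --where trick.
# The helper name, column names and masking expressions are hypothetical.
def masked_dump_command(database, table, columns)
  # columns: ordered list of [column_name, sql_expression_or_nil].
  # A nil expression keeps the column untouched.
  select_list = columns.map { |name, expr| expr || "`#{name}`" }.join(", ")
  where = "0=1 UNION SELECT #{select_list} FROM `#{table}`"
  ["mysqldump", "--where=#{where}", database, table]
end

cmd = masked_dump_command("database", "table", [
  ["field1", nil],             # dumped as-is
  ["field2", "'constant'"],    # replaced by a constant
  ["field3", "MD5(`field3`)"], # mangled by a function
])
```

Passing the array to `system(*cmd)` runs mysqldump without going through a shell, so the WHERE text needs no extra quoting. Connection options such as user and host would still have to be added to the array.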

bartavelle answered Dec 28 '22 08:12



You can specify tables that you don't want to dump: http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html#option_mysqldump_ignore-table
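As a small sketch of how that could look from Ruby: `--ignore-table` takes a `db_name.tbl_name` argument and is repeated once per table. The helper name and the table names below are hypothetical.

```ruby
# Sketch: build a mysqldump argument list that skips confidential tables.
# The helper name and the table names are hypothetical.
def dump_command_ignoring(database, ignored_tables)
  # --ignore-table expects db_name.tbl_name and is given once per table.
  flags = ignored_tables.map { |table| "--ignore-table=#{database}.#{table}" }
  ["mysqldump", *flags, database]
end

skip_cmd = dump_command_ignoring("database", ["users", "payments"])
```

The skipped tables are left out of the dump entirely, so this only helps when the confidential tables are not needed on the target server.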

zerkms answered Dec 28 '22 07:12
