
How to encrypt and decrypt pandas dataframe with decryption key?

I have a df with 300 columns. One of them, ID, needs to be encrypted so that when I hand the df over as a CSV, anyone I give the key to can decrypt it.

Is this possible?

I know how to hash a column, but as far as I have read, a hash cannot be reversed, and I cannot give someone a key to unhash it.

Thank you in advance.

edit:

df

id
1
2
3

@Wen is this a good example:

(1:2), (2:3), (3:4)

new df

id
2
3
4
RustyShackleford asked Aug 31 '18 13:08


3 Answers

I'd recommend the Python itsdangerous library. Here is a quick example:

from itsdangerous import URLSafeSerializer

s = URLSafeSerializer('secret-key')

print(s.dumps([1, 2, 3, 4]))

# 'WzEsMiwzLDRd.wSPHqC0gR7VUqivlSukJ0IeTDgo'

print(s.loads('WzEsMiwzLDRd.wSPHqC0gR7VUqivlSukJ0IeTDgo'))

# [1, 2, 3, 4]

The secret key can be shared between you and the other trusted party so that they can load and verify the strings or columns.

This does rely on serialization, however, and some Python data types aren't easily serialized. If you just need a column name or something like that, this could work well.

I would like to add a qualification here that this process only obfuscates the data, but does not actually encrypt it. I did not fully understand that when I originally answered this question. This obfuscation may be enough for your needs, but please be aware! From the docs:

"The receiver can decode the contents and look into the package, but they can not modify the contents unless they also have your secret key." (itsdangerous docs)
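To make the caveat concrete, here is a minimal sketch that reads the token from the example above without the secret key, using only the standard library. It assumes the token has the form payload.signature and that the payload is URL-safe base64 of JSON, which holds for this particular token; the variable names are illustrative.

```python
import base64
import json

# Token copied from the example above; no secret key is used anywhere below.
token = "WzEsMiwzLDRd.wSPHqC0gR7VUqivlSukJ0IeTDgo"
payload, signature = token.split(".")

# URL-safe tokens drop the "=" padding, so restore it before decoding.
padded = payload + "=" * (-len(payload) % 4)
decoded = json.loads(base64.urlsafe_b64decode(padded))

print(decoded)  # [1, 2, 3, 4]
```

The signature part only lets the key holder detect tampering; it does nothing to hide the payload.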

Dan Safee answered Oct 17 '22 05:10


You can use cryptpandas.

As an example, if you have a pandas dataframe

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': ['one', 'one', 'four']})

you can encrypt it as

import cryptpandas as crp

crp.to_encrypted(df, password='mypassword123', path='file.crypt')

and decrypt it as

decrypted_df = crp.read_encrypted(path='file.crypt', password='mypassword123')

P.S. More info here.
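The example above encrypts the whole dataframe into one file. If only the id column should be protected while the rest stays a readable CSV, one possible sketch uses Fernet from the separate cryptography package. This is an alternative technique, not part of cryptpandas, and the column names and sample data are made up for illustration.

```python
import io

import pandas as pd
from cryptography.fernet import Fernet

# Illustrative data; only the "id" column is meant to be protected.
df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# The key is what you hand to the trusted party, separately from the CSV.
secret = Fernet.generate_key()
fernet = Fernet(secret)

# Encrypt each id; Fernet tokens are URL-safe base64, so they fit in a CSV cell.
df["id"] = df["id"].map(lambda v: fernet.encrypt(str(v).encode()).decode())
csv_text = df.to_csv(index=False)

# Recipient side: read the CSV and decrypt the column with the same key.
received = pd.read_csv(io.StringIO(csv_text))
received["id"] = received["id"].map(
    lambda token: int(fernet.decrypt(token.encode()).decode())
)
```

Fernet is randomized, so equal ids encrypt to different tokens each time. That hides duplicates, but it also means the encrypted column cannot be used for joins or group-bys until it is decrypted.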

Silvia Metelli answered Oct 17 '22 04:10


I think you can do it this way:

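import numpy as np

# Build the key: substitute code -> original id
key = dict(zip(np.arange(len(df)), df.id))
# Replace the original ids with the substitute codes
df.id = np.arange(len(df))
# For the person who does not have the key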

df
Out[640]:
   id
0   0
1   1
2   2


# For the person who has the key

df.id = df.id.map(key.get)

df
Out[642]: 
   id
0   1
1   2
2   3
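For the CSV hand-off described in the question, the lookup table has to travel separately from the data. Below is a rough sketch of one way that could look, assuming the key is shared as JSON; the sample ids and column names are invented for illustration. Note that this is a substitution table rather than encryption.

```python
import io
import json

import pandas as pd

# Illustrative data standing in for the real frame.
df = pd.DataFrame({"id": [101, 202, 303], "value": ["a", "b", "c"]})

# Key: substitute code -> original id (plain ints so json can serialize them).
key = {code: int(orig) for code, orig in enumerate(df["id"])}
df["id"] = list(range(len(df)))

# Share the masked CSV with everyone, and the key only with trusted parties.
csv_text = df.to_csv(index=False)
key_text = json.dumps(key)

# Recipient side: JSON turns dict keys into strings, so convert them back.
shared = pd.read_csv(io.StringIO(csv_text))
restored_key = {int(k): v for k, v in json.loads(key_text).items()}
shared["id"] = shared["id"].map(restored_key)
```

Anyone without the key file sees only the codes 0, 1, 2 in the id column.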
BENY answered Oct 17 '22 05:10