Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash each value in a pandas data frame

In python, I am trying to find the quickest to hash each value in a pandas data frame.

I know any string can be hashed using:

hash('a string')

But how do I apply this function on each element of a pandas data frame?

This may be a very simple thing to do, but I have just started using python.

like image 748
user3664020 Avatar asked May 09 '15 19:05

user3664020


People also ask

How to get hash value of reach row in pandas Dataframe?

Show activity on this post. As of Pandas 0.20.1, you can use the little known (and poorly documented) hash_pandas_object ( source code) which was recently made public in pandas.util. It returns one hash value for reach row of the dataframe (and works on series etc. too)

How to loop over a Dataframe in pandas?

DataFrame Looping (iteration) with a for statement. You can loop over a pandas dataframe, for each column row by row. Below pandas.

How to get the column name of a Dataframe in pandas?

Using a DataFrame as an example. You can use the iteritems () method to use the column name (column name) and the column data (pandas. Series) tuple (column name, Series) can be obtained.

Can I have sensitive data in one column of pandas Dataframe?

Y our dataset can commonly contain sensitive data in one or more columns. For example, user IDs, patient numbers, or license numbers. Here I share how to create a new column containing hashed strings based on the clear-text strings of the other column of Pandas DataFrame.


1 Answers

Pandas also has a function to apply a hash function on an array or column:

import pandas as pd

df = pd.DataFrame({'a':['asds','asdds','asdsadsdas']})
df["hash"] = pd.util.hash_array(df["a"].to_numpy())

like image 53
bert wassink Avatar answered Sep 20 '22 15:09

bert wassink