I have asked a similar question in R about creating a hash value for each row of data. I know that I can use something like hashlib.md5(b'Hello World').hexdigest()
to hash a string, but how about a row in a DataFrame?
I have drafted my code as below:
import hashlib
for index, row in course_staff_df.iterrows():
    # hashlib needs bytes, so encode the stringified values before hashing
    temp_df.loc[index, 'hash'] = hashlib.md5(str(row[['cola', 'colb']].values).encode()).hexdigest()
This doesn't seem very Pythonic to me; is there a better solution?
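For reference, a loop-free sketch of the same MD5-per-row idea, assuming cola and colb are the columns to hash in course_staff_df and that temp_df shares its index (both names come from the draft above; the '|' separator is an arbitrary choice):

import hashlib
# Stringify and concatenate the selected columns row by row, then MD5 the result
row_strings = course_staff_df[['cola', 'colb']].astype(str).apply('|'.join, axis=1)
temp_df['hash'] = row_strings.map(lambda s: hashlib.md5(s.encode()).hexdigest())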
Another approach: generate a SHA-512 hash of the concatenated column values and put it in a new column. The hashed value goes into a defined destination DataFrame, destinationdf, where the column name starts with Hash_ followed by all the columns in the column list (so the column name would be Hash_IDSalt in this case).
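A minimal sketch of that approach; the source DataFrame and the ['ID', 'Salt'] column list are illustrative, and only destinationdf, the Hash_ prefix and the resulting Hash_IDSalt name come from the description above:

import hashlib
import pandas as pd

sourcedf = pd.DataFrame({'ID': [1, 2], 'Salt': ['x', 'y']})   # illustrative input
columns = ['ID', 'Salt']

destinationdf = sourcedf.copy()
hash_col = 'Hash_' + ''.join(columns)                         # -> Hash_IDSalt
destinationdf[hash_col] = (
    sourcedf[columns]
    .astype(str)
    .apply(''.join, axis=1)                                   # concatenate values per row
    .map(lambda s: hashlib.sha512(s.encode()).hexdigest())    # SHA-512 of the concatenation
)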
Or simply:
df.apply(lambda x: hash(tuple(x)), axis=1)
As an example:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(3,5))
print(df)
df.apply(lambda x: hash(tuple(x)), axis=1)
0 1 2 3 4
0 0.728046 0.542013 0.672425 0.374253 0.718211
1 0.875581 0.512513 0.826147 0.748880 0.835621
2 0.451142 0.178005 0.002384 0.060760 0.098650
0 5024405147753823273
1 -798936807792898628
2 -8745618293760919309
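One caveat worth adding (not part of the answer above): Python's built-in hash() is randomized per interpreter session for strings, so row hashes computed this way are only stable within a single run when the rows contain text. pandas ships a deterministic row hasher, pd.util.hash_pandas_object; a small sketch:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 5))
# One stable uint64 hash per row, independent of the interpreter's hash seed
row_hashes = pd.util.hash_pandas_object(df, index=False)
print(row_hashes)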