How to get a hash code as integer in R?

I want to implement the hashing trick in R.

My code so far:

library(digest)
a<-digest("key_a", algo='xxhash32')
#[1] "4da5b0f8"

This returns the hash code as a character string. Is there any way to turn it into an integer? Or is there another package that can do this?

Xin asked Dec 12 '14 11:12



2 Answers

The output is a hexadecimal (base 16) string. The following function converts it to a decimal number. It was taken from another forum post whose link no longer works (as of 2017).

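# Convert a hex string to its decimal value (returned as a double).
hex_to_int <- function(h) {
  xx <- strsplit(tolower(h), "")[[1L]]            # individual hex digits
  pos <- match(xx, c(0L:9L, letters[1L:6L]))      # 1-based position of each digit
  sum((pos - 1L) * 16^(rev(seq_along(xx) - 1)))   # weight digits by powers of 16
}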

Output

> hex_to_int(a)
[1] 1302704376

A better answer is strtoi: as @Andrie commented and @Gedrox answered, base::strtoi performs the same conversion.

strtoi("4da5b0f8", 16)
[1] 1302704376
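
One caveat, offered as a hedged sketch rather than as part of the original answer: xxhash32 produces unsigned 32-bit values, while R integers are signed 32-bit, and strtoi is documented to return NA for values that overflow. Digests above 7fffffff may therefore fail to convert. The helper below, hex32_to_numeric (a hypothetical name), splits the string into two 16-bit halves that strtoi can always handle and combines them into a double. The final modulo step shows how the value could become a bucket index for the hashing trick; the bucket count of 1024 is an arbitrary assumption.

```r
# Hypothetical helper (not from digest or base R): convert hex strings of up to
# 8 digits to non-negative doubles, avoiding the signed 32-bit integer limit.
hex32_to_numeric <- function(h) {
  stopifnot(is.character(h), all(nchar(h) <= 8L))
  h  <- paste0(strrep("0", 8L - nchar(h)), h)   # left-pad to 8 hex digits
  hi <- strtoi(substr(h, 1L, 4L), base = 16L)   # upper 16 bits
  lo <- strtoi(substr(h, 5L, 8L), base = 16L)   # lower 16 bits
  hi * 65536 + lo                               # double, exact below 2^53
}

value <- hex32_to_numeric("4da5b0f8")   # 1302704376, same as strtoi here
top   <- hex32_to_numeric("ffffffff")   # 4294967295, too large for an R integer

# Hashing trick: map the hash to one of n_buckets columns (1-based for R).
n_buckets <- 1024                       # assumed feature-vector width
bucket <- value %% n_buckets + 1        # 249
```

The result is a double rather than an integer, which is usually fine for indexing and modulo arithmetic in R.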
Atilla Ozgur answered Sep 30 '22 15:09


Since version 0.6.19, digest has a digest2int function, though it offers no choice of algorithm: it always uses Jenkins's one_at_a_time hash.

digest::digest2int("key_a")
#> [1] 1414969953
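
For the hashing trick itself, the integer still has to be mapped to a column index. The following is a minimal sketch, assuming 1-based bucket indices and a hypothetical helper name, bucket_index. Since digest2int returns signed 32-bit integers, the hash may be negative, but R's %% with a positive divisor always gives a non-negative remainder.

```r
# Hypothetical helper: map keys to bucket indices in 1..n_buckets.
# hash_fun is assumed to take one string and return one integer;
# the default is digest::digest2int (requires digest >= 0.6.19).
bucket_index <- function(keys, n_buckets, hash_fun = digest::digest2int) {
  stopifnot(is.character(keys), n_buckets >= 1L)
  h <- vapply(keys, hash_fun, integer(1L), USE.NAMES = FALSE)
  # %% with a positive divisor is non-negative in R, even for negative hashes.
  h %% as.integer(n_buckets) + 1L
}

# Only runs when the digest package is installed.
if (requireNamespace("digest", quietly = TRUE)) {
  print(bucket_index(c("key_a", "key_b"), 1024L))
}
```

Passing the hash function as an argument keeps the sketch usable with other integer hashes, at the cost of one function call per key.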
Aurèle answered Sep 30 '22 16:09