Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate two different strings with the same hashcode

I want to do some tests which require some strings with the same hash code, but not the same strings. I couldn't find any examples, so I decided to write a simple program to do it for me.

The code below generates two random strings over and over until they generate the same hash code.

    static Random r = new Random();     static void Main(string[] args)     {         string str1, str2;         do         {             str1 = GenerateString();             str2 = GenerateString();         } while (str1.GetHashCode() != str2.GetHashCode() && str1 != str2);          Console.WriteLine("{0}\n{1}", str1, str2);     }      static string GenerateString()     {         string s = "";         while (s.Length < 6)         {             s += (char)r.Next(char.MaxValue);         }         return s;     } 

This code seems to be working (theoretically), but it may take centuries to complete. So I was thinking of doing vice versa and generate two strings from one hash code.

I know it's not possible to retrieve a string from a hash code, but is it possible to generate possible strings from it?

I'm using Visual Studio 2015 Community Edition. Version: 14.0.23107.0D14REL.

.NET Framework: 4.6.00081.

like image 942
M.kazem Akhgary Avatar asked Aug 15 '15 17:08

M.kazem Akhgary


People also ask

Can two different strings have the same hashCode?

Yes, it is possible for two Strings to have the same hashcode - If you take a look at the Wikipedia article, you will see that both "FB" and "Ea" have the same hashcode. There is nothing in the method contract saying a hashCode() should be used to compare for equality, you want to use equals() for that.

Can two values have same hashCode?

It is perfectly legal for two objects to have the same hashcode. If two objects are equal (using the equals() method) then they have the same hashcode. If two objects are not equal then they cannot have the same hashcode.

What happens if two keys have same hashCode?

When two unequal objects have the same hash value, this causes a collision in the hash table, because both objects want to be in the same slot (sometimes called a bucket).

Do same strings have same hashCode?

You can instrument the java. lang. String class so its method hashCode() will always return the same number.

What happens when two strings have the same hashCode?

When two strings have the same hashcode, it’s called a hashcode collision. There are many instances where the hash code collision will happen. For example, “Aa” and “BB” have the same hash code value 2112.

How to calculate string hash code in Java?

The string hash code calculation follows the below logic. int hashcode = s [0]*31^ (n-1) + s [1]*31^ (n-2) + ... + s [n-1]; Here s [i] is the character at i th index. n is the length of the string. The hash code of an empty string is 0. Let’s have a look at some examples of String hashCode () method. public static void main (String [] args) {

How to hash a sequence of strings?

To hash a sequence of strings unambiguously, so that any two different sequences yield a different hash, you could prepend every string with its length, e.g. by using decimal representation followed by a separating character (e.g. hyphen). For the strings "abc" and "de" and "" (empty string) and "f" this would look so:

How do I concatenate two strings with the same hash?

Get the pair ("; system ('bin/sh'); #", "1") signed as harmless, then present ("", "; system ('bin/sh'); #1") which has the same hash and therefore the same signature. One way to unambiguously denote the concatenation of strings is to encode strings and add delimiters (quotes), e.g. replace all \ and " by \ and " and surround each string by "…".


Video Answer


2 Answers

Finding two strings by repeatedly comparing random strings will take practically forever. Instead generate strings and store them in an a dictionary by hashcode. Then look up each. Match found pretty quickly.

MATCH FOUND!! xqzrbn and krumld hash code 80425224

void Main() {      var lookup = new Dictionary<int,string>();      while(true) {         var s = RandomString();                 var h = s.GetHashCode();         string s2;         if (lookup.TryGetValue(h, out s2) && s2 != s) {             Console.WriteLine("MATCH FOUND!! {0} and {1} hash code {2}",                 lookup[h],                 s,                 h);             break;         }         lookup[h] = s;          if (lookup.Count % 1000 == 0) {             Console.WriteLine(lookup.Count);         }     } }  static Random r = new Random();  // Define other methods and classes here static string RandomString() {      var s = ((char)r.Next((int)'a',((int)'z')+1)).ToString() +             ((char)r.Next((int)'a',((int)'z')+1)).ToString() +             ((char)r.Next((int)'a',((int)'z')+1)).ToString() +             ((char)r.Next((int)'a',((int)'z')+1)).ToString() +             ((char)r.Next((int)'a',((int)'z')+1)).ToString() +             ((char)r.Next((int)'a',((int)'z')+1)).ToString();      return s; } 
like image 111
Samuel Neff Avatar answered Sep 17 '22 12:09

Samuel Neff


Take advantage of the Birthday Paradox. Instead of only testing two strings directly, test all strings you have seen before.

using System; using System.Collections.Generic;  namespace ConsoleApplication2 {     class Program     {         static void Main(string[] args)         {             var words = new Dictionary<int, string>();              int i = 0;             string teststring;             while (true)             {                 i++;                 teststring = i.ToString();                 try                 {                     words.Add(teststring.GetHashCode(), teststring);                 }                 catch (Exception)                 {                     break;                 }             }              var collisionHash = teststring.GetHashCode();             Console.WriteLine("\"{0}\" and \"{1}\" have the same hash code {2}", words[collisionHash], teststring, collisionHash);             Console.ReadLine();         }     } } 

For my machine it produces the output

"699391" and "1241308" have the same hash code -1612916492

almost instantly.

Due to how strings are hashed in .NET you may not get the exact same output as me, but it should be just as fast.

like image 26
Scott Chamberlain Avatar answered Sep 17 '22 12:09

Scott Chamberlain