The excerpt below is from an article that explains the possibility of a Denial of Service (DoS) attack caused by non-random hash functions used in hash data structures.
[…] the condition can be leveraged by exploiting predictable collisions in the underlying hashing algorithms.
To verify this, I went through the reference implementation of Java's HashMap from Oracle and indeed found a static hash function being used:
static int hash(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
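For example, since this supplemental hash is a pure function of the key's hashCode(), any two keys that collide on hashCode() will land in the same bucket on every JVM and every run. A quick sketch (the class name is just for illustration):

public class PredictableHashDemo {
    // Copy of the supplemental hash quoted above (Java 6/7 HashMap).
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share a hashCode(), so the supplemental hash cannot
        // separate them -- the collision reproduces everywhere, every time.
        System.out.println(hash("Aa".hashCode()) == hash("BB".hashCode())); // true
    }
}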
Another paper on the topic states:
A Tomcat 6.0.32 server parses a 2 MB string of colliding keys in about 44 minutes of i7 CPU time, so an attacker with about 6 kbit/s can keep one i7 core constantly busy. If the attacker has a Gigabit connection, he can keep about 100,000 i7 cores busy.
How can we safeguard against this vulnerability, especially when so much of the software we use (Tomcat etc.) is open source and relies on this implementation?
Indeed, the SHA-1 hash algorithm has become less secure over time due to weaknesses found in the algorithm, increased processor performance, and the advent of cloud computing. A hash function attack is an attempt to find two input strings of a hash function that produce the same hash result.
Cryptographers distinguish between three different kinds of attacks on hash functions:
collision attack: try to find any two different messages m1 and m2 such that hash(m1) = hash(m2).
preimage attack: given only a hash value H, try to recover any message M such that H = hash(M).
second-preimage attack: given a message m1, try to find a different message m2 such that hash(m1) = hash(m2).
Each hash code will map to a specific "bucket". Each bucket contains a linked list for the case of collisions. The only way to avoid (or rather minimize) collisions is to create a hash function that creates the best possible distribution of values throughout the HashMap.
A collision attack starts with an initial input value and its hash. The attacker then needs to find a collision: a different input that produces the same hash as the first. This is generally done by brute force (trying candidate inputs) until a match is found.
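As a rough sketch (the class name and the loop bounds are only illustrative assumptions), a brute-force collision search against Java's String.hashCode() can simply enumerate short strings until two different ones hash to the same value:

import java.util.HashMap;
import java.util.Map;

public class BruteForceCollisionSearch {
    public static void main(String[] args) {
        Map<Integer, String> seen = new HashMap<Integer, String>();
        // Enumerate all two-character strings over printable ASCII and stop at
        // the first pair that shares a hash code.
        for (char c1 = ' '; c1 <= '~'; c1++) {
            for (char c2 = ' '; c2 <= '~'; c2++) {
                String candidate = "" + c1 + c2;
                int h = candidate.hashCode();
                if (seen.containsKey(h) && !seen.get(h).equals(candidate)) {
                    System.out.println("\"" + seen.get(h) + "\" and \"" + candidate
                            + "\" both hash to " + h);
                    return;
                }
                seen.put(h, candidate);
            }
        }
    }
}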
Say a comment form on a blog accepts the parameters first_name, last_name and comment as POST parameters. Internally, Tomcat stores these parameters as a HashMap.
The logical structure of this HashMap is like this -
"first_name" --> "Sripathi" "last_name" --> "Krishnan" "comment" ---> "DoS using poor Hashes"
But the physical structure is different. The keys are first converted into a hashCode, and then the hashCode is converted into an array index.
The ideal physical structure thus becomes -
0 --> "Sripathi" 1 --> "Krishnan" 2 --> "DoS using poor Hashes"
But the possible keys are infinite. So at some point, two keys will have the same hash code. This becomes a hash collision.
With collisions, the physical structure becomes:
0 --> "Sripathi", "Krishnan"
1 --> Empty
2 --> "DoS using poor hashes"
When you have hash collisions, inserting a new entry means iterating over all the elements in a single hash "bucket" sequentially just to find out if it already exists in the map. Inserting one element can approach O(n) complexity if all elements hash to the same value. Inserting n elements in this worst case makes it O(n*n) complexity.
In short: if you insert thousands of keys that have the same hashCode, the server will require a lot of CPU cycles.
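To see the cost concretely, here is a small sketch that counts how many equality checks it takes to insert n keys when every key collides. It uses a hand-rolled single bucket standing in for a pre-Java-8 HashMap chain, since Java 8+ mitigates very long chains by converting them to trees:

import java.util.ArrayList;
import java.util.List;

public class CollisionCostSketch {
    public static void main(String[] args) {
        int n = 20000;
        List<String> bucket = new ArrayList<String>(); // one bucket: every key collides
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            String key = "key" + i;
            // Like put() on a fully colliding bucket: walk the whole chain
            // to check whether the key is already present.
            for (String existing : bucket) {
                comparisons++;
                if (existing.equals(key)) {
                    break;
                }
            }
            bucket.add(key);
        }
        System.out.println("Inserting " + n + " colliding keys took "
                + comparisons + " comparisons (~n^2/2)");
    }
}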
In Java, "Aa" and "BB" have the same hash code.
Because of a property called "equivalent substrings", we can generate many other strings with the same hash code just by starting with these two strings: String.hashCode() is a positional sum of character values weighted by powers of 31, so replacing "Aa" with "BB" (or vice versa) at any aligned position leaves the total unchanged.
First iteration: "AaAa", "AaBB", "BBAa", "BBBB" all have the same hash code.
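You can verify this with nothing but the JDK (the values in the comments are what String.hashCode() actually returns):

public class EquivalentSubstringCheck {
    public static void main(String[] args) {
        System.out.println("Aa".hashCode());    // 2112
        System.out.println("BB".hashCode());    // 2112
        System.out.println("AaAa".hashCode());  // 2031744
        System.out.println("AaBB".hashCode());  // 2031744
        System.out.println("BBAa".hashCode());  // 2031744
        System.out.println("BBBB".hashCode());  // 2031744
    }
}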
Now we have 4 strings with the same hash code. We can concatenate them pairwise to generate 16 strings that will also share a hash code. For example:
"AaAaAaAa", "AaAaBBBB", "AaAaAaBB", "AaAaBBAa", "BBBBAaAa", "BBBBBBBB", "BBBBAaBB", "BBBBBBAa", "AaBBAaAa", "AaBBBBBB", "AaBBAaBB", "AaBBBBAa", "BBAaAaAa", "BBAaBBBB", "BBAaAaBB", "BBAaBBAa",
All these 16 strings have the same hash code.
You can now take these 16 strings and generate 256 strings that have the same hash code.
In short: it is very easy to generate a large set of strings that all have the exact same hash code.
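A minimal sketch of such a generator (the class and method names are mine, not from the article): every concatenation of blocks drawn from {"Aa", "BB"} shares one hash code, so n blocks yield 2^n colliding keys.

import java.util.ArrayList;
import java.util.List;

public class CollidingKeyGenerator {
    // Build every string made of `blocks` two-character blocks from {"Aa", "BB"}.
    static List<String> collidingKeys(int blocks) {
        List<String> keys = new ArrayList<String>();
        keys.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<String>();
            for (String prefix : keys) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = collidingKeys(10); // 2^10 = 1024 keys
        System.out.println(keys.size() + " keys, every one with hash code "
                + keys.get(0).hashCode());
    }
}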
Because this is just a POST request, an attacker can also use innocent browsers to attack a server. Just find a website with a cross-site scripting vulnerability, embed code that makes a POST request, and then use social engineering to spread the link to as many users as you can.
In general, the underlying platform cannot fix this. It is considered an application framework problem. In other words, Tomcat has to fix it, not Oracle/Sun.
Possible fixes include:
Restrict the number of POST parameters - Tomcat 6.0.35+ has a new parameter, maxParameterCount. The default value is 10,000. The lower, the better, as long as it does not break your functionality (see the sample Connector configuration after this list).
Restrict the size of the POST request - For the attack to work, the payload has to be huge. The maximum POST size Tomcat allows by default is 2 MB. Reducing it to, say, 200 KB will reduce the effectiveness of this attack. The parameter in Tomcat is maxPostSize.
Web Application Firewall - If you have a web application firewall, you can configure it to block requests that look suspicious. This is a reactive measure, but is nice to have in case you cannot use one of the above solutions.
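For reference, here is a sketch of what the first two fixes might look like on the HTTP Connector in server.xml. maxParameterCount and maxPostSize are documented Connector attributes; the values below are only examples and should be tuned to your application.

<!-- Example only: tighten parameter count and POST size limits. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           maxParameterCount="1000"
           maxPostSize="204800" />
<!-- maxPostSize is in bytes: 204800 = 200 KB -->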
FYI - Tomcat's documentation is here - http://tomcat.apache.org/tomcat-6.0-doc/config/http.html