I'm writing a malware detector based on signature scanning. As I understand it, the main idea is to compare the signature of the scanned file against the signatures in your blacklist. Here I found that a signature is some kind of MD5 hash, but how can I get it from a file? And are there any other types of signatures?
An MD5 hash is a digest of a file's contents. If you have a blacklist of MD5 hashes, yes, you could compare a file against them. I think this is a pretty simplistic and fragile way to scan for malware, but it's definitely a start. It is fragile because comparing MD5 hashes only recognizes files that are byte-for-byte identical. Any change introduced into a malicious file, even a single random byte, produces a different hash and renders this method of scanning useless.
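To make the fragility concrete, here is a minimal Python sketch. The byte strings are made-up stand-ins for file contents, not real samples, and the one-entry blacklist is purely illustrative.

```python
import hashlib

# Hypothetical stand-ins for the contents of a known-bad file and a
# trivially altered copy of it (one padding byte appended).
original = b"example payload bytes"
modified = original + b"\x00"

digest_original = hashlib.md5(original).hexdigest()
digest_modified = hashlib.md5(modified).hexdigest()

# Illustrative blacklist containing only the original's hash.
blacklist = {digest_original}

print(digest_original in blacklist)  # True
print(digest_modified in blacklist)  # False
```

A single appended byte is enough to slip past the lookup, which is why a plain hash blacklist only catches exact copies.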
Most languages have a standard way to generate an MD5 hash. C#, VB and VC++ (via .NET) can use the MD5 class in System.Security.Cryptography, PHP has the hash function (using "md5" as the first argument) as well as md5_file for hashing a file directly, in Java you have MessageDigest, etc.
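As one illustration, this is roughly how it could look in Python with the standard hashlib module. The function names md5_of_file and is_blacklisted are my own, and the blacklist is assumed to be a set of lowercase hex digests.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at `path`.

    The file is read in chunks so that large files do not have to fit
    in memory all at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_blacklisted(path, blacklist):
    """Check a file's MD5 against a set of lowercase hex digests."""
    return md5_of_file(path) in blacklist
```

Usage would be something like is_blacklisted("suspect.bin", known_bad_hashes), where both the file name and the set are placeholders for your own data.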
There are a number of other hashing algorithms that could be used for this purpose. MD5 has been shown to be lacking for applications that depend on collision resistance, and as such the SHA algorithms have become the standard for those applications. In this application, however, you would not expect an attacker to try to create a match with a blacklisted hash; the attacker wants the opposite, to prevent a match. So any standard hash algorithm, including MD5, should be adequate to give reasonable assurance that no false positives will be seen.
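If you want to leave the choice of algorithm open, a sketch along these lines lets you switch between MD5 and a SHA variant by name. It uses Python's hashlib.new; the function name file_digest and the default of "sha256" are my own choices for illustration.

```python
import hashlib

def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file using the named hash algorithm.

    `algorithm` is any name accepted by hashlib.new, such as "md5",
    "sha1" or "sha256".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Whichever algorithm you pick, the blacklist has to be built with the same one, since digests from different algorithms are not comparable.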