
Can I prevent duplicate content using md5?

Tags:

hash

md5

I would like to prevent duplicate content. I do not want to keep copies of the content, so I decided to keep just the MD5 signatures.

I read that MD5 collisions do happen: different content can produce the same MD5 signature.

Do you think MD5 is enough?

Should I use MD5 and SHA-1 together?

Alex L asked Dec 10 '22 20:12


2 Answers

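People have been able to deliberately construct MD5 collisions, but two unrelated pieces of content colliding by accident is astronomically unlikely. For preventing duplicate content (in the absence of malicious users) MD5 is more than adequate.

Having said that, if you can use a longer hash such as SHA-1 or, better, SHA-2, you should. The larger digest makes accidental collisions even less likely, and SHA-2 has no known practical collision attacks, whereas deliberate collisions have been demonstrated for both MD5 and SHA-1.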
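To put a rough number on "more than adequate", here is a minimal sketch of the standard birthday-bound approximation for accidental collisions. It assumes the hash behaves like a uniform random function and that nobody is crafting inputs on purpose; the function name `collision_probability` and the figure of one billion items are just illustrative choices.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Approximate chance of at least one accidental collision (birthday bound)."""
    # p ~= 1 - exp(-n(n-1) / 2^(bits+1)); expm1 keeps precision when p is tiny.
    exponent = -num_items * (num_items - 1) / 2.0 ** (hash_bits + 1)
    return -math.expm1(exponent)


# Compare digest sizes for a hypothetical store of one billion documents.
for name, bits in [("MD5", 128), ("SHA-1", 160), ("SHA-256", 256)]:
    print(f"{name}: {collision_probability(10**9, bits):.3e}")
```

For a billion items the MD5 estimate comes out on the order of 10^-21, and each longer digest shrinks it further. The estimate says nothing about deliberate attacks, which is where MD5 and SHA-1 actually fall short.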

RichieHindle answered Jan 29 '23 02:01


MD5 should be fine; accidental collisions are very rare. If you're really worried, you can use SHA-1 as well.

The signatures aren't large (16 bytes for MD5, 20 bytes for SHA-1), so if you have the spare processing cycles and the disk space, you could store both. But if space or speed is limited, I'd just go with one.
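A minimal sketch of the "do both" option, using Python's standard `hashlib`. The class name `DuplicateFilter` and the in-memory `set` are assumptions for illustration; a real system would probably keep the signatures in a database column with a unique index.

```python
import hashlib


class DuplicateFilter:
    """Remembers only digests of content seen so far, never the content itself."""

    def __init__(self, algorithms=("md5", "sha1")):
        self.algorithms = algorithms
        self.seen = set()

    def _signature(self, content: bytes) -> bytes:
        # Concatenate the digests: a false duplicate would need the same pair
        # of inputs to collide in every listed hash at once.
        return b"".join(
            hashlib.new(name, content).digest() for name in self.algorithms
        )

    def is_duplicate(self, content: bytes) -> bool:
        """Return True if identical content was seen before; otherwise record it."""
        signature = self._signature(content)
        if signature in self.seen:
            return True
        self.seen.add(signature)
        return False


dedup = DuplicateFilter()
print(dedup.is_duplicate(b"first post"))   # new content, gets recorded
print(dedup.is_duplicate(b"first post"))   # same bytes again, flagged
```

With both hashes each stored signature is 36 bytes. Passing `algorithms=("md5",)` gives the single-hash variant at 16 bytes per item, and `("sha256",)` is a reasonable single-hash alternative if deliberate collisions are a concern.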

samoz answered Jan 29 '23 03:01