Does anyone know of any tools for comparing two almost identical websites?
Simply put, I have a sandbox site and a production site, and I would like to find the differences between them so I know what content to move to the production site.
Thanks!
Edit:
OK, I see I missed a critical piece of information (sorry!). Both sites are online and based on a CMS (Drupal), so I need a crawling tool that runs over the two sites and shows which pages are present in the sandbox but not on production.
Thanks to everyone who answered regardless!
Use the diff command to compare text files. It can compare single files or the contents of directories (add -r to recurse into subdirectories). Run on two regular files, or on same-named text files in two directories, diff reports which lines must be changed to make the files match.
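If you would rather script the comparison, here is a minimal sketch of the same idea using Python's standard difflib module instead of the diff command itself. The function name and the "production"/"sandbox" labels are just illustrative assumptions, not anything from your setup.

```python
import difflib


def changed_lines(old_text: str, new_text: str,
                  old_name: str = "production",
                  new_name: str = "sandbox") -> list[str]:
    """Return a unified diff (as a list of lines) between two text blobs."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    # unified_diff yields nothing when the two inputs are identical
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name))
```

Lines starting with "-" exist only in the first text, and lines starting with "+" exist only in the second. An empty result means the two texts match.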
Use httrack to pull down a copy of each site, and then use your favourite file comparison tool to review the differences. (I prefer WinMerge, which can recursively run through two folders of files, has options to ignore whitespace differences and blank lines, and even runs well under Linux using WINE.)
P.S. You might even want to run your downloaded HTML files through HTML Tidy to normalise and pretty-print them before doing the comparison.
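To answer the "which pages exist in the sandbox but not on production" part directly, once both sites are mirrored to local folders you could list the files that appear in only one tree. This is a rough sketch using Python's standard filecmp module rather than WinMerge. The function name is hypothetical, and it assumes both mirrors end up with the same folder layout.

```python
import filecmp
import os


def only_in_left(left: str, right: str, prefix: str = "") -> list[str]:
    """List paths (relative to the roots) found under `left` but not `right`."""
    cmp = filecmp.dircmp(left, right)
    # Files and folders that exist only on the left side at this level
    missing = [os.path.join(prefix, name) for name in sorted(cmp.left_only)]
    # Descend into folders that exist on both sides
    for sub in sorted(cmp.common_dirs):
        missing += only_in_left(os.path.join(left, sub),
                                os.path.join(right, sub),
                                os.path.join(prefix, sub))
    return missing
```

Calling only_in_left("sandbox_mirror", "production_mirror") with your two mirror folders (placeholder names here) would give the candidate pages to move. A folder that is missing entirely on the right is reported once, by its own name, rather than file by file.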
The other way to do it would be a database comparison, although you would still need to do a file comparison of the raw website files (not the spidered version) as well. From memory, the schema for a Drupal database isn't too hard to follow, particularly if you are primarily interested in node content.
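As a rough illustration of the database route, suppose you have already exported a list of (type, title) pairs for the nodes on each site. The exact table and column names depend on your Drupal version, so the export query is left out, and the function name below is made up for the example. Comparing by type and title rather than by node ID is an assumption on my part, since the IDs may not line up between two separately edited sites.

```python
def nodes_missing_on_production(sandbox_rows, production_rows):
    """Return (type, title) pairs present in sandbox_rows but not in production_rows."""
    production = set(production_rows)
    # Deduplicate the sandbox rows, then keep those production does not have
    return sorted(row for row in set(sandbox_rows) if row not in production)
```

For example, feeding it [("page", "About"), ("story", "New launch")] for the sandbox and [("page", "About")] for production returns [("story", "New launch")]. Note that a retitled node would show up as missing, so treat the result as a list of candidates to check by hand.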