
Mercurial and Word or PDF documents

Is it possible to use Mercurial version control to track Word or PDF files? Are there any limitations or problems?

andrew0007 asked Mar 17 '10 11:03



3 Answers

Yes.

You will be able to do meaningful diffs for MS Word documents.

  • If you have TortoiseHg installed and you have set up a repository, right-click the file for which you want to check the diffs.

  • On the context menu, click TortoiseHg > Visual Diffs.

  • In the Visual Diffs dialog, select docdiff, instead of kdiff3.

  • Double-click the file in the Visual Diffs dialog.

MS Word will open a Compare Result Word document, which will show the differences between the current version of the document and the previous version as Tracked Changes.

Geoffrey answered Sep 27 '22 13:09


Yes, but Mercurial's built-in text diff won't show the changes in any meaningful way, and the files will be treated as binary during merges.

Mercurial is perfectly capable of tracking binary files:

Mercurial generally makes no assumptions about file contents. Thus, most things in Mercurial work fine with any type of file.
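As a rough illustration of what "binary" means here: as far as I know, Mercurial's diff and merge code treats content as binary when it contains a NUL byte. The sketch below mimics that kind of check. The function name `looks_binary` and the sample byte strings are hypothetical, not part of any Mercurial API.

```python
def looks_binary(data: bytes) -> bool:
    """Heuristic sketch: treat content as binary if it contains a NUL byte."""
    return b"\0" in data


# Hypothetical sample contents, for illustration only.
text_sample = b"Plain text notes about the project.\n"
binary_sample = b"PK\x03\x04\x14\x00\x06\x00"  # bytes resembling the start of a zip archive

print(looks_binary(text_sample))
print(looks_binary(binary_sample))
```

Files that fail such a check are still tracked normally; they just cannot be diffed or merged line by line.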

Mercurial stores a binary diff regardless of the file type. The problem with PDF and Word files is that a small change to the document usually causes a large difference in its binary representation on disk. .docx documents, for example, are stored as zipped XML, and because of the compression even a tiny edit can make the archive look completely different.
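To get a feel for the compression effect, here is a minimal sketch using Python's `zlib`. The XML-like content is a made-up stand-in for what sits inside a .docx archive, and `differing_positions` is a hypothetical helper. It changes one character in the uncompressed text and counts how many bytes differ before and after compression. The exact numbers depend on the input, so none are quoted here.

```python
import zlib


def differing_positions(a: bytes, b: bytes) -> int:
    """Count byte positions where a and b differ, plus any difference in length."""
    shared = min(len(a), len(b))
    mismatches = sum(1 for i in range(shared) if a[i] != b[i])
    return mismatches + abs(len(a) - len(b))


# Hypothetical stand-in for the XML stored inside a .docx archive.
paragraphs = "".join(
    f"<w:p><w:t>Paragraph number {i}</w:t></w:p>" for i in range(200)
)
original = f"<w:document>{paragraphs}</w:document>".encode("utf-8")

# Change a single character in one early paragraph.
edited = original.replace(b"Paragraph number 3<", b"Paragraph number X<", 1)

packed_original = zlib.compress(original)
packed_edited = zlib.compress(edited)

raw_changes = differing_positions(original, edited)
packed_changes = differing_positions(packed_original, packed_edited)

print(f"uncompressed: {raw_changes} of {len(original)} bytes differ")
print(f"compressed:   {packed_changes} of {len(packed_original)} bytes differ")
```

The edit touches one byte of the uncompressed text, while the compressed streams will typically diverge from the point of the edit onward. That is why deltas between revisions of such files tend to be large.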

As long as your repository doesn't grow too large, you probably won't experience any issues using Mercurial.

Johannes Rudolph answered Sep 26 '22 13:09


Beware the suggested extdiff configuration:

cmd.pdfdiff = [\path\to\diffpdf.exe]
opts.pdfdiff= -a $local $other

$local and $other have no meaning in the extdiff context. The literal strings "$local" and "$other", not the file names, will be passed to "diffpdf.exe". I found this out the hard way. Instead,

cmd.pdfdiff = [\path\to\diffpdf.exe]
opts.pdfdiff= -a

will work, and the two files will be passed as parameters following the "-a". Cf. https://www.mercurial-scm.org/wiki/ExtdiffExtension, where it is stated:

Each custom diff commands can have two parts: a 'cmd' and an 'opts' part. The cmd.xxx option defines the name of an executable program that will be run, and opts.xxx defines a set of command-line options which will be inserted to the command between the program name and the files/directories to diff
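The ordering described in that quote can be sketched in a few lines of Python. This is only an illustration of "program, then options, then files", not Mercurial's actual implementation. `build_diff_argv` and the file names are hypothetical.

```python
import shlex
from typing import List


def build_diff_argv(cmd: str, opts: str, old_file: str, new_file: str) -> List[str]:
    """Assemble the argument list: program name, configured options, then the two files."""
    return [cmd] + shlex.split(opts) + [old_file, new_file]


# Options only: the two files follow "-a", as intended.
good = build_diff_argv("diffpdf.exe", "-a", "old.pdf", "new.pdf")

# "$local" and "$other" are not expanded here, so they end up as literal arguments.
bad = build_diff_argv("diffpdf.exe", "-a $local $other", "old.pdf", "new.pdf")

print(good)
print(bad)
```

In the second case the diff tool receives four arguments after "-a", the first two being the unexpanded placeholder strings, which matches the behaviour described above.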

Donald Locker answered Sep 27 '22 13:09