I am currently playing around with parsing diff files, and have yet to come across solid documentation on the diff file format.
I am especially interested in specifications. E.g. I don't really understand the lines that look like this (at the beginning of each changed code block):
@@ -296,7 +296,8 @@
I know they have to do with line numbers and how many lines have changed, but I haven't been able to figure out the details so far.
What is the syntax of diff output files (at least, the main parts)?
Check out the documentation for GNU diffutils. There you'll find this section:
Next come one or more hunks of differences; each hunk shows one area where the files differ. Unified format hunks look like this:
@@ from-file-line-numbers to-file-line-numbers @@
 line-from-either-file
 line-from-either-file...
If a hunk contains just one line, only its start line number appears. Otherwise its line numbers look like ‘start,count’. An empty hunk is considered to start at the line that follows the hunk.
If a hunk and its context contain two or more lines, its line numbers look like ‘start,count’. Otherwise only its end line number appears. An empty hunk is considered to end at the line that precedes the hunk.
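To make the 'start,count' rule concrete, here is a minimal Python sketch that parses a hunk header such as the one in the question. The regular expression and the function name `parse_hunk_header` are my own, not part of diffutils, and the sketch assumes a missing count means 1, as the quoted text describes.

```python
import re

# Matches "@@ -start[,count] +start[,count] @@"; each count is optional.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count), or None if no match."""
    m = HUNK_HEADER.match(line)
    if m is None:
        return None
    old_start, old_count, new_start, new_count = m.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,  # omitted count means 1
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


print(parse_hunk_header("@@ -296,7 +296,8 @@"))  # (296, 7, 296, 8)
```

So `@@ -296,7 +296,8 @@` reads as: the hunk covers 7 lines starting at line 296 of the old file, and 8 lines starting at line 296 of the new file.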
The lines common to both files begin with a space character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:
‘+’ A line was added here to the first file.
‘-’ A line was removed here from the first file.
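The indicator characters also explain where the two counts in a hunk header come from. A rough Python sketch (the function name `count_hunk_lines` and the sample hunk are made up for illustration) that tallies the body lines of one hunk:

```python
def count_hunk_lines(body_lines):
    """Return (old_count, new_count) for the body lines of one unified hunk."""
    context = added = removed = 0
    for line in body_lines:
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
        elif line.startswith(" "):
            context += 1
        # Other lines (e.g. "\ No newline at end of file") are ignored here.
    # Context lines appear in both files; '-' lines only in the old file,
    # '+' lines only in the new file.
    return context + removed, context + added


hunk = [
    " unchanged line",
    "-old line",
    "+new line",
    "+another new line",
    " unchanged line",
]
print(count_hunk_lines(hunk))  # (3, 4)
```

By the same arithmetic, a header like `-296,7 +296,8` means the hunk adds one more line than it removes.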
The Wikipedia page on the diff utility describes the format pretty well.