How can I find lines in one file but not the other using bash scripting?

Question

Imagine file 1:

#include "first.h"
#include "second.h"
#include "third.h"

// more code here
...

Imagine file 2:

#include "fifth.h"
#include "second.h"
#include "eigth.h"

// more code here
...

I want to get only the headers that are included in file 2 but not in file 1. So, when run against file 1 and file 2, the command should produce:

#include "fifth.h"
#include "eigth.h"

I know how to do it in Perl/Python/Ruby, but I'd like to accomplish this without using a different programming language.

glenn jackman · Accepted Answer

This is a one-liner, but it does not preserve the original line order, because comm needs sorted input:

comm -13 <(grep '#include' file1 | sort) <(grep '#include' file2 | sort)

If you need to preserve the order:

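awk '
  # Skip every line that is not an #include directive
  !/#include/ {next}
  # First file: remember each included header name
  FILENAME == ARGV[1] {include[$2]=1; next}
  # Second file: print includes whose header was not seen in the first file
  !($2 in include)
' file1 file2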
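
To see how the two variants differ in ordering, here is a small sketch that recreates the sample files from the question in a scratch directory and runs both commands. The scratch directory and the variable names (workdir, sorted_result, ordered_result) are only for illustration and are not part of the answer.

```shell
#!/usr/bin/env bash
# Recreate the two sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '#include "first.h"' '#include "second.h"' '#include "third.h"' \
  '' '// more code here' > "$workdir/file1"
printf '%s\n' '#include "fifth.h"' '#include "second.h"' '#include "eigth.h"' \
  '' '// more code here' > "$workdir/file2"

# Sorted variant: the result comes back in sort order, not file order.
sorted_result=$(comm -13 <(grep '#include' "$workdir/file1" | sort) \
                         <(grep '#include' "$workdir/file2" | sort))

# Order-preserving variant: the result follows the line order of file2.
ordered_result=$(awk '
  !/#include/ {next}
  FILENAME == ARGV[1] {include[$2]=1; next}
  !($2 in include)
' "$workdir/file1" "$workdir/file2")

printf 'comm:\n%s\n\nawk:\n%s\n' "$sorted_result" "$ordered_result"
rm -rf "$workdir"
```

With these inputs, comm lists eigth.h before fifth.h, while awk keeps fifth.h first, as it appears in file 2.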

Frank Schmitt · Answer

If it's ok to use a temp file, try this:

grep include file1.h > /tmp/x && grep -f /tmp/x -v file2.h | grep include

This:

1. extracts all includes from file1.h and writes them to the file /tmp/x
2. uses this file to get all lines from file2.h that are not contained in this list
3. extracts all includes from the remainder of file2.h

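It probably doesn't handle differences in whitespace and similar variations correctly, though. Note also that grep -f treats each line of /tmp/x as a regular expression, so the . in a header name matches any character.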

EDIT: to prevent false positives, use a different pattern for the last grep (thanks to jw013 for mentioning this):

grep include file1.h > /tmp/x && grep -f /tmp/x -v file2.h | grep "^#include"
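
As a rough sketch of the same temp-file idea, the function below uses mktemp instead of a fixed /tmp/x, so concurrent runs do not overwrite each other's pattern file. The function name includes_only_in_second is hypothetical, and the -F (fixed strings) and -x (whole-line match) flags are my additions rather than part of the answer above; they are meant to stop the . in a header name from acting as a regex wildcard.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print #include lines found in $2 but not in $1.
includes_only_in_second() {
  local patterns
  patterns=$(mktemp) || return 1
  # Collect the #include lines of the first file as the pattern list.
  grep '^#include' "$1" > "$patterns"
  # -F: patterns are fixed strings, -x: match whole lines only, -v: invert.
  grep -F -x -v -f "$patterns" "$2" | grep '^#include'
  rm -f "$patterns"
}
```

It would be called as includes_only_in_second file1.h file2.h. Because of -x, lines that differ only in whitespace still count as different, so the whitespace caveat above applies here too.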

tripleee · Answer

This variant requires an fgrep with the -f option. GNU grep (i.e. any Linux system, and then some) should work fine.

# Find occurrences of '#include' in file1.h
fgrep '#include' file1.h |
# Remove any identical lines from file2.h
fgrep -vxf - file2.h |
# Result is all lines not present in file1.h.  Out of those, extract #includes
fgrep '#include'

This does not require any sorting, nor any explicit temporary files. In theory, fgrep -f could use a temporary file behind the scenes, but I believe GNU fgrep doesn't.
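
Newer GNU grep releases warn that fgrep is obsolescent, so the same pipeline can be sketched with grep -F, which POSIX specifies. The wrapper name new_includes is hypothetical. As in the pipeline above, this assumes a grep that reads the pattern list from standard input when given -f - (GNU grep does).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print #include lines of $2 that do not occur in $1.
new_includes() {
  # Find occurrences of '#include' in the first file
  grep -F '#include' "$1" |
    # Remove any identical whole lines from the second file
    grep -F -v -x -f - "$2" |
    # Out of the remaining lines, keep only the #includes
    grep -F '#include'
}
```

It would be called as new_includes file1.h file2.h and prints the result in the order the lines appear in file2.h.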
