Unix command to find string set intersections or outliers?

Is there a UNIX command on par with

sort | uniq

to find string set intersections or "outliers"?

An example application: I have a list of HTML templates. Some of them contain the string {% load i18n %}, others don't. I want to know which files don't.

Edit: grep -L solves the above problem.

How about this:

file1:

mom
dad
bob

file2:

dad

%intersect file1 file2

dad

%left-unique file1 file2

mom
bob

asked Jun 19 '09 03:06

Evgeny

1 Answer

It appears that grep -L solves the poster's real problem, but for the question actually asked, finding the intersection of two sets of strings, you might want to look into the comm command. For example, if file1 and file2 each contain a sorted list of words, one word per line, then

$ comm -12 file1 file2

will produce the words common to both files. More generally, given sorted input files file1 and file2, the command

$ comm file1 file2

produces three columns of output:

1. lines only in file1
2. lines only in file2
3. lines in both file1 and file2

You can suppress column N of the output with the -N option. So the command above, comm -12 file1 file2, suppresses columns 1 and 2, leaving only the words common to both files.
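
As a rough sketch of how this could be applied to the example in the question: the file1 there is not sorted, so the snippet below sorts both files first. The scratch directory and the variable names (tmpdir, common, left_only) are only illustrative assumptions, and LC_ALL=C is set so that sort and comm agree on the ordering.

```shell
# Work in a scratch directory so no existing files are touched.
tmpdir=$(mktemp -d)
printf '%s\n' mom dad bob > "$tmpdir/file1"
printf '%s\n' dad > "$tmpdir/file2"

# comm expects sorted input, so sort each file first.
# LC_ALL=C keeps sort and comm on the same byte-wise collation order.
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Intersection: suppress columns 1 and 2, keeping lines common to both files.
common=$(LC_ALL=C comm -12 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

# "Left-unique": suppress columns 2 and 3, keeping lines found only in file1.
left_only=$(LC_ALL=C comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

printf 'common:\n%s\n' "$common"
printf 'only in file1:\n%s\n' "$left_only"

# Clean up the scratch directory.
rm -r "$tmpdir"
```

Because the input is sorted first, the lines unique to file1 come out in sorted order (bob, then mom) rather than in the order they appear in the original file.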

answered Oct 23 '22 08:10

Dale Hagglund
