I have two *nix files. All of the data is on a single line in each file, and each value is separated by a null character. Some of the values in the data match.
How would I parse this data into a new file listing only the matching values?
I figure I could use sed to change the null characters into newlines? From there on I'm not really sure...
Any ideas?
Use tr, sort and comm:
Convert nulls into new lines, and sort the result:
$ tr '\000' '\n' < file1 | sort > file1.txt
$ tr '\000' '\n' < file2 | sort > file2.txt
Then use comm to get the lines that are common to both files:
$ comm -1 -2 file1.txt file2.txt
<lines shown here are the common lines between file1.txt and file2.txt>
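For anyone who wants to try the steps above end to end, here is a minimal sketch. The helper name common_values and the sample files demo1 and demo2 are made up for illustration, and mktemp is assumed to be available (it is on most systems, but it is not strictly POSIX).

```shell
# Hypothetical helper: print the values common to two NUL-separated files.
common_values() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    # Turn each NUL separator into a newline, then sort (comm needs sorted input).
    tr '\000' '\n' < "$1" | sort > "$tmp1"
    tr '\000' '\n' < "$2" | sort > "$tmp2"
    # -1 and -2 suppress the lines unique to each file, leaving the common ones.
    comm -12 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}

# Made-up sample data: NUL-separated values on a single line.
printf 'pear\000apple\000fig\000' > demo1
printf 'fig\000kiwi\000apple\000' > demo2

common_values demo1 demo2
```

With this sample data the common values are apple and fig, printed one per line in sorted order.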
If there are no duplicate values within file1 or file2, you can do this:
( tr '\0' '\n' < file1; tr '\0' '\n' < file2 ) | sort | uniq -c | grep -Ev '^ *1 '
This prints every value that occurs in both files, each prefixed with its count (2). Values that occur only once are filtered out.
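Under the same assumption (no value is repeated within a single file), a slightly shorter variant is to let uniq -d do the filtering, since it prints only lines that occur more than once and omits the counts. A rough sketch, with the helper name in_both and the files dup1 and dup2 invented for the example:

```shell
# Hypothetical helper: values that appear in both NUL-separated files.
# Only correct if neither file repeats a value internally.
in_both() {
    { tr '\0' '\n' < "$1"; tr '\0' '\n' < "$2"; } | sort | uniq -d
}

# Made-up sample data.
printf 'pear\000apple\000fig\000' > dup1
printf 'fig\000kiwi\000apple\000' > dup2

in_both dup1 dup2
```

For this sample data it prints apple and fig, one per line.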
If the values in both files are already in sorted order (comm expects sorted input), you can skip the sort and the temporary files:
comm -1 -2 <(tr '\0' '\n' < file1) <(tr '\0' '\n' < file2)
This approach is not portable: it requires the process substitution feature of Bash (ksh and zsh also support it).
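Since comm expects sorted input, it cannot keep the original order of unsorted values. If that order matters, one possible alternative (not from the answers above) is to load the second file's values into an awk array and then filter the first file against it. This is only a sketch: the helper name common_in_order and the sample files are hypothetical, and mktemp is assumed to be available.

```shell
# Hypothetical helper: print values of file1 that also occur in file2,
# keeping file1's original order. No sorting and no process substitution.
common_in_order() {
    tmp=$(mktemp) || return 1
    tr '\0' '\n' < "$2" > "$tmp"
    # First input ($tmp): remember every value. Second input (stdin): print if seen.
    tr '\0' '\n' < "$1" |
        awk 'NR == FNR { seen[$0] = 1; next } ($0 in seen)' "$tmp" -
    rm -f "$tmp"
}

# Made-up sample data, deliberately unsorted.
printf 'pear\000fig\000apple\000' > ordered1
printf 'apple\000kiwi\000fig\000' > ordered2

common_in_order ordered1 ordered2
```

Here fig is printed before apple, because that is the order in which they appear in the first file. A value repeated within the first file would be printed once per occurrence.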