Say I have 2 files, file1.csv and file2.csv. I need to compare column 2 of both files (string values; this is awk's $3) and print the rows of file2.csv whose value in that column is not present in the same column of file1.csv.
I've tried using the following awk command:
awk -F'\t' 'NR==FNR{c[$3]++;next};c[$3] == 0' file1.csv file2.csv
However, this just prints all of file2.csv, even though only 2 rows in file2.csv have no counterpart in file1.csv.
Could someone tell me what I'm doing wrong?
Snippet of file1.csv (columns are numbered from 0):
ANR 26545 CallExpression mutex_unlock ( & mmc_test_lock )
ANR 26546 Callee mutex_unlock
ANR 26547 Identifier mutex_unlock
ANR 26548 ArgumentList & mmc_test_lock
ANR 26549 Argument & mmc_test_lock
ANR 26550 UnaryOperationExpression & mmc_test_lock
ANR 26551 UnaryOperator &
ANR 26552 Identifier mmc_test_lock
ANR 26553 ExpressionStatement "__free_pages ( test -> highmem , BUFFER_ORDER )"
ANR 26554 CallExpression "__free_pages ( test -> highmem , BUFFER_ORDER )"
ANR 26555 Callee __free_pages
ANR 26556 Identifier __free_pages
ANR 26557 ArgumentList test -> highmem
ANR 26558 Argument test -> highmem
ANR 26559 PtrMemberAccess test -> highmem
ANR 26560 Identifier test
ANR 26561 Identifier highmem
ANR 26562 Argument BUFFER_ORDER
ANR 26563 Identifier BUFFER_ORDER
Snippet of file2.csv
ANR 12910 CallExpression mutex_unlock ( & mmc_test_lock )
ANR 12911 Callee mutex_unlock
ANR 12912 Identifier mutex_unlock
ANR 12913 ArgumentList & mmc_test_lock
ANR 12914 Argument & mmc_test_lock
ANR 12915 UnaryOperationExpression & mmc_test_lock
ANR 12916 UnaryOperator &
ANR 12917 Identifier mmc_test_lock
ANR 12918 IfStatement if ( test -> highmem )
ANR 12919 Condition test -> highmem
ANR 12920 PtrMemberAccess test -> highmem
ANR 12921 Identifier test
ANR 12922 Identifier highmem
ANR 12923 ExpressionStatement "__free_pages ( test -> highmem , BUFFER_ORDER )"
ANR 12924 CallExpression "__free_pages ( test -> highmem , BUFFER_ORDER )"
ANR 12925 Callee __free_pages
ANR 12926 Identifier __free_pages
ANR 12927 ArgumentList test -> highmem
ANR 12928 Argument test -> highmem
ANR 12929 PtrMemberAccess test -> highmem
ANR 12930 Identifier test
ANR 12931 Identifier highmem
ANR 12932 Argument BUFFER_ORDER
ANR 12933 Identifier BUFFER_ORDER
Expected output:
ANR 12918 IfStatement if ( test -> highmem )
ANR 12919 Condition test -> highmem
You need to change your awk command to this:
awk -F'\t' 'NR==FNR {seen[$2]; next} !($2 in seen)' file1.csv file2.csv
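For anyone wondering how this idiom works: NR==FNR is only true while awk is reading the first file, so that block records every key from file1.csv and skips to the next line. Once awk moves on to file2.csv, the bare pattern prints each row whose key was never recorded. Here is a minimal sketch you can run to see it in action. The two miniature files are hypothetical, and the sketch assumes the four tab-separated columns shown in the snippets. Since awk counts fields from 1, the node type is field 3 under that assumption, so the `key` variable (a name I made up for illustration) is set to 3. Adjust it to whatever field holds the value in your real files.

```shell
#!/bin/sh
# Hypothetical miniature inputs; tab-separated fields are an assumption
# about the real files, based on the snippets in the question.
tmp=$(mktemp -d)
printf 'ANR\t26552\tIdentifier\tmmc_test_lock\n'            > "$tmp/file1.csv"
printf 'ANR\t26558\tArgument\ttest -> highmem\n'           >> "$tmp/file1.csv"
printf 'ANR\t12917\tIdentifier\tmmc_test_lock\n'            > "$tmp/file2.csv"
printf 'ANR\t12918\tIfStatement\tif ( test -> highmem )\n' >> "$tmp/file2.csv"
printf 'ANR\t12919\tCondition\ttest -> highmem\n'          >> "$tmp/file2.csv"
printf 'ANR\t12928\tArgument\ttest -> highmem\n'           >> "$tmp/file2.csv"

# key is the 1-based awk field to compare (assumed to be 3 here).
key=3
result=$(awk -F'\t' -v key="$key" '
    NR == FNR { seen[$key]; next }   # first file: remember each key value
    !($key in seen)                  # second file: print rows with an unseen key
' "$tmp/file1.csv" "$tmp/file2.csv")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With these sample files, only the IfStatement and Condition rows are printed. If your real command still prints every row of file2.csv, it is worth checking that the field you key on really holds the same kind of value in both files and that the delimiter passed to -F matches the data.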