Count the number of occurrences of binary data

Tags: linux, grep, binary

I need to count the occurrences of the byte sequence 0xFF 0x84 0x03 0x07 in a binary file without too much hassle. Is there a quick way to grep for this data from the Linux command line, or should I write dedicated code to do it?

Ferenc Deak asked Mar 11 '13 10:03

2 Answers

If your version of grep supports the -P option, you can use grep -a -P to search for an arbitrary byte sequence in a binary file. This is close to what you want:

LC_ALL=C grep -a -c -P '\xFF\x84\x03\x07' myfile.bin

  • -a ensures that binary files will not be skipped

  • -c outputs the count

  • -P specifies that your pattern is a Perl-compatible regular expression (PCRE), which allows strings to contain hex characters in the above \xNN format.

  • LC_ALL=C makes grep treat the pattern and the file as raw bytes; in a UTF-8 locale, \xFF is typically read as the character U+00FF instead of the byte 0xFF.

Unfortunately, grep -c only counts the number of "lines" the pattern appears on, not the actual occurrences.

To get the exact number of occurrences with grep, it seems you need to do:

LC_ALL=C grep -a -o -P '\xFF\x84\x03\x07' myfile.bin | wc -l

grep -o prints each match on its own line, and wc -l counts the lines.
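
To see the difference between the two counts, here is a minimal sketch, assuming GNU grep built with -P support. The helper names count_matching_lines and count_occurrences and the sample file are made up for illustration, and LC_ALL=C is added on the assumption that the shell otherwise runs in a UTF-8 locale, where \xFF might not be matched as a raw byte.

```shell
# Hypothetical helpers wrapping the two grep invocations discussed above.
# LC_ALL=C (an assumption about the environment) makes grep match raw bytes.
count_matching_lines() {
    LC_ALL=C grep -a -c -P '\xFF\x84\x03\x07' "$1"
}

count_occurrences() {
    LC_ALL=C grep -a -o -P '\xFF\x84\x03\x07' "$1" | wc -l
}

# Sample file: the sequence twice before a newline byte, then once after it.
sample=$(mktemp)
printf '\377\204\003\007AA\377\204\003\007\n\377\204\003\007' > "$sample"

count_matching_lines "$sample"   # counts "lines" that contain the sequence
count_occurrences "$sample"      # counts every occurrence
```

On this sample the first command should report 2 and the second 3, because two of the three occurrences share a "line".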

Note that this all relies on your byte sequence containing no newline bytes (0x0A), because grep processes its input line by line.

If you do need to grep for a sequence that contains a newline with this method, the simplest thing I can think of is to use tr to swap the newline for another byte that's not in your search term.

# set up test file (0a is newline)
xxd -r <<< '0:08 09 0a 0b 0c 0a 0b 0c' > test.bin

# grep for '\xa\xb\xc' doesn't work
grep -a -o -P '\xa\xb\xc' test.bin | wc -l

# swap newline (0a) with octal 042 (the double-quote character) and grep for that instead
tr '\n\042' '\042\n' < test.bin | grep -a -o -P '\042\xb\xc' | wc -l
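
An alternative that avoids the swap, sketched here on the assumption that your grep is GNU grep with both -P and -z: the -z option makes grep split its input on NUL bytes instead of newlines, so a newline becomes ordinary data within a record. With -z, each match printed by -o is terminated by a NUL, not a newline, so the sketch counts NUL bytes instead of lines. The function name count_with_newlines is hypothetical, and the approach does not work for search terms that themselves contain a NUL byte.

```shell
# Hypothetical helper: count occurrences of a PCRE byte pattern ($1) in a file ($2),
# where the pattern may contain newline bytes. Assumes GNU grep with -P and -z.
count_with_newlines() {
    # -z: records end at NUL, so 0x0a is ordinary data; each -o match ends with a NUL
    LC_ALL=C grep -a -z -o -P "$1" "$2" | tr -cd '\000' | wc -c
}

# Same bytes as the test file above, written with printf octal escapes
sample=$(mktemp)
printf '\010\011\012\013\014\012\013\014' > "$sample"

count_with_newlines '\x0a\x0b\x0c' "$sample"
```

For this eight-byte sample the function should print 2, the same count the tr-based pipeline is meant to produce.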

mwfearnley answered Sep 21 '22 10:09


Use hexdump, for example:

hexdump -v -e '"0x" 1/1 "%02X" " "' <filename> | grep -oh "0xFF 0x84 0x03 0x07" | wc -l

hexdump prints every byte of the binary file in the given format, as 0xNN followed by a space.

grep -o prints each occurrence of the string on its own line, so occurrences that share a line of hexdump output are still counted separately.

wc -l counts those lines, which gives you the final count.
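
The same idea can be wrapped in a small function. The sketch below is a variant, not the command above: it swaps in od (specified by POSIX) for hexdump, drops the 0x prefixes, and uses the made-up name count_hex_sequence. Because every byte becomes a two-digit token, newline bytes in the file need no special handling.

```shell
# Hypothetical helper: count occurrences of a byte sequence in a file.
# $1 = file, $2 = sequence as lowercase hex bytes separated by single spaces.
count_hex_sequence() {
    od -A n -v -t x1 "$1" |    # one two-digit lowercase hex token per byte
        tr '\n' ' ' |          # join od's output lines into a single line
        tr -s ' ' |            # squeeze runs of spaces to one space
        grep -o -F -- "$2" |   # print each fixed-string match on its own line
        wc -l                  # count the matches
}

# Sample file: the sequence, a newline byte, then the sequence again.
sample=$(mktemp)
printf '\377\204\003\007\n\377\204\003\007' > "$sample"

count_hex_sequence "$sample" 'ff 84 03 07'
```

For this sample the function should print 2. Note that the whole dump passes through grep as one long line, so this is probably better suited to modestly sized files.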

hiteshradia answered Sep 21 '22 10:09