I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a (binary) header followed by binary data. Within the binary header is an ASCII string 80 characters long. Somewhere along the way, my process of writing the files got a little messed up, and I'm trying to debug the problem by inspecting how long each record actually is.
This seems extremely related, but I don't understand Perl, so I haven't been able to get the accepted answer there to work. The other answer points to bgrep, which I've compiled, but it wants me to feed it a hex string. I'd rather have a tool where I can give it the ASCII string and it will find it in the binary data, then print the string and the byte offset where it was found.
In other words, I'm looking for some tool which acts like this:
tool foobar filename
or
tool foobar < filename
and its output is something like this:
foobar:10
foobar:410
foobar:810
foobar:1210
...
i.e. the string which matched and the byte offset in the file where the match started. In this example case, I can infer that each record is 400 bytes long, since consecutive offsets differ by 400.
Other constraints:
For GNU grep's --binary-files=type option: if type is 'text', grep processes binary data as if it were text; this is equivalent to the -a option. When type is 'binary', grep may treat non-text bytes as line terminators even without the -z (--null-data) option. This means choosing 'binary' versus 'text' can affect whether a pattern matches a file.
As this answer notes, there are two cases where grep thinks your file is binary: if there's an encoding error detected, or if it detects some NUL bytes. Both of these sound at least conceptually simple, but it turns out that grep tries to be clever about detecting NULs.
grep --byte-offset --only-matching --text foobar filename
The --byte-offset option prints the byte offset of each matching line.
The --only-matching option makes it print the offset of each individual match instead of each matching line.
The --text option makes grep treat the binary file as a text file.
You can shorten it to:
grep -oba foobar filename
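With -o and -b together, GNU grep prints one offset:match line per hit (for example 10:foobar). To turn those offsets into record lengths, you can take the difference between consecutive offsets. Below is a minimal Python sketch of that step; the function name record_lengths and the sample text are illustrative assumptions, not real output from any file.

```python
def record_lengths(grep_output):
    """Parse `offset:match` lines and return the gaps between consecutive offsets."""
    offsets = []
    for line in grep_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Everything before the first colon is the byte offset.
        offset_text, _, _match = line.partition(":")
        offsets.append(int(offset_text))
    # Pair each offset with the next one and subtract.
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


# Hypothetical sample, using the offsets from the question.
sample = "10:foobar\n410:foobar\n810:foobar\n1210:foobar\n"
print(record_lengths(sample))  # [400, 400, 400]
```

A gap that differs from the others points at the record where the writing process went wrong.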
This works with GNU grep, which comes with Linux by default. It won't work with BSD grep (which comes with macOS by default).
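Where GNU grep isn't available, the same search can be approximated in a few lines of Python. This is only a sketch, assuming the file is small enough to read into memory in one go; the names find_offsets and scan_file are hypothetical helpers, not part of any existing tool.

```python
def find_offsets(data, needle):
    """Return the byte offset of every non-overlapping occurrence of needle in data."""
    if not needle:
        raise ValueError("search string must not be empty")
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        # Resume the search just past the match we found.
        pos = data.find(needle, pos + len(needle))
    return offsets


def scan_file(path, text):
    """Read a binary file and return the offsets where the ASCII string occurs."""
    with open(path, "rb") as fh:
        data = fh.read()  # assumption: the whole file fits in memory
    return find_offsets(data, text.encode("ascii"))
```

For example, calling scan_file("filename", "foobar") and printing each result as offset:foobar gives output in the same shape as grep -oba.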