By default, TYPE is binary, and grep normally outputs either a one-line message saying that a binary file matches, or no message if there is no match. If TYPE is without-match, grep assumes that a binary file does not match; this is equivalent to the -I option.
As this answer notes, there are two cases where grep thinks your file is binary: if there's an encoding error detected, or if it detects some NUL bytes. Both of these sound at least conceptually simple, but it turns out that grep tries to be clever about detecting NULs.
To search for a sequence of bytes, rather than a text string, select the “binary data” search type. You can then enter the bytes into the search box as you would enter them into a hex editor. PowerGREP's regular expression support works equally well with binary files as with text files.
grep -a
It can't get simpler than that.
One way is to simply treat binary files as text anyway, with grep --text
but this may well result in binary information being sent to your terminal. That's not really a good idea if you're running a terminal that interprets the output stream (such as VT/DEC or many others).
Alternatively, you can send your file through tr
with the following command:
tr '[\000-\011\013-\037\177-\377]' '.' <test.log | grep whatever
This will change anything less than a space character (except newline) and anything greater than 126, into a .
character, leaving only the printables.
If you want every "illegal" character replaced by a different one, you can use something like the following C program, a classic standard input filter:
#include<stdio.h>
int main (void) {
int ch;
while ((ch = getchar()) != EOF) {
if ((ch == '\n') || ((ch >= ' ') && (ch <= '~'))) {
putchar (ch);
} else {
printf ("{{%02x}}", ch);
}
}
return 0;
}
This will give you {{NN}}
, where NN
is the hex code for the character. You can simply adjust the printf
for whatever style of output you want.
You can see that program in action here, where it:
pax$ printf 'Hello,\tBob\nGoodbye, Bob\n' | ./filterProg
Hello,{{09}}Bob
Goodbye, Bob
You could run the data file through cat -v
, e.g
$ cat -v tmp/test.log | grep re
line1 re ^@^M
line3 re^M
which could be then further post-processed to remove the junk; this is most analogous to your query about using tr
for the task.
-v
simply tells cat
to display non-printing characters.
You can use "strings" to extract strings from a binary file, for example
strings binary.file | grep foo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With