When I open a SQLite database file, there is a lot of readable text at the beginning of the file. How likely is it that a SQLite file is wrongly filtered out by the -B file test?
#!/usr/bin/env perl
use warnings;
use strict;
use 5.10.1;
use File::Find;

my $dir = shift;
# Start with an empty array ref so the final count works even if nothing matches.
my $databases = [];

find( {
    wanted => sub {
        my $file = $File::Find::name;
        return if not -B $file;    # skip files that look like text
        return if not -s $file;    # skip empty files
        return if not -r $file;    # skip unreadable files
        say $file;
        open my $fh, '<', $file or die "$file: $!";
        my $firstline = readline( $fh ) // '';
        close $fh or die $!;
        push @$databases, $file if $firstline =~ /\ASQLite\sformat/;
    },
    no_chdir => 1,
},
$dir );

say scalar @$databases;
The perlfunc man page has the following to say about -T and -B:
The -T and -B switches work as follows. The first block or so of the file is
examined for odd characters such as strange control codes or characters with
the high bit set. If too many strange characters (>30%) are found, it's a -B
file; otherwise it's a -T file. Also, any file containing a zero byte in the
first block is considered a binary file.
Of course, you could now do a statistical analysis of a number of SQLite files: parse their "first block or so" for "odd characters", calculate the probability of their occurrence, and that would give you an idea of how likely it is that -B fails for SQLite files.
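Such a measurement could be sketched roughly as follows. This is only an approximation of the description quoted above, not perl's actual implementation: the helper name odd_byte_ratio, the 512-byte block size, and the exact set of bytes counted as "odd" are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of "odd" bytes in the first block of a file.
# Assumptions: block size of 512 bytes; "odd" means anything other than
# tab, newline, carriage return, or printable ASCII (0x20-0x7e).
sub odd_byte_ratio {
    my ( $path, $block_size ) = @_;
    $block_size //= 512;

    open my $fh, '<:raw', $path or die "$path: $!";
    my $n = read( $fh, my $block, $block_size ) // die "$path: $!";
    close $fh;

    return 0 if $n == 0;                      # empty file: nothing odd
    return 1 if index( $block, "\0" ) >= 0;   # a zero byte counts as fully binary

    # Count the bytes outside the assumed "plain text" set.
    my $odd = () = $block =~ /[^\t\n\r\x20-\x7e]/g;
    return $odd / $n;
}
```

Running this over a collection of known SQLite files and comparing the results against the 30% threshold from the quote would give a rough picture, with the caveat that perl's own rules may differ in detail.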
However, you could also go the easy route. Can it fail? Yes, it's a heuristic. And a bad one at that. So don't use it.
File type recognition on Unix is usually done by evaluating the file's content. And yes, there are people who've done all the work for you already: it's called libmagic (the thingy behind the file command line tool). You can use it from Perl with e.g. File::LibMagic; File::MMagic takes a similar approach in pure Perl.
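If SQLite is the only format of interest, checking the content directly is also an option. The sketch below assumes the 16-byte header string "SQLite format 3" followed by a zero byte, as described in the SQLite file format documentation; is_sqlite3_file is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file starts with the SQLite 3 header string.
# Assumption: the header is the 16 bytes "SQLite format 3\0" at offset 0.
sub is_sqlite3_file {
    my ($path) = @_;

    # Read raw bytes; treat an unopenable file as "not a database".
    open my $fh, '<:raw', $path or return 0;
    my $got = read( $fh, my $header, 16 );
    close $fh;

    return defined $got && $got == 16 && $header eq "SQLite format 3\0";
}
```

In the script from the question, a call like is_sqlite3_file( $file ) could stand in for both the -B test and the readline-based check, since it reads a fixed number of bytes instead of relying on a heuristic or on where the first newline happens to be.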
Well, all files are technically a collection of bytes, and thus binary. Beyond that, there is no accepted definition of binary, so it's impossible to evaluate -B's reliability unless you care to posit a definition by which it is to be evaluated.