I am trying to read a binary file (firmware) that contains sequences like
\x01\x00\x00\x00\x03\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00
i.e. the little-endian 32-bit integers 1, 3, 2, 4.
My attempt:
with open("firm.bin", 'rb') as f:
    s = f.read()
N = 16
allowed = set(range(4))
for val in allowed:
    val = bytes(val)+b'\x00\x00\x00'
for index, b in enumerate(s):
    print(b)
    i = b.hex()
    b= b'\x00\x00\x00'+bytes(bytes.fromhex(f'{i:x}'))
    if b in allowed and set(s[index:index + N]) == allowed:
        print(f'Found sequence {s[index:index + N]} at offset {index}')
The above does not work and fails with this error:
ValueError: Unknown format code 'x' for object of type 'str'
Why?
Problem I am trying to solve:
How can I find, in a binary file, sequences of 16 little-endian integers whose values all lie between 0 and 15, i.e.
[0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15]
Update 1:
I tried the proposed answer, but it finds nothing where it should:
import numpy as np
import sys
# Synthesize firmware with 100 integers, all equal to 1
#firmware = np.full(100, 1, dtype=np.uint32)
#firmware = np.fromfile('firm.ori', dtype='uint32')
a1D = np.array([1, 2, 3, 4, 6, 5, 7, 8, 10, 9, 11, 13, 12, 14, 15, 0],dtype='uint32')
print(a1D)
r = np.convolve(a1D, [1]*16, mode='same')[8:-8]
np.set_printoptions(threshold=sys.maxsize)
print(r)
r = np.where(r < (16*15))
print(r)
print(a1D[r])
Ideally it should report offset 0, but printing the values would also be fine, i.e.
[ 1 2 3 4 6 5 7 8 10 9 11 13 12 14 15 0]
Now it outputs:
[ 1 2 3 4 6 5 7 8 10 9 11 13 12 14 15 0]
[]
(array([], dtype=int64),)
[]
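A likely reason the snippet above prints empty arrays: with `mode='same'`, `np.convolve` returns an array as long as the longer input (16 here), so the slice `[8:-8]` leaves nothing to test. A window sum also cannot tell a permutation of 0 to 15 apart from repeated values with the same total. The sketch below is an alternative rather than the proposed answer's method. It assumes NumPy 1.20 or later for `sliding_window_view`, and `find_permutations` is a hypothetical helper name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_permutations(words, n=16):
    """Return the start indices of every window of n values that is a permutation of 0..n-1."""
    words = np.asarray(words)
    if words.size < n:
        return np.array([], dtype=np.intp)
    # One row per window; shape is (len(words) - n + 1, n).
    windows = sliding_window_view(words, n)
    # np.sort returns a sorted copy, so the read-only view is left untouched.
    ordered = np.sort(windows, axis=1)
    # A sorted permutation of 0..n-1 is exactly 0, 1, ..., n-1.
    hits = np.all(ordered == np.arange(n, dtype=words.dtype), axis=1)
    return np.flatnonzero(hits)


a1D = np.array([1, 2, 3, 4, 6, 5, 7, 8, 10, 9, 11, 13, 12, 14, 15, 0], dtype='uint32')
print(find_permutations(a1D))  # [0]
```

Sorting copies every window, so memory use is roughly 16 times the size of the input. For a large file, scanning in chunks or using a plain Python loop may be preferable.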
You refer to the values in the firmware as 32-bit integers, so I've assumed that the file can be converted to integers. I've used Python's struct module to do this.
I've also understood that you want to find a sequence of 16 unique integers in the range 0 to 15.
My test below iterates over the integers in the firmware, looking ahead each time and converting that window of 16 integers to a set to check that its length is still 16.
It then checks that every value in the window is below 16.
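As a minimal sketch of the conversion step (separate from the test itself), the sample bytes from the question can be decoded with `struct.unpack`. It also illustrates the error message in the question: the `x` format code is defined for integers, so applying it to a `str` raises `ValueError`. The names `sample`, `words`, `first`, and `as_text` are illustrative only.

```python
import struct

# The 16 sample bytes from the question: four little-endian 32-bit integers.
sample = b'\x01\x00\x00\x00\x03\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00'

# '<4L' means little-endian, four unsigned 32-bit values.
words = list(struct.unpack('<4L', sample))
print(words)  # [1, 3, 2, 4]

# int.from_bytes decodes a single 4-byte slice the same way.
first = int.from_bytes(sample[:4], 'little')
print(first)  # 1

# The 'x' format code works on an int ...
print(f'{first:x}')  # 1

# ... but raises ValueError on a str, such as the result of bytes.hex().
as_text = sample[:4].hex()  # '01000000'
try:
    print(f'{as_text:x}')
except ValueError as exc:
    print(exc)  # Unknown format code 'x' for object of type 'str'
```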
Here is the test I ran:
from secrets import token_bytes
import struct

# Create test data: random bytes with the target sequence hidden near the end
firmware_ints = 200_000
int_len = 4
data = token_bytes(firmware_ints * int_len)
to_find = struct.pack('<16L', *range(16))
print(f"Hidden bytes [{len(to_find)}]: {to_find}\n")
hide_idx = 20 * int_len * -1  # hide the sequence 20 ints from the end
data = b''.join([data[:hide_idx], to_find, data[hide_idx:]])
# End of creating test data

search_max = 16
search_len = 16
# Convert firmware to little-endian unsigned 32-bit integers
words = [x[0] for x in struct.iter_unpack('<L', data)]
# Iterate through to find the sequence (+ 1 so the final window is also checked)
for idx in range(len(words) - search_len + 1):
    this_seq = words[idx:idx + search_len]
    # 16 distinct values ...
    if len(set(this_seq)) == search_len:
        # ... all below 16, so the window is a permutation of 0..15
        if all(x < search_max for x in this_seq):
            # The offset is counted in 32-bit words, not bytes
            print(f'Found sequence {this_seq} at offset {idx}')
This gave the following output:
Hidden bytes [64]: b'\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00\r\x00\x00\x00\x0e\x00\x00\x00\x0f\x00\x00\x00'
Found sequence [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] at offset 199980
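To apply the same check to a real file and report byte offsets rather than word indices, something along these lines could work. This is a sketch under assumptions: `find_permutation_runs` is a hypothetical helper, the sequences are assumed to start on a 4-byte boundary, and trailing bytes that do not fill a whole 32-bit word are ignored, because `struct.iter_unpack` requires a buffer whose length is a multiple of the item size.

```python
import struct


def find_permutation_runs(data: bytes, count: int = 16):
    """Return (word_index, byte_offset) for each window of `count` words that is a permutation of 0..count-1."""
    # Drop trailing bytes that do not form a complete 32-bit word.
    usable = len(data) - len(data) % 4
    words = [w for (w,) in struct.iter_unpack('<L', data[:usable])]
    hits = []
    for idx in range(len(words) - count + 1):
        window = words[idx:idx + count]
        # `count` distinct unsigned values, all below `count`.
        if len(set(window)) == count and max(window) < count:
            hits.append((idx, idx * 4))
    return hits


# Small in-memory example: 8 filler bytes, the sequence from the question's update, 5 filler bytes.
sequence = struct.pack('<16L', 1, 2, 3, 4, 6, 5, 7, 8, 10, 9, 11, 13, 12, 14, 15, 0)
sample = b'\xff' * 8 + sequence + b'\xff' * 5
print(find_permutation_runs(sample))  # [(2, 8)]
```

For the firmware in the question, the input could be read with `open("firm.bin", "rb").read()`. If a sequence may start at an unaligned offset, the scan would need to be repeated on `data[1:]`, `data[2:]`, and `data[3:]`.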