I just found out about the AWS Glacier service and wanted to write a small Python application to upload files via the REST API. I took a look at the required headers and stumbled upon x-amz-sha256-tree-hash
. I need to calculate the SHA-256 hash of the entire file as well as the hash of the parent of all hashes of each 1 MB chunk. This leads to the following tree:
(Image taken from here)
I already made a function that reads 1 MB chunks and a class which calculates their hashes on-the-fly but then I completely struggle:
In my application I made a class called chunk
which takes data and calculates a hash in the __init__
method as well as holds parent and children (like a regular tree).
When the user opens a file those chunks instances will be generated properly with their respective hashes (in this example that would be 7 chunk instances).
Now I have two big problems that are connected to each other:
I checked out this topic on SO but that method only works with an even children count which is not always given.
Can you help me finding a way/an algorithm/an approach on how to solve this issue?
Thanks in advance!
Paul
First calculate the number of levels, then
def proclevel(levels):
if levels > 0:
generator = proclevel(levels - 1)
temp = None
for firsthash, secondhash in generator:
if not temp: temp = hashofthem(firsthash, secondhash)
else: yield temp, hashofthem(firsthash, secondhash); temp = None
#If odd number of packets
if temp: yield temp, None
else:
temp = None
for chunk in chunks:
if not temp: temp = hash(chunk)
else: yield temp, hash(chunk); temp = None
if temp: yield temp, None
Make sure to handle None as second argument in hashofthem :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With