I am trying to write a simple proof-of-work nonce-finder in Python:
def proof_of_work(b, nBytes):
    nonce = 0
    # while the first nBytes of hash(b + nonce) are not 0
    while sha256(b + uint2bytes(nonce))[:nBytes] != bytes(nBytes):
        nonce = nonce + 1
    return nonce
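For context, sha256 here is assumed to return the raw digest bytes, and uint2bytes is a small helper that encodes an integer as bytes; something like:

```python
from hashlib import sha256 as _sha256

def sha256(data):
    # return raw digest bytes, so the [:nBytes] comparison
    # against bytes(nBytes) works
    return _sha256(data).digest()

def uint2bytes(n):
    # encode a non-negative integer big-endian, using at least one byte
    return n.to_bytes(max(1, (n.bit_length() + 7) // 8), 'big')
```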
Now I want to make this multiprocess, so it can use all CPU cores and find the nonce faster. My idea is to use multiprocessing.Pool and execute the function proof_of_work multiple times, passing two extra parameters, num_of_cpus_running and this_cpu_id, like so:
def proof_of_work(b, nBytes, num_of_cpus_running, this_cpu_id):
    nonce = this_cpu_id
    while sha256(b + uint2bytes(nonce))[:nBytes] != bytes(nBytes):
        nonce = nonce + num_of_cpus_running
    return nonce
So, if there are 4 cores, each one will calculate nonces like this:
core 0: 0, 4, 8, 12, 16 ...
core 1: 1, 5, 9, 13, 17 ...
core 2: 2, 6, 10, 14, 18 ...
core 3: 3, 7, 11, 15, 19 ...
So, I have to rewrite proof_of_work so that when any one of the processes finds a nonce, all the others stop looking, taking into account that the found nonce has to be the lowest value for which the required bytes are 0. If a CPU speeds ahead for some reason and returns a valid nonce higher than the lowest valid nonce, then the proof of work is not valid.
The only thing I don't know how to do is the part in which a process A will only stop if process B found a nonce that is lower than the nonce currently being calculated by process A. If it's higher, A keeps calculating (just in case) until it reaches the nonce provided by B.
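Conceptually, what I want looks something like this sketch (names are made up, and I don't know how to wire it into the Pool): keep the lowest nonce found so far in a shared multiprocessing.Value, and have each worker compare its own counter against it:

```python
from multiprocessing import Value

# Lowest valid nonce found so far across all workers; -1 means "none yet".
# (For Windows "spawn" start method this would need to be created under
# the __main__ guard and passed to the workers.)
best_nonce = Value('q', -1)

def report_solution(nonce):
    # A worker that finds a valid nonce records it if it is the lowest so far.
    with best_nonce.get_lock():
        if best_nonce.value == -1 or nonce < best_nonce.value:
            best_nonce.value = nonce

def should_stop(current_nonce):
    # Worker A keeps searching while its own counter is still below the
    # best known nonce, in case it finds an even lower one.
    with best_nonce.get_lock():
        return best_nonce.value != -1 and current_nonce > best_nonce.value
```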
I hope I explained myself correctly. Also, if there is a faster implementation of anything I wrote, I would love to hear about it. Thank you very much!
One easy option is to use micro-batches and check after each batch whether an answer was found. Batches that are too small incur overhead from starting parallel jobs; batches that are too large cause the other processes to do extra work after one process has already found an answer. Each batch should take 1-10 seconds to be efficient.
Sample code:
from multiprocessing import Pool
from hashlib import sha256
from time import time

def find_solution(args):
    salt, nBytes, nonce_range = args
    target = '0' * nBytes
    for nonce in range(nonce_range[0], nonce_range[1]):
        # encode to bytes before hashing (required in Python 3)
        result = sha256((salt + str(nonce)).encode()).hexdigest()
        if result[:nBytes] == target:
            return (nonce, result)
    return None
def proof_of_work(salt, nBytes):
    n_processes = 8
    batch_size = int(2.5e5)
    pool = Pool(n_processes)

    nonce = 0
    while True:
        nonce_ranges = [
            (nonce + i * batch_size, nonce + (i + 1) * batch_size)
            for i in range(n_processes)
        ]
        params = [
            (salt, nBytes, nonce_range) for nonce_range in nonce_ranges
        ]
        # Single-process search:
        # solutions = map(find_solution, params)
        # Multi-process search:
        solutions = pool.map(find_solution, params)
        print('Searched %d to %d' % (nonce_ranges[0][0], nonce_ranges[-1][1] - 1))
        # Keep non-None results; list() is needed in Python 3,
        # where filter() returns a lazy (always-truthy) iterator
        solutions = list(filter(None, solutions))
        if solutions:
            return solutions
        nonce += n_processes * batch_size
if __name__ == '__main__':
    start = time()
    solutions = proof_of_work('abc', 6)
    print('\n'.join('%d => %s' % s for s in solutions))
    print('Solution found in %.3f seconds' % (time() - start))
Output (on a laptop with a Core i7):
Searched 0 to 1999999
Searched 2000000 to 3999999
Searched 4000000 to 5999999
Searched 6000000 to 7999999
Searched 8000000 to 9999999
Searched 10000000 to 11999999
Searched 12000000 to 13999999
Searched 14000000 to 15999999
Searched 16000000 to 17999999
Searched 18000000 to 19999999
Searched 20000000 to 21999999
Searched 22000000 to 23999999
Searched 24000000 to 25999999
Searched 26000000 to 27999999
Searched 28000000 to 29999999
Searched 30000000 to 31999999
Searched 32000000 to 33999999
Searched 34000000 to 35999999
Searched 36000000 to 37999999
37196346 => 000000f4c9aee9d427dc94316fd49192a07f1aeca52f6b7c3bb76be10c5adf4d
Solution found in 20.536 seconds
With a single core it took 76.468 seconds. This is by no means the most efficient way to find a solution, but it works. For example, if the salt is long, then the SHA-256 state could be pre-computed once after the salt has been absorbed, and the brute-force search could continue from there. Also, comparing raw digest bytes could be more efficient than hexdigest().
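Both of those optimizations can be sketched with hashlib's copy(): absorb the salt once, then reuse the saved state for each candidate and compare raw digest bytes. A sketch (note it counts leading zero *bytes*, where one zero byte equals two leading '0' hex characters):

```python
from hashlib import sha256

def find_solution_fast(salt, n_zero_bytes, start, stop):
    # Absorb the salt once; each candidate resumes from this saved state.
    base = sha256(salt)
    target = bytes(n_zero_bytes)
    for nonce in range(start, stop):
        h = base.copy()                  # reuse the precomputed state
        h.update(str(nonce).encode())
        if h.digest()[:n_zero_bytes] == target:  # raw-byte comparison
            return nonce, h.hexdigest()
    return None
```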
A general method to do this is to split the work into fixed-length sessions and have each worker check in with a shared manager between sessions: report any result found, and ask whether another worker has already found one. This way you don't have to check with the manager on every iteration (which would slow everything down), or do nasty things like stopping a thread mid-session. Needless to say, the manager needs to be thread-safe.
This fits your model perfectly, as you still need the results of the other workers, even after a result has been found.
Note that in your model, a thread may go out of sync with the others, lagging behind. You don't want to do another million calculations once a result is found. I'm reiterating this from the question because I think the model is wrong, and you should fix the model instead of fixing the implementation.
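A minimal sketch of that check-in pattern, assuming a shared multiprocessing.Event plays the role of the manager (the worker only polls it between sessions, never inside the hot loop):

```python
from hashlib import sha256
from multiprocessing import Event

def worker(salt, nBytes, start_nonce, stride, session_size, found):
    # Search in sessions of `session_size` candidates; check the shared
    # `found` event only between sessions.
    target = '0' * nBytes
    nonce = start_nonce
    while not found.is_set():
        for _ in range(session_size):
            result = sha256((salt + str(nonce)).encode()).hexdigest()
            if result[:nBytes] == target:
                found.set()          # tell the other workers to stop
                return nonce, result
            nonce += stride
    return None                      # another worker found a solution first
```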