
How can you create a cross-thread cross-process lock in python?

Tags: python, locking

https://pypi.python.org/pypi/lockfile/0.12.2 states:

This package is deprecated. It is highly preferred that instead of using this code base that instead fasteners or oslo.concurrency is used instead.

However, the fasteners documentation is clear that it is not thread safe:

Warning: There are no guarantees regarding usage by multiple threads in a single process

And I cannot find an example of using oslo.concurrency.

There's also some suggestion that using flock can resolve this situation, but the flock manual states:

(https://www.freebsd.org/cgi/man.cgi?query=flock&sektion=2)

The flock system call applies or removes an advisory lock on the file associated with the file descriptor fd. A lock is applied by specifying an operation argument that is one of LOCK_SH or LOCK_EX with the optional addition of LOCK_NB. To unlock an existing lock operation should be LOCK_UN.

Advisory locks allow cooperating processes to perform consistent operations on files, but do not guarantee consistency (i.e., processes may still access files without using advisory locks possibly resulting in inconsistencies).
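
To make "advisory" concrete, here is a minimal sketch, assuming a Unix-like system where Python's fcntl module is available. The function name advisory_demo and the temporary file are illustrative only. A second descriptor that asks for the lock is refused, while a writer that never calls flock is not stopped at all:

```python
import fcntl
import os
import tempfile


def advisory_demo():
    """Return (contender_blocked, file_content) for a locked temporary file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "a") as holder, open(path, "a") as contender:
            # The holder takes the exclusive lock.
            fcntl.flock(holder, fcntl.LOCK_EX)
            try:
                # A cooperating party asks for the lock without blocking.
                fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
                contender_blocked = False
            except OSError:
                contender_blocked = True
            # A writer that never calls flock is not stopped by the lock.
            with open(path, "a") as rogue:
                rogue.write("written without asking\n")
            fcntl.flock(holder, fcntl.LOCK_UN)
        with open(path) as f:
            content = f.read()
        return contender_blocked, content
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(advisory_demo())
```

In other words, flock only protects against code that also uses flock on the same file.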

So...

Here's a Python program whose lock and unlock functions need to be implemented so that action is executed by only one thread, in only one instance of the process, at a time.

(hint: launch with python test.py 1 & python test.py 2 & python test.py 3)

How would I fix this code so that it works correctly?

import sys
import time
import random
import threading

def lock():
  pass  # Something here?

def unlock():
  pass  # Something here?

def action(i):
  lock()
  id = threading.current_thread()
  pid = sys.argv[1]
  print("\n")
  for i in range(5):
    print("--> %s - %s - %s " % (i, id, pid))
  unlock()

class Worker(threading.Thread):
  def run(self):
    for i in range(10):
      action(i)

for _ in range(2):
  Worker().start()

The current, incorrect output looks like this:

--> 0 - <Worker(Thread-2, started 123145310715904)> - 2
--> 3 - <Worker(Thread-1, started 123145306509312)> - 1
--> 0 - <Worker(Thread-2, started 123145310715904)> - 1
--> 1 - <Worker(Thread-2, started 123145310715904)> - 2
--> 2 - <Worker(Thread-2, started 123145310715904)> - 2
--> 1 - <Worker(Thread-2, started 123145310715904)> - 1
--> 4 - <Worker(Thread-1, started 123145306509312)> - 1

and should look more like:

--> 0 - <Worker(Thread-2, started 123145310715904)> - 1
--> 1 - <Worker(Thread-2, started 123145310715904)> - 1
--> 2 - <Worker(Thread-2, started 123145310715904)> - 1
--> 3 - <Worker(Thread-2, started 123145310715904)> - 1
--> 4 - <Worker(Thread-2, started 123145310715904)> - 1
--> 0 - <Worker(Thread-2, started 123145310715904)> - 2
etc.
asked Mar 03 '16 08:03 by Doug


1 Answer

Synchronizing related processes

If you can change your architecture to fork off your processes from the same parent, multiprocessing.Lock() should be enough. For example, this makes the threads run serially:

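import functools
import multiprocessing
import threading
import time

lock = multiprocessing.Lock()

def thread_proc(lock):
    # Hold the lock for the whole loop so the threads run one after another.
    with lock:
        for i in range(10):
            print("IN THREAD", threading.current_thread())
            time.sleep(1)

threads = [threading.Thread(
    target=functools.partial(thread_proc, lock))
    for i in [1, 2]
]
for thread in threads:
    thread.start()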

A potential problem is that multiprocessing.Lock is slightly underdocumented: I cannot give you a definitive reference stating that multiprocessing.Lock objects are also suitable as thread locks.

That said: on Windows, multiprocessing.Lock is implemented using CreateSemaphore(), hence you get a cross-process, thread-safe lock. On Unix systems you get a POSIX semaphore, which has the same properties.

Portability might also be a problem, because not all *NIX systems have the POSIX semaphore (FreeBSD still has a port option to compile Python without POSIX semaphore support).

See also "Is there any reason to use threading.Lock over multiprocessing.Lock?" and Martijn Pieters' comment and answer on "Why python multiprocessing manager produce threading locks?"
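
To illustrate one multiprocessing.Lock serializing both threads and related processes, here is a minimal sketch. It assumes a Unix-like system where the "fork" start method is available; the names run_demo, process_main and bump are hypothetical. A deliberately unprotected shared counter is incremented only while the lock is held, so the final count comes out exact only if the lock kept every other thread and process out:

```python
import multiprocessing
import threading

# Assumption: a Unix-like system where the "fork" start method exists.
ctx = multiprocessing.get_context("fork")


def bump(lock, counter, rounds):
    """Increment the shared counter, holding the lock for each update."""
    for _ in range(rounds):
        with lock:
            current = counter.value
            counter.value = current + 1


def process_main(lock, counter, n_threads, rounds):
    """Body of each child process: run several threads sharing the lock."""
    threads = [
        threading.Thread(target=bump, args=(lock, counter, rounds))
        for _ in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


def run_demo(n_procs=3, n_threads=2, rounds=200):
    lock = ctx.Lock()
    # lock=False: the counter has no protection of its own, so a correct
    # total depends entirely on the explicit lock above.
    counter = ctx.Value("i", 0, lock=False)
    procs = [
        ctx.Process(target=process_main,
                    args=(lock, counter, n_threads, rounds))
        for _ in range(n_procs)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return counter.value


if __name__ == "__main__":
    print(run_demo())
```

The lock is created once in the parent and handed to every child, which is exactly the "same parent" restriction described above.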

Synchronizing unrelated processes

However, as stated in your question, you have unrelated processes. In that case, you need a named semaphore and Python does not provide those out of the box (although it actually uses named semaphores behind the scenes).

The posix_ipc library exposes those for you. It also seems to work on all relevant platforms.

answered Sep 21 '22 03:09 by dhke