Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does the output of my Python program look this (very odd) way?

I've worked with Python for a while, but I've never really done any concurrency in it before today. I stumbled upon this blog post and decided to make a similar (but simpler) example:

import os
import threading
import Queue

class Worker(threading.Thread):
    def __init__(self, queue, num):
        threading.Thread.__init__(self)
        self.queue = queue
        self.num = num

    def run(self):
        while True:
            text = self.queue.get()
            #print "{} :: {}".format(self.num, text)
            print "%s :: %s" % (self.num, text)
            self.queue.task_done()

nonsense = ["BLUBTOR", "more nonsense", "cookies taste good", "what is?!"]
queue = Queue.Queue()

for i in xrange(4):
    # Give the worker the queue and also its "number"
    t = Worker(queue, i)
    t.setDaemon(True)
    t.start()

for gibberish in nonsense:
    queue.put(gibberish)

queue.join()

It seems to work fine, but there seems to be some problem with the prints which I cannot figure out. A couple of test runs:

chris@DPC3:~/code/pythonthreading$ python owntest.py 
0 :: BLUBTOR
 1 :: more nonsense
3 :: cookies taste good
 2 :: what is?!
chris@DPC3:~/code/pythonthreading$ python owntest.py 
0 :: BLUBTOR
2 :: more nonsense
3 :: cookies taste good0 :: what is?!

chris@DPC3:~/code/pythonthreading$ python owntest.py 
2 :: BLUBTOR
 3 :: more nonsense1 :: cookies taste good

 2 :: what is?!
chris@DPC3:~/code/pythonthreading$

Why is the output formatted this oddly?

like image 858
cgt Avatar asked Dec 15 '22 15:12

cgt


2 Answers

print is not atomic.

The following line:

        print "%s :: %s" % (self.num, text)

get translated into the following bytecodes:

         24 LOAD_CONST               1 ('%s :: %s')
         27 LOAD_FAST                0 (self)
         30 LOAD_ATTR                3 (num)
         33 LOAD_FAST                1 (text)
         36 BUILD_TUPLE              2
         39 BINARY_MODULO       
         40 PRINT_ITEM          
         41 PRINT_NEWLINE       

As you can see, there are two printing bytecodes there (PRINT_ITEM and PRINT_NEWLINE). If the thread gets preempted between the two, you would see what you are seeing.

I agree with others that sys.stdout.write() is a safer bet for this use case since:

  1. it forces you to format the entire string before you write it (with print you can accidentally use print a, b, c, and end up with three separate writes instead of one);
  2. it sidesteps the issues of softspace and automatic newline, both of which can interact with print statements in other parts of the program.
like image 149
NPE Avatar answered Dec 21 '22 10:12

NPE


Printing is not threadsafe.

While some characters are being copied to the stdout stream by one thread, another thread is scheduled that is also printing and it gets to copy characters to stdout stream.

The result is that your stdout doesn't contain discreet results of print calls but mixed outputs from different threads, all jumbled together.

The work-around is to use sys.stdout.write() instead; that is an atomic (threadsafe) operation. Make sure you include an explicit \n newline.

like image 23
Martijn Pieters Avatar answered Dec 21 '22 09:12

Martijn Pieters