Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should time.perf_counter() be consistent across processes in Python on Windows?

UPDATE: A fix for this bug has been committed and will debut in Python 3.10, expected to be released Oct 2021. See the bug report for details.


The documentation for time.perf_counter() indicates that it is system-wide

time.perf_counter() → float

Return the value (in fractional seconds) of a performance counter, i.e. a clock with the highest available resolution to measure a short duration. It does include time elapsed during sleep and is system-wide. The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.

Am I incorrect in interpreting system-wide to include consistency across processes?

As shown below, it appears to be consistent on Linux, but not on Windows. In addition, Windows behavior with Python 3.6 is significantly different to 3.7.

I'd appreciate it if anyone can point out documentation or bug reports covering this behavior.

Test case

import concurrent.futures
import time

def worker():
    return time.perf_counter()

if __name__ == '__main__':
    pool = concurrent.futures.ProcessPoolExecutor()
    futures = []
    for i in range(3):
        print('Submitting worker {:d} at time.perf_counter() == {:.3f}'.format(i, time.perf_counter()))
        futures.append(pool.submit(worker))
        time.sleep(1)

    for i, f in enumerate(futures):
        print('Worker {:d} started at time.perf_counter() == {:.3f}'.format(i, f.result()))

Results on Windows 7

C:\...>Python36\python.exe -VV
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)]

C:\...>Python36\python.exe perf_counter_across_processes.py
Submitting worker 0 at time.perf_counter() == 0.000
Submitting worker 1 at time.perf_counter() == 1.169
Submitting worker 2 at time.perf_counter() == 2.170
Worker 0 started at time.perf_counter() == 0.000
Worker 1 started at time.perf_counter() == 0.533
Worker 2 started at time.perf_counter() == 0.000

C:\...>Python37\python.exe -VV
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]

C:\...>Python37\python.exe perf_counter_across_processes.py
Submitting worker 0 at time.perf_counter() == 0.376
Submitting worker 1 at time.perf_counter() == 1.527
Submitting worker 2 at time.perf_counter() == 2.529
Worker 0 started at time.perf_counter() == 0.380
Worker 1 started at time.perf_counter() == 0.956
Worker 2 started at time.perf_counter() == 1.963

I've omitted further results on Windows for brevity, but the same behavior was observed on Windows 8.1. In addition, Python 3.6.7 behaved the same as 3.6.8, while Python 3.7.1 behaved the same as 3.7.3.

Results on Ubuntu 18.04.1 LTS

$ python3 -VV
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0]

$ python3 perf_counter_across_processes.py 
Submitting worker 0 at time.perf_counter() == 2075.896
Submitting worker 1 at time.perf_counter() == 2076.900
Submitting worker 2 at time.perf_counter() == 2077.903
Worker 0 started at time.perf_counter() == 2075.900
Worker 1 started at time.perf_counter() == 2076.902
Worker 2 started at time.perf_counter() == 2077.905

$ python3.7 -VV
Python 3.7.1 (default, Oct 22 2018, 11:21:55) 
[GCC 8.2.0]

$ python3.7 perf_counter_across_processes.py 
Submitting worker 0 at time.perf_counter() == 1692.514
Submitting worker 1 at time.perf_counter() == 1693.518
Submitting worker 2 at time.perf_counter() == 1694.520
Worker 0 started at time.perf_counter() == 1692.517
Worker 1 started at time.perf_counter() == 1693.519
Worker 2 started at time.perf_counter() == 1694.522
like image 830
echo Avatar asked Jun 07 '19 23:06

echo


People also ask

What is time Perf_counter () python?

The perf_counter() function always returns the float value of time in seconds. Return the value (in fractional seconds) of a performance counter, i.e. a clock with the highest available resolution to measure a short duration. It does include time elapsed during sleep and is system-wide.

What is difference between time time() and time perf_ counter()?

time. process_time() has a tick rate that is about 4.67 times faster than time. perf_counter() , which has a tick rate that is about 33,491 times faster than time.


1 Answers

In Windows, time.perf_counter is based on WINAPI QueryPerformanceCounter. This counter is system wide. For more information, see acquiring high-resolution time stamps.

That said, perf_counter in Windows returns a value that's relative to the process startup value. Thus it is not a system-wide value. It does this in order to reduce precision loss when converting the integer value to a float, which has only 15 decimal digits of precision. Using a relative value is uncalled for in most cases, which only need microsecond precision. There should be an optional parameter to query the true QPC counter value, especially for perf_counter_ns in 3.7+.

Regarding the different initial values returned by perf_counter in 3.6 vs 3.7, the implementation has changed a bit over time. In 3.6.8, perf_counter is implemented in Modules/timemodule.c, so the initial value is stored when the time module is first imported and initialized, which is why you see the first result as 0.000 seconds. In more recent releases it's implemented separately in Python's C API. For example, see "Python/pytime.c" in the latest 3.8 beta release. In this case, by the time Python code calls time.perf_counter(), the counter has incremented well past the startup value.

Here's an alternative implementation based on ctypes that uses the system-wide QPC value instead of a relative value.

import sys

if sys.platform != 'win32':
    from time import perf_counter
    try:
        from time import perf_counter_ns
    except ImportError:
        def perf_counter_ns():
            """perf_counter_ns() -> int

            Performance counter for benchmarking as nanoseconds.
            """
            return int(perf_counter() * 10**9)
else:
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

    kernel32.QueryPerformanceFrequency.argtypes = (
        wintypes.PLARGE_INTEGER,) # lpFrequency

    kernel32.QueryPerformanceCounter.argtypes = (
        wintypes.PLARGE_INTEGER,) # lpPerformanceCount

    _qpc_frequency = wintypes.LARGE_INTEGER()
    if not kernel32.QueryPerformanceFrequency(ctypes.byref(_qpc_frequency)):
        raise ctypes.WinError(ctypes.get_last_error())
    _qpc_frequency = _qpc_frequency.value

    def perf_counter_ns():
        """perf_counter_ns() -> int

        Performance counter for benchmarking as nanoseconds.
        """
        count = wintypes.LARGE_INTEGER()
        if not kernel32.QueryPerformanceCounter(ctypes.byref(count)):
            raise ctypes.WinError(ctypes.get_last_error())
        return (count.value * 10**9) // _qpc_frequency

    def perf_counter():
        """perf_counter() -> float

        Performance counter for benchmarking.
        """
        count = wintypes.LARGE_INTEGER()
        if not kernel32.QueryPerformanceCounter(ctypes.byref(count)):
            raise ctypes.WinError(ctypes.get_last_error())
        return count.value / _qpc_frequency

QPC typically has a resolution of 0.1 microseconds. A float in CPython has 15 decimal digits of precision. So this implementation of perf_counter is within the QPC resolution for an uptime of about 3 years.

like image 181
Eryk Sun Avatar answered Sep 21 '22 01:09

Eryk Sun