
Why is ctypes so slow to convert a Python list to a C array?

The bottleneck of my code is currently a conversion from a Python list to a C array using ctypes, as described in this question.

A small experiment shows that it is indeed very slow compared with other Python operations:

import timeit
setup="from array import array; import ctypes; t = [i for i in range(1000000)];"
print(timeit.timeit(stmt='(ctypes.c_uint32 * len(t))(*t)',setup=setup,number=10))
print(timeit.timeit(stmt='array("I",t)',setup=setup,number=10))
print(timeit.timeit(stmt='set(t)',setup=setup,number=10))

This gives:

1.790962941000089
0.0911122129996329
0.3200237319997541

I obtained these results with CPython 3.4.2. I get similar times with CPython 2.7.9 and PyPy 2.4.0.

I tried running the above code under perf, commenting out the timeit calls so that only one runs at a time. I get these results:

ctypes

 Performance counter stats for 'python3 perf.py':

       1807,891637      task-clock (msec)         #    1,000 CPUs utilized          
                 8      context-switches          #    0,004 K/sec                  
                 0      cpu-migrations            #    0,000 K/sec                  
            59 523      page-faults               #    0,033 M/sec                  
     5 755 704 178      cycles                    #    3,184 GHz                    
    13 552 506 138      instructions              #    2,35  insn per cycle         
     3 217 289 822      branches                  # 1779,581 M/sec                  
           748 614      branch-misses             #    0,02% of all branches        

       1,808349671 seconds time elapsed

array

 Performance counter stats for 'python3 perf.py':

        144,678718      task-clock (msec)         #    0,998 CPUs utilized          
                 0      context-switches          #    0,000 K/sec                  
                 0      cpu-migrations            #    0,000 K/sec                  
            12 913      page-faults               #    0,089 M/sec                  
       458 284 661      cycles                    #    3,168 GHz                    
     1 253 747 066      instructions              #    2,74  insn per cycle         
       325 528 639      branches                  # 2250,011 M/sec                  
           708 280      branch-misses             #    0,22% of all branches        

       0,144966969 seconds time elapsed

set

 Performance counter stats for 'python3 perf.py':

        369,786395      task-clock (msec)         #    0,999 CPUs utilized          
                 0      context-switches          #    0,000 K/sec                  
                 0      cpu-migrations            #    0,000 K/sec                  
           108 584      page-faults               #    0,294 M/sec                  
     1 175 946 161      cycles                    #    3,180 GHz                    
     2 086 554 968      instructions              #    1,77  insn per cycle         
       422 531 402      branches                  # 1142,636 M/sec                  
           768 338      branch-misses             #    0,18% of all branches        

       0,370103043 seconds time elapsed

The ctypes version has fewer page faults than the set version and about the same number of branch misses as the other two. The only differences I can see are that it executes many more instructions and branches (though I still don't know why) and has more context switches (but that is surely a consequence of the longer run time rather than a cause).

I therefore have two questions:

  1. Why is ctypes so slow?
  2. Is there a way to improve performance, either with ctypes or with another library?
Asked by Tom Cornebize, Aug 30 '16 10:08


2 Answers

The solution is to build the data with the array module first, and then either cast the buffer's address to a ctypes pointer or use the from_buffer method:

import timeit
setup="from array import array; import ctypes; t = [i for i in range(1000000)];"
print(timeit.timeit(stmt="v = array('I',t);assert v.itemsize == 4; addr, count = v.buffer_info();p = ctypes.cast(addr,ctypes.POINTER(ctypes.c_uint32))",setup=setup,number=10))
print(timeit.timeit(stmt="v = array('I',t);a = (ctypes.c_uint32 * len(v)).from_buffer(v)",setup=setup,number=10))
print(timeit.timeit(stmt='(ctypes.c_uint32 * len(t))(*t)',setup=setup,number=10))
print(timeit.timeit(stmt='set(t)',setup=setup,number=10))

With Python 3, both approaches are then many times faster than the original constructor call:

$ python3 convert.py
0.08303386811167002
0.08139665238559246
1.5630637975409627
0.3013848252594471
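
For readability, here is a minimal sketch that unpacks the two one-line statements from the benchmark above into separate steps. It uses a shorter list, and the final writes are only there to illustrate that both the from_buffer array and the cast pointer refer to the same memory as the array object rather than a copy.

```python
import ctypes
from array import array

t = list(range(1000))

# Pack the Python list into a contiguous buffer of C unsigned ints.
v = array("I", t)
# The size of "I" is platform-dependent, so check that it matches c_uint32.
assert v.itemsize == ctypes.sizeof(ctypes.c_uint32)

# Option 1: a ctypes array that shares v's memory (no copy is made).
a = (ctypes.c_uint32 * len(v)).from_buffer(v)

# Option 2: a raw pointer to the same memory.
addr, count = v.buffer_info()
p = ctypes.cast(addr, ctypes.POINTER(ctypes.c_uint32))

# Writes through either view are visible in the original array object.
a[0] = 42
p[1] = 43
assert v[0] == 42 and v[1] == 43
```

Because nothing is copied, v has to stay alive for as long as a or p is in use. While the from_buffer view exists, CPython also refuses to resize v (for example, v.append raises BufferError).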
Answered by Daniel Lemire, Oct 11 '22 13:10


While this is not a definitive answer, the problem seems to be the constructor call with *t. Creating an empty array and filling it by slice assignment instead decreases the overhead significantly:

a = (ctypes.c_uint32 * len(t))()
a[:] = t

Test:

import timeit
setup="from array import array; import ctypes; t = [i for i in range(1000000)];"
print(timeit.timeit(stmt='(ctypes.c_uint32 * len(t))(*t)',setup=setup,number=10))
print(timeit.timeit(stmt='a = (ctypes.c_uint32 * len(t))(); a[:] = t',setup=setup,number=10))
print(timeit.timeit(stmt='array("I",t)',setup=setup,number=10))
print(timeit.timeit(stmt='set(t)',setup=setup,number=10))

Output:

1.7090932869978133
0.3084979929990368
0.08278547400186653
0.2775516299989249
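
As a minimal sketch, the slice-assignment approach can be wrapped in a small helper. The name list_to_c_array and its ctype parameter are hypothetical, chosen only for this illustration. The final check confirms that the result holds the same values as the array built with the *t constructor call.

```python
import ctypes

def list_to_c_array(values, ctype=ctypes.c_uint32):
    """Copy a Python list into a new ctypes array via slice assignment.

    Hypothetical helper for illustration; not part of ctypes.
    """
    c_array = (ctype * len(values))()  # zero-initialized array of the right length
    c_array[:] = values                # fill it without unpacking into arguments
    return c_array

t = list(range(1000))
via_slice = list_to_c_array(t)
via_star = (ctypes.c_uint32 * len(t))(*t)

# Both construction methods yield the same contents.
assert via_slice[:] == via_star[:] == t
```

Unlike the from_buffer approach, this makes a copy, so the resulting array is independent of the original list.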
Answered by code_onkel, Oct 11 '22 14:10