ctypes: why specify argtypes?

Tags: c++, python, ctypes

I want to call a C++ library from Python.

My C++ library code:

#include <stdio.h>

extern "C" {

int test_i(int i)
{
    return i+1;
}

}

My Python code:

from ctypes import *
libc = cdll.LoadLibrary("cpp/libtest.so")
libc.test_i.argtypes = [c_int]  # still works when this line is commented out
print(libc.test_i(1))

My question is: if I do not specify argtypes, it still works! Must I specify argtypes when calling a C/C++ function?

asked Dec 12 '22 05:12 by tidy


1 Answer

Yes, you should. It worked only because ctypes guessed correctly: with no argtypes, a Python integer is passed as a C int, which happens to match int test_i(int i). Replace int test_i(int i) with int test_i(char i), and the two sides no longer agree: the Python side passes 4 bytes, while the C side reads only 1. Depending on the platform and calling convention, this may go unnoticed for some time, corrupt the stack, crash immediately, or do nothing at all. Either way, relying on the guess is a code smell.
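
To make the guessing concrete, here is a minimal sketch that uses the system C library instead of the question's libtest.so, so that it runs without compiling anything. The choice of abs and strlen is mine, and loading libc this way assumes a POSIX system. It shows what ctypes does with and without argtypes:

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the Python process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# No argtypes: a Python int is passed as a C int, which matches abs(int).
guessed = libc.abs(-5)

# No argtypes: ctypes has no default conversion for a Python float.
try:
    libc.abs(-5.0)
    float_rejected = False
except ctypes.ArgumentError:
    float_rejected = True

# With argtypes, a wrong argument is rejected before the C code runs.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
length = libc.strlen(b"hello")
try:
    # Without argtypes, 42 would be passed as an int where a pointer is expected.
    libc.strlen(42)
    int_rejected = False
except ctypes.ArgumentError:
    int_rejected = True

print(guessed, float_rejected, length, int_rejected)
```

The same two assignments on libc.test_i would give the question's function this kind of checking.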

ctypes does basic type conversion, but that does not help in many cases, because Python's type system is much simpler than C's. For example, the Python side has no way to tell whether you are passing a 1-, 2-, 4- or 8-byte integer, because Python integers have no fixed size. Python has only double-precision floats, while C also has single-precision float and long double, and may have platform-specific types as well. Strings and bytes are properly separated only on the Python side, while on the C side working with Unicode is generally a mess. It is not clear whether you want wchar_t[] or char[], how you want to terminate your strings (or whether you need termination at all), and how you want to work around the immutability of Python strings. Without proper argtypes you cannot easily pass in structures and arrays either.
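
The distinctions above all have explicit ctypes spellings. The following sketch needs no shared library; the Point structure and the variable names are invented for illustration only:

```python
import ctypes

# Fixed-width integer types: the size is part of the type, not of the value.
sizes = [ctypes.sizeof(t) for t in
         (ctypes.c_int8, ctypes.c_int16, ctypes.c_int32, ctypes.c_int64)]

# ctypes does no overflow checking: 300 does not fit in 8 bits and wraps.
wrapped = ctypes.c_int8(300).value

# The same Python float stored as a C float and as a C double.
single = ctypes.c_float(0.1).value
double = ctypes.c_double(0.1).value

# char* versus wchar_t*, plus a mutable buffer the C side may write into.
narrow = ctypes.c_char_p(b"abc")
wide = ctypes.c_wchar_p("abc")
buf = ctypes.create_string_buffer(b"hello", 16)
buf.value = b"bye"

# A hypothetical struct and a fixed-length array, both usable in argtypes.
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

IntArray3 = ctypes.c_int * 3
arr = IntArray3(1, 2, 3)

print(sizes, wrapped, single == 0.1, double == 0.1, buf.value, list(arr))
```

None of these choices can be inferred from a bare Python value, which is why they have to be declared.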

Also, don't forget to specify restype:

libc.test_i.restype = c_int

By default ctypes assumes every function returns a C int, whatever size that is on your platform. Depending on the calling convention, the result may be returned in a CPU register; in that case the call itself does no harm, because it makes no difference whether the register is read or not, although a value wider than int (a 64-bit pointer, for example) is silently truncated. Otherwise the result is passed on the stack, and a mismatch means trouble: consuming too little or too much stack corrupts it, the function's RET pops the wrong address, and a SIGSEGV is the best case. In the worst case, attackers can exploit the corruption.
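
As an illustration of declaring non-int return types, here is a sketch using two standard C library functions, again assuming a POSIX system where libc can be loaded this way. atof returns a double and strchr returns a char*, so neither fits the default int:

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the Python process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# double atof(const char *): declare the double return type explicitly.
libc.atof.argtypes = [ctypes.c_char_p]
libc.atof.restype = ctypes.c_double
parsed = libc.atof(b"2.5")

# char *strchr(const char *, int): a pointer result, converted to bytes.
libc.strchr.argtypes = [ctypes.c_char_p, ctypes.c_int]
libc.strchr.restype = ctypes.c_char_p
tail = libc.strchr(b"hello", ord("l"))

print(parsed, tail)
```

With the declarations removed, both calls would hand back a plain Python int that has nothing to do with the real result.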

Generally, argtypes and restype are your friends: they protect your code from falling apart in unpredictable ways, and they are needed simply because C and Python differ so much. If they did not, nobody would make you declare them, because of the Zen of Python and "batteries included".

answered Jan 01 '23 12:01 by toriningen