I am writing a Rust project with several libraries. Some of the libraries export types that are consumed by other libraries in the workspace. In addition to the Rust crates, I would also like to expose some of the libraries to Python, using the pyo3 crate to generate Python bindings, and here is where I'm running into trouble.
The issue is as follows.
Suppose we have two Rust library crates, producer and consumer. In producer, we have a simple type, MyClass, that is publicly available and is made part of a Python module. In the consumer crate, I have a few functions that accept objects of type MyClass and perform some operations on them. Those functions are available in Rust, and also bound into a second Python module.
I can create objects of MyClass in both Python and Rust. I can correctly call the functions in Rust code (e.g., from another application) which accept objects of MyClass. But I cannot call the functions in the consumer module from Python which accept objects of type MyClass. In other words, while I can create objects of type MyClass in Rust or Python and use them in the Rust consumer crate, I cannot pass the object from the producer Python module to the consumer Python module. Doing so generates a TypeError, despite the object advertising itself as having type MyClass. Why?
EDIT: Please see the bottom of the question for further investigation.
I have made an MCVE, which is available from GitHub here. The Rust and Python code is also contained below.
After cloning the repo, you can generate the output I get with:
$ cargo build
$ python3 runme.py
You should see:
Object is of type: <class 'MyClass'>
isinstance(obj, MyClass): true
Could not convert object! PyErr { type: Py(0x10d79e5b0, PhantomData) }
Traceback (most recent call last):
  File "./runme.py", line 32, in <module>
    consumer.print_data(obj)
TypeError
/// producer.rs
use pyo3::prelude::*;

#[pyclass]
#[derive(Debug, Clone)]
pub struct MyClass {
    data: u64,
}

#[pymethods]
impl MyClass {
    #[new]
    fn new(data: u64) -> Self {
        MyClass { data }
    }

    pub fn get_data(&self) -> u64 {
        self.data
    }
}

#[pymodule]
fn producer(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_class::<MyClass>()?;
    Ok(())
}
/// consumer.rs
use pyo3::prelude::*;
use pyo3::wrap_pyfunction;
use producer::MyClass;

#[pyfunction]
fn print_data(cls: &MyClass) {
    println!("{}", cls.get_data());
}

#[pyfunction]
fn convert_to_myclass(obj: &PyAny) -> PyResult<()> {
    match obj.extract::<MyClass>() {
        Ok(_) => println!("Converted to MyClass successfully"),
        Err(err) => println!("Could not convert object! {:?}", err),
    }
    Ok(())
}

#[pyfunction]
fn print_type_info(obj: &PyAny) {
    let typ = obj.get_type();
    println!("Object is of type: {}", typ);
    println!("isinstance(obj, MyClass): {}", typ.is_instance(obj).unwrap());
}

#[pymodule]
fn consumer(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_wrapped(wrap_pyfunction!(print_data))?;
    m.add_wrapped(wrap_pyfunction!(print_type_info))?;
    m.add_wrapped(wrap_pyfunction!(convert_to_myclass))?;
    Ok(())
}
This small Python script demonstrates the issue. The first function is to ensure that the built crates can be imported by the script.
#!/usr/bin/env python3
"""runme.py
MCVE showing type weirdness in Python/PyO3.
(C) 2020 Benjamin Naecker
"""
import os
import platform


def link_libraries():
    names = ("libproducer", "libconsumer")
    lib_extension = ".so" if platform.system() == "Linux" else ".dylib"
    base_path = "./target/debug/"
    for name in names:
        source = os.path.join(base_path, f"{name}{lib_extension}")
        new_name = name.replace("lib", "")
        dest = f"./{new_name}.so"
        if os.path.exists(dest):
            os.remove(dest)
        os.symlink(source, dest)


if __name__ == "__main__":
    link_libraries()
    import producer
    import consumer
    obj = producer.MyClass(10)
    consumer.print_type_info(obj)
    consumer.convert_to_myclass(obj)
    consumer.print_data(obj)
I have been digging more into this, and I'm beginning to suspect that the issue somehow arises from the way Rust libraries are built. I'm familiar with libraries in general, but not so much with any Rust specifics. It seems, though, that Rust encodes a hash into every mangled symbol name. My current guess is that these hashes are slightly different between the consumer shared library and the producer, so that despite the type of MyClass having the same textual representation, the actual type expected in the consumer functions is slightly different.
Here are some details to make this concrete. Listing the symbols in each crate and then demangling them with rustfilt shows:
$ nm producer.so | grep -e "MyClass.*type_object" | rustfilt -h
0000000000085fa8 d _<producer::MyClass as pyo3::type_object::PyTypeInfo>::type_object_raw::TYPE_OBJECT::h215179c585bab4ba
0000000000021810 t _<producer::MyClass as pyo3::type_object::PyTypeInfo>::type_object_raw::h115c96004643f7df
$ nm consumer.so | grep -e "MyClass.*type_object" | rustfilt -h
0000000000091430 d _<producer::MyClass as pyo3::type_object::PyTypeInfo>::type_object_raw::TYPE_OBJECT::h215179c585bab4ba
0000000000004260 t _<producer::MyClass as pyo3::type_object::PyTypeInfo>::type_object_raw::h0e4c5c91a2345444
0000000000027a70 t _<producer::MyClass as pyo3::type_object::PyTypeInfo>::type_object_raw::h115c96004643f7df
You can see that there is one additional type_object_raw in the symbols for the consumer crate. I'm not sure how to verify this, but I suspect that this is the type information used to convert the object passed to the function that fails in the consumer crate. This type object, though having the same name, must differ in some way, since the hash is different.
Looking at the pyo3 docs, the method type_object_raw is used to return the actual PyTypeObject that represents the type of an object. It seems plausible to me that when constructing an instance of MyClass from the producer module, the type object is returned from the symbol type_object_raw::h115c96004643f7df. But when the functions like consumer::print_data try to convert the passed instance of MyClass, they use the symbol type_object_raw::h0e4c5c91a2345444 to get the type of the object. Presumably these are different.
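The suspected mechanism can be imitated in pure Python. The sketch below is only an analogy and does not use pyo3: the hypothetical helper make_extension_module stands in for one compiled shared library that carries its own copy of MyClass, and each call builds a fresh type object with the same name. An instance created by one copy is then rejected by the type check of the other copy, even though the type names are identical.

```python
def make_extension_module():
    """Hypothetical stand-in for one shared library with its own MyClass."""

    def init(self, data):
        self.data = data

    # Each call creates a brand-new type object named "MyClass",
    # much like each library initializing its own type object.
    my_class = type("MyClass", (), {"__init__": init})

    def print_data(obj):
        # The check is against this copy's type object, not against the name.
        if not isinstance(obj, my_class):
            raise TypeError(f"expected MyClass, got {type(obj).__name__}")
        return obj.data

    return my_class, print_data


# Two independent "libraries", each with its own MyClass type object.
producer_cls, _ = make_extension_module()
_, consumer_print_data = make_extension_module()

obj = producer_cls(10)
same_name = type(obj).__name__ == "MyClass"

try:
    consumer_print_data(obj)
    rejected = False
    message = ""
except TypeError as err:
    rejected = True
    message = str(err)
```

Here same_name is true while the call is still rejected, which mirrors the confusing TypeError above: type identity in Python is decided by the type object itself, not by its name.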
So now my question is, why are there two different symbols for returning the type of an instance of MyClass?
I had a similar issue, in which two symbols with different type info were generated for the same pyclass. In my case, I made the pyclass module a standalone crate and marked it as dylib, then referenced it from the other crates. This makes sure your pyclass is only compiled ONCE.
Rust's compilation model can compile the same library multiple times, in different translation units. Every time the pyclass is compiled, it generates a different Python type (with the same name), and it becomes really confusing when pyo3 complains that your PyABC object can't be converted to a PyABC object!
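The idea behind this fix can also be sketched in pure Python, again only as an analogy and not as pyo3 code. The hypothetical function shared_my_class plays the role of the standalone crate that is built exactly once, and the hypothetical producer_new and consumer_print_data both obtain their type object from it, so they agree on what MyClass is.

```python
import functools


@functools.lru_cache(maxsize=None)
def shared_my_class():
    """Hypothetical stand-in for a pyclass crate that is built only once."""

    class MyClass:
        def __init__(self, data):
            self.data = data

    return MyClass


def producer_new(data):
    # The "producer" side constructs instances of the single shared type.
    return shared_my_class()(data)


def consumer_print_data(obj):
    # The "consumer" side checks against the very same type object.
    if not isinstance(obj, shared_my_class()):
        raise TypeError("expected MyClass")
    return obj.data


obj = producer_new(10)
result = consumer_print_data(obj)
```

Because the class body runs only once, both sides see one type object and the check succeeds.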