I'm trying to get a C string returned by a C library and convert it to a Rust string via FFI.
mylib.c
const char* hello(){ return "Hello World!"; }
main.rs
#![feature(link_args)]

extern crate libc;

use libc::c_char;

#[link_args = "-L . -I . -lmylib"]
extern {
    fn hello() -> *const c_char;
}

fn main() {
    // how do I get a str representation of hello() here?
}
Converting any type to a String is as simple as implementing the ToString trait for the type. Rather than doing so directly, you should implement the fmt::Display trait, which automatically provides ToString and also allows printing the type, as discussed in the section on print!.
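A minimal sketch of that pattern follows. The Circle type is hypothetical and exists only for illustration; the point is that to_string() becomes available once fmt::Display is implemented.

```rust
use std::fmt;

// Hypothetical type used only for illustration.
struct Circle {
    radius: i32,
}

impl fmt::Display for Circle {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Circle of radius {}", self.radius)
    }
}

fn main() {
    let circle = Circle { radius: 6 };
    // to_string() is provided by the standard library's blanket
    // implementation of ToString for every type that implements Display.
    let text: String = circle.to_string();
    println!("{}", text);
}
```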
Here are a few different conversions between types. &str to &[u8]: let my_string: &str = "some string"; let my_bytes: &[u8] = my_string.as_bytes();
Put simply, String is an owned, growable string whose data is stored on the heap (just like Vec). &str is a string slice: a borrowed view into UTF-8 data that lives somewhere else, such as inside a String on the heap or in a string literal embedded in the program binary. Creating a &str performs no allocation at runtime.
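The difference can be sketched roughly as follows. The first_word helper is a hypothetical name introduced for this example. Note that a &str does not have to point into a String: a string literal is a &str that refers to data stored in the program binary.

```rust
// Returns a borrowed view of the first word; nothing is allocated here.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    // String owns a growable buffer on the heap.
    let mut owned: String = String::from("hello");
    owned.push_str(" world");

    // This &str borrows from the String's heap buffer.
    let from_owned: &str = first_word(&owned);

    // This &str points at a literal stored in the program binary.
    let literal: &'static str = "static text";

    println!("{} / {}", from_owned, literal);
}
```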
Most C APIs require that the string being passed to them is null-terminated, and by default Rust's string types are not null-terminated. The other problem with translating Rust strings to C strings is that Rust strings can validly contain a null byte in the middle of the string (U+0000 is a valid Unicode code point).
The best way to work with C strings in Rust is to use structures from the std::ffi module, namely CStr and CString.
CStr is a dynamically sized type and so it can only be used through a pointer. This makes it very similar to the regular str type. You can construct a &CStr from *const c_char using the unsafe CStr::from_ptr associated function. This function is unsafe because there is no guarantee that the raw pointer you pass to it is valid, that it really does point to a valid C string, and that the string's lifetime is correct.
You can get a &str from a &CStr using its to_str() method, which returns an error if the bytes are not valid UTF-8.
Here is an example:
extern crate libc;

use libc::c_char;
use std::ffi::CStr;

extern {
    fn hello() -> *const c_char;
}

fn main() {
    let c_buf: *const c_char = unsafe { hello() };
    let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
    let str_slice: &str = c_str.to_str().unwrap();
    let str_buf: String = str_slice.to_owned(); // if necessary
}
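If the C function may return a null pointer or bytes that are not valid UTF-8, a more defensive variant could look like the sketch below. The helper name c_to_string is an assumption made for this example, and a static byte string stands in for the pointer that a real C library would return, so that the snippet runs without linking any C code.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper: copies a C string into an owned String.
/// Returns None for a null pointer; invalid UTF-8 sequences are
/// replaced with U+FFFD instead of causing a panic.
///
/// # Safety
/// `ptr` must be null or point to a NUL-terminated string that stays
/// valid for the duration of the call.
unsafe fn c_to_string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: the caller guarantees that `ptr` is a valid C string.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    Some(c_str.to_string_lossy().into_owned())
}

fn main() {
    // Stand-in for a pointer returned by C code.
    let from_c = b"Hello World!\0".as_ptr() as *const c_char;
    let text = unsafe { c_to_string(from_c) };
    println!("{:?}", text);

    // A null pointer is reported as None rather than dereferenced.
    let missing = unsafe { c_to_string(std::ptr::null()) };
    println!("{:?}", missing);
}
```

Because the result is an owned String, it no longer depends on the lifetime of the C buffer once the call returns.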
You need to take into account the lifetime of your *const c_char pointers and who owns them. Depending on the C API, you may need to call a special deallocation function on the string. You need to carefully arrange conversions so the slices won't outlive the pointer. The fact that CStr::from_ptr returns a &CStr with arbitrary lifetime helps here (though it is dangerous by itself); for example, you can encapsulate your C string into a structure and provide a Deref conversion so you can use your struct as if it were a string slice:
extern crate libc;

use libc::c_char;
use std::ops::Deref;
use std::ffi::CStr;

extern "C" {
    fn hello() -> *const c_char;
    fn goodbye(s: *const c_char);
}

struct Greeting {
    message: *const c_char,
}

impl Drop for Greeting {
    fn drop(&mut self) {
        unsafe {
            goodbye(self.message);
        }
    }
}

impl Greeting {
    fn new() -> Greeting {
        Greeting { message: unsafe { hello() } }
    }
}

impl Deref for Greeting {
    type Target = str;

    fn deref<'a>(&'a self) -> &'a str {
        let c_str = unsafe { CStr::from_ptr(self.message) };
        c_str.to_str().unwrap()
    }
}
There is also another type in this module called CString. It has the same relationship with CStr as String has with str: CString is an owned version of CStr. This means that it "holds" the handle to the allocation of the byte data, and dropping CString would free the memory it provides (essentially, CString wraps Vec<u8>, and it's the latter that will be dropped). Consequently, it is useful when you want to expose the data allocated in Rust as a C string.
Unfortunately, C strings always end with the zero byte and can't contain one inside them, while Rust &[u8]/Vec<u8> are exactly the opposite: they do not end with a zero byte and can contain arbitrary numbers of them inside. This means that going from Vec<u8> to CString is neither error-free nor allocation-free: the CString constructor both checks for zeros inside the data you provide, returning an error if it finds some, and appends a zero byte to the end of the byte vector, which may require its reallocation.
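Both behaviours can be observed directly with the standard library. The sketch below uses only std::ffi::CString and its NulError; the byte values are arbitrary sample data.

```rust
use std::ffi::CString;

fn main() {
    // No interior zero byte: construction succeeds and a terminating
    // zero byte is appended to the stored data.
    let ok = CString::new(vec![b'a', b'b', b'c']).expect("no interior zero byte");
    assert_eq!(ok.as_bytes(), b"abc");
    assert_eq!(ok.as_bytes_with_nul(), b"abc\0");

    // Interior zero byte: the constructor returns a NulError that
    // records where the first zero byte was found.
    let err = CString::new(vec![b'a', 0, b'b']).unwrap_err();
    assert_eq!(err.nul_position(), 1);

    // The original bytes are not lost; they can be taken back out.
    assert_eq!(err.into_vec(), vec![b'a', 0, b'b']);

    println!("all checks passed");
}
```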
Like String, which implements Deref<Target = str>, CString implements Deref<Target = CStr>, so you can call methods defined on CStr directly on CString. This is important because the as_ptr() method that returns the *const c_char necessary for C interoperation is defined on CStr. You can call this method directly on CString values, which is convenient.
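A short sketch of this, with the caveat that c_style_len is a hypothetical stand-in written in Rust for a C function that reads a string, so that the example runs without linking any C code:

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Stand-in for a C function that reads a string: returns the number
/// of bytes before the terminating zero byte.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string.
unsafe fn c_style_len(ptr: *const c_char) -> usize {
    // SAFETY: guaranteed by the caller.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_bytes().len()
}

fn main() {
    let owned = CString::new("hello").expect("no interior zero byte");

    // to_bytes() is defined on CStr but callable on CString via Deref.
    let bytes: &[u8] = owned.to_bytes();
    println!("{} bytes", bytes.len());

    // `owned` must stay alive for as long as the pointer is in use.
    let ptr: *const c_char = owned.as_ptr();
    let len = unsafe { c_style_len(ptr) };
    println!("callee saw {} bytes", len);

    // Pitfall: `CString::new("x").unwrap().as_ptr()` yields a dangling
    // pointer, because the temporary CString is dropped at the end of
    // that statement.
}
```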
CString can be created from everything which can be converted to Vec<u8>. String, &str, Vec<u8> and &[u8] are valid arguments for the constructor function, CString::new(). Naturally, if you pass a byte slice or a string slice, a new allocation will be created, while Vec<u8> or String will be consumed.
extern crate libc;

use libc::c_char;
use std::ffi::CString;

fn main() {
    let c_str_1 = CString::new("hello").unwrap(); // from a &str, creates a new allocation
    let c_str_2 = CString::new(b"world" as &[u8]).unwrap(); // from a &[u8], creates a new allocation
    let data: Vec<u8> = b"12345678".to_vec(); // from a Vec<u8>, consumes it
    let c_str_3 = CString::new(data).unwrap();

    // and now you can obtain a pointer to a valid zero-terminated string
    // make sure you don't use it after c_str_2 is dropped
    let c_ptr: *const c_char = c_str_2.as_ptr();

    // the following will print an error message because the source data
    // contains zero bytes
    let data: Vec<u8> = vec![1, 2, 3, 0, 4, 5, 0, 6];
    match CString::new(data) {
        Ok(c_str_4) => println!("Got a C string: {:p}", c_str_4.as_ptr()),
        Err(e) => println!("Error getting a C string: {}", e),
    }
}
If you need to transfer ownership of the CString to C code, you can call CString::into_raw. You are then required to get the pointer back and free it in Rust; C code must not pass it to free, because the Rust allocator is not guaranteed to be the same as the allocator used by malloc and free. All you need to do is call CString::from_raw and then allow the string to be dropped normally.
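A rough sketch of that round trip is shown below. The function names make_greeting and free_greeting are hypothetical; in a real library they would additionally be exported under unmangled names so that C code can call them. Here main plays the role of the C caller.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical constructor: hands ownership of a Rust-allocated
/// string to the caller as a raw pointer.
pub extern "C" fn make_greeting() -> *mut c_char {
    CString::new("Hello from Rust")
        .expect("no interior zero byte")
        .into_raw()
}

/// Hypothetical destructor: takes the pointer back and frees it with
/// the Rust allocator.
///
/// # Safety
/// `ptr` must be null or a pointer obtained from `make_greeting` that
/// has not been freed yet.
pub unsafe extern "C" fn free_greeting(ptr: *mut c_char) {
    if ptr.is_null() {
        return;
    }
    // SAFETY: the pointer came from CString::into_raw; rebuilding the
    // CString and dropping it releases the allocation.
    drop(unsafe { CString::from_raw(ptr) });
}

fn main() {
    let raw = make_greeting();

    // The caller may read the string while it holds the pointer.
    let seen = unsafe { CStr::from_ptr(raw) }.to_string_lossy().into_owned();
    println!("{}", seen);

    // The pointer goes back to Rust to be freed, never to free().
    unsafe { free_greeting(raw) };
}
```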
In addition to what @vladimir-matveev has said, you can also convert between them without the aid of CStr or CString:
#![feature(link_args)]

extern crate libc;

use libc::{c_char, puts, strlen};
use std::{slice, str};

#[link_args = "-L . -I . -lmylib"]
extern "C" {
    fn hello() -> *const c_char;
}

fn main() {
    // converting a C string into a Rust string:
    let s = unsafe {
        let c_s = hello();
        str::from_utf8_unchecked(slice::from_raw_parts(c_s as *const u8, strlen(c_s)+1))
    };
    println!("s == {:?}", s);

    // and back:
    unsafe {
        puts(s.as_ptr() as *const c_char);
    }
}
Just make sure that when converting from a &str to a C string, your &str ends with '\0'. Notice that in the code above I use strlen(c_s)+1 instead of strlen(c_s), so s is "Hello World!\0", not just "Hello World!".
(Of course, in this particular case it works even with just strlen(c_s), because s still points into the original C buffer, which is followed by its terminating zero byte. But with a fresh &str you couldn't guarantee that the resulting C string would terminate where expected.)
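For comparison, here is a sketch of the same manual approach that validates UTF-8 instead of assuming it and does not depend on the libc crate. The helpers manual_strlen and borrow_c_str are hypothetical names introduced for this example, and a Rust string literal that carries its own '\0' stands in for the C buffer.

```rust
use std::os::raw::c_char;
use std::slice;

/// Counts the bytes before the terminating zero byte, like C's strlen.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string.
unsafe fn manual_strlen(ptr: *const c_char) -> usize {
    let mut len = 0usize;
    loop {
        // SAFETY: the caller guarantees a terminator exists, so every
        // byte read here lies inside the string.
        let byte = unsafe { *ptr.add(len) };
        if byte == 0 {
            break;
        }
        len += 1;
    }
    len
}

/// Borrows a C string as &str without CStr, checking that it is UTF-8.
/// The returned slice excludes the terminating zero byte.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string that outlives 'a.
unsafe fn borrow_c_str<'a>(ptr: *const c_char) -> Result<&'a str, std::str::Utf8Error> {
    let len = unsafe { manual_strlen(ptr) };
    let bytes = unsafe { slice::from_raw_parts(ptr as *const u8, len) };
    std::str::from_utf8(bytes)
}

fn main() {
    // A Rust string that ends with '\0' can be handed to C as is.
    let with_nul: &str = "Hello World!\0";
    let ptr = with_nul.as_ptr() as *const c_char;

    let back = unsafe { borrow_c_str(ptr) }.expect("valid UTF-8");
    println!("{:?}", back);
}
```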
Here's the result of running the code:
s == "Hello World!\u{0}"
Hello World!