I found a lot of information on US-ANSI strings for a Rust DLL implementation in C#, but this does not solve any issues for UTF-8 encoded strings.
For example, "Brötchen", once called in C#, results in "BrÃ¶tchen".
Rust
use std::os::raw::c_char;
use std::ffi::CString;

#[no_mangle]
pub extern fn string_test() -> *mut c_char {
    let c_to_print = CString::new("Brötchen")
        .expect("CString::new failed!");
    let r = c_to_print;
    r.into_raw()
}
C#
[DllImport(@"C:\Users\User\source\repos\testlib\target\debug\testlib.dll")]
private static extern IntPtr string_test();

public static void run()
{
    var s = string_test();
    var res = Marshal.PtrToStringAnsi(s);
    // var res = Marshal.PtrToStringUni(s);
    // var res = Marshal.PtrToStringAuto(s);
    // The two commented-out variants result in: ????n
    Console.WriteLine(res); // prints "BrÃ¶tchen", expected "Brötchen"
}
How do I get the desired result?
I do not think this is a duplicate of "How can I transform string to UTF-8 in C#?", because its answers produce the same result as Marshal.PtrToStringAuto(s) and Marshal.PtrToStringUni(s).
The String type is provided in Rust's standard library rather than coded into the core language and is a growable, mutable, owned, UTF-8 encoded string type.
Rust's character and string types are designed around Unicode. A String is not a sequence of ASCII chars; it is a sequence of Unicode characters. A Rust char is a 32-bit value holding a Unicode scalar value.
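To make the difference between a char in memory and its UTF-8 encoding concrete, here is a minimal sketch. The helper name `describe` is made up for this illustration; everything else is from the standard library.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (code point, length of the UTF-8 encoding in bytes).
fn describe(c: char) -> (u32, usize) {
    (c as u32, c.len_utf8())
}

fn main() {
    // A char always occupies 4 bytes in memory, whatever the character is.
    println!("size_of::<char>() = {}", size_of::<char>());

    // Inside a String the same characters take 1 to 4 bytes each.
    for c in ['B', 'ö'] {
        let (code, len) = describe(c);
        println!("{:?}: U+{:04X}, {} byte(s) in UTF-8", c, code, len);
    }
}
```

So 'ö' is one char, but two bytes once it is stored in a String or sent across an FFI boundary.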
A String is stored as a vector of bytes (Vec<u8>), but is guaranteed to always be a valid UTF-8 sequence. String is heap allocated, growable and not null terminated. &str is a slice (&[u8]) that always points to a valid UTF-8 sequence, and can be used to view into a String, just like &[T] is a view into Vec<T>.
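The byte layout described above can be inspected directly. The sketch below prints the UTF-8 bytes of "Brötchen" and the NUL-terminated form that CString produces for C callers. `c_bytes` is a name invented for this example.

```rust
use std::ffi::CString;

/// Hypothetical helper: the UTF-8 bytes of `s` plus the trailing NUL
/// that CString appends so that C code can find the end of the string.
fn c_bytes(s: &str) -> Vec<u8> {
    CString::new(s)
        .expect("input must not contain interior NUL bytes")
        .as_bytes_with_nul()
        .to_vec()
}

fn main() {
    let word = String::from("Brötchen");

    // 8 characters but 9 bytes: 'ö' is encoded as 0xC3 0xB6.
    println!("chars = {}, bytes = {}", word.chars().count(), word.len());
    println!("{:02X?}", word.as_bytes());

    // The C view carries one extra 0x00 byte at the end.
    println!("{:02X?}", c_bytes(&word));
}
```

Whatever reads the pointer on the C# side receives exactly these bytes, so it has to decode them as UTF-8.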
The answer is to use Marshal.PtrToStringUTF8 (available in .NET Core and .NET 5+, but not in .NET Framework). Simplified:
use std::ffi::CString;
use std::os::raw::c_char;

#[no_mangle]
pub extern "C" fn string_test() -> *mut c_char {
    let s = CString::new("Brötchen").expect("CString::new failed!");
    s.into_raw()
}
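Then, in C#:
[DllImport(RUSTLIB)]
static extern IntPtr string_test();
//...
var encodeText = string_test();
// Decode the NUL-terminated bytes as UTF-8 instead of the ANSI code page.
var text = Marshal.PtrToStringUTF8(encodeText);
Console.WriteLine("Decode String : {0}", text);
// Note: the pointer came from CString::into_raw, so it must be handed
// back to Rust to be freed; this simplified example never does that.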
Thanks to @E_net4's comment recommending to read the Rust FFI Omnibus, I came to an answer that is rather complicated but works.
I figured that I had to rewrite the classes I am using. Furthermore, I am using the libc library and CString.
Cargo.toml
[package]
name = "testlib"
version = "0.1.0"
authors = ["John Doe <[email protected]>"]
edition = "2018"
[lib]
crate-type = ["cdylib"]
[dependencies]
libc = "0.2.48"
src/lib.rs
extern crate libc;

use libc::c_char;
use std::ffi::{CStr, CString};

// Takes a foreign C# string as input and converts it to a Rust String.
fn mkstr(s: *const c_char) -> String {
    let c_str = unsafe {
        assert!(!s.is_null());
        CStr::from_ptr(s)
    };
    let r_str = c_str.to_str()
        .expect("Could not successfully convert string from foreign code!");
    String::from(r_str)
}

// Frees a string previously returned by `result`; takes the string pointer as input.
#[no_mangle]
pub extern "C" fn free_string(s: *mut c_char) {
    if s.is_null() { return }
    // Retake ownership of the CString so that it is dropped (and freed) here.
    unsafe { drop(CString::from_raw(s)) };
}

// Takes the foreign C# string as input, converts it to a Rust String,
// and returns it as a raw CString that the caller must pass to `free_string`.
#[no_mangle]
pub extern "C" fn result(istr: *const c_char) -> *mut c_char {
    let s = mkstr(istr);
    let cex = CString::new(s)
        .expect("Failed to create CString!");
    cex.into_raw()
}
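The ownership hand-off in the library above can be exercised without C# at all. The following sketch plays the part of the foreign caller in plain Rust. `echo`, `release` and `round_trip` are stand-in names invented for this example (they mirror `result` and `free_string`), and it uses `std::os::raw::c_char` so that it needs no external crate.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Stand-in for `result`: copies the incoming NUL-terminated UTF-8 string
/// and transfers ownership of the copy to the caller.
fn echo(input: *const c_char) -> *mut c_char {
    assert!(!input.is_null());
    let text = unsafe { CStr::from_ptr(input) }
        .to_str()
        .expect("input is not valid UTF-8");
    CString::new(text)
        .expect("unexpected interior NUL byte")
        .into_raw()
}

/// Stand-in for `free_string`: retakes ownership so the allocation is freed.
fn release(s: *mut c_char) {
    if s.is_null() {
        return;
    }
    unsafe { drop(CString::from_raw(s)) };
}

/// Simulates the foreign caller: pass a string in, copy the answer out,
/// then give the pointer back to be freed.
fn round_trip(word: &str) -> String {
    let input = CString::new(word).expect("unexpected interior NUL byte");
    let raw = echo(input.as_ptr());
    let copy = unsafe { CStr::from_ptr(raw) }
        .to_str()
        .expect("returned string is not valid UTF-8")
        .to_owned();
    release(raw);
    copy
}

fn main() {
    println!("{}", round_trip("Brötchen"));
}
```

The point of the pattern is that the pointer returned by `into_raw` is only ever freed by `from_raw` on the Rust side, never by the caller's own allocator.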
C# Class
using System;
using System.Text;
using System.Runtime.InteropServices;

namespace Testclass
{
    internal class Native
    {
        [DllImport("testlib.dll")]
        internal static extern void free_string(IntPtr str);

        // Marshal the argument as UTF-8. The default string marshalling is ANSI,
        // which on Windows uses the system code page and would not reach Rust as valid UTF-8.
        [DllImport("testlib.dll")]
        internal static extern StringHandle result([MarshalAs(UnmanagedType.LPUTF8Str)] string inputstr);
    }

    // Owns the pointer returned by Rust and frees it through free_string.
    internal class StringHandle : SafeHandle
    {
        public StringHandle() : base(IntPtr.Zero, true) { }

        public override bool IsInvalid
        {
            get { return false; }
        }

        public string AsString()
        {
            // Find the terminating NUL byte, then decode the bytes as UTF-8.
            int len = 0;
            while (Marshal.ReadByte(handle, len) != 0) { ++len; }
            byte[] buffer = new byte[len];
            Marshal.Copy(handle, buffer, 0, buffer.Length);
            return Encoding.UTF8.GetString(buffer);
        }

        protected override bool ReleaseHandle()
        {
            Native.free_string(handle);
            return true;
        }
    }

    internal class StringTesting : IDisposable
    {
        private StringHandle str;
        private string resString;

        public StringTesting(string word)
        {
            str = Native.result(word);
        }

        public override string ToString()
        {
            if (resString == null)
            {
                resString = str.AsString();
            }
            return resString;
        }

        public void Dispose()
        {
            str.Dispose();
        }
    }

    class Testclass
    {
        // A member cannot have the same name as its enclosing type, hence Run.
        public static string Run(string inputstr)
        {
            using (var testing = new StringTesting(inputstr))
            {
                return testing.ToString();
            }
        }

        public static void Main()
        {
            Console.WriteLine(Run("Brötchen")); // output: Brötchen
        }
    }
}
While this achieves the desired result, I am still unsure what causes the wrong decoding in the code provided in the question.
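A likely explanation, offered as a sketch rather than a confirmed diagnosis: Marshal.PtrToStringAnsi decodes the bytes with the system ANSI code page, so each UTF-8 byte becomes its own character. The example below reproduces the effect in Rust, assuming the code page is Windows-1252, which agrees with Latin-1 for the two bytes involved here (0xC3 and 0xB6). `decode_latin1` is a name made up for this illustration.

```rust
/// Hypothetical helper: decodes bytes as ISO-8859-1 (Latin-1), where every
/// byte maps to the Unicode code point with the same number.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // The bytes Rust actually hands over: 'ö' is 0xC3 0xB6 in UTF-8.
    let utf8 = "Brötchen".as_bytes();

    // Wrong decoder: every byte becomes one character.
    println!("as Latin-1: {}", decode_latin1(utf8));

    // Right decoder: the two bytes are read as one character again.
    println!("as UTF-8:   {}", std::str::from_utf8(utf8).unwrap());
}
```

Under that assumption the first line prints the same garbled text as the question, which is why decoding the buffer explicitly as UTF-8 fixes it.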