
When is it appropriate to mark a trait as unsafe, as opposed to marking all the functions in the trait as unsafe?


Saying the same thing in code, when would I pick either of the following examples?

unsafe trait MyCoolTrait {
    fn method(&self) -> u8;
}

trait MyCoolTrait {
    unsafe fn method(&self) -> u8;
}

The opt-in builtin traits (OIBIT) RFC states:

An unsafe trait is a trait that is unsafe to implement, because it represents some kind of trusted assertion. Note that unsafe traits are perfectly safe to use. Send and Share (note: now called Sync) are examples of unsafe traits: implementing these traits is effectively an assertion that your type is safe for threading.

There's another example of an unsafe trait in the standard library, Searcher. It says:

The trait is marked unsafe because the indices returned by the next() methods are required to lie on valid utf8 boundaries in the haystack. This enables consumers of this trait to slice the haystack without additional runtime checks.

Unfortunately, neither of these paragraphs really helps me understand when it is correct to mark the entire trait unsafe rather than some or all of its methods.

I've asked about marking a function as unsafe before, but this seems different.

Shepmaster asked Jul 25 '15 16:07



2 Answers

A function is marked unsafe to indicate that it is possible to violate memory safety by calling it. A trait is marked unsafe to indicate that it is possible to violate memory safety by implementing it incorrectly. This is commonly because the trait has invariants that other unsafe code relies on being upheld, and these invariants cannot be expressed any other way.

In the case of Searcher, the methods themselves should be safe to call. That is, users should not have to worry about whether or not they're using a Searcher correctly; the interface contract says all calls are safe. There's nothing you can do that will cause the methods to violate memory safety.

However, unsafe code will be calling the methods of a Searcher, and such unsafe code will be relying on a given Searcher implementation to return offsets that are on valid UTF-8 code point boundaries. If this assumption is violated, then the unsafe code could end up causing a memory safety violation itself.

To put it another way: the correctness of unsafe code using Searchers depends on every single Searcher implementation also being correct. Or: implementing this trait incorrectly allows safe code to induce a memory safety violation in unrelated unsafe code.

So why not just mark the methods unsafe? Because they aren't unsafe at all! They don't do anything that could violate memory safety in and of themselves. next_match just scans for and returns an Option<(usize, usize)>. The danger only exists when unsafe code assumes that these usizes are valid indices into the string being searched.

So why not just check the result? Because that'd be slower. The searching code wants to be fast, which means it wants to avoid redundant checks. But the requirement that makes those checks redundant can't be expressed in the Searcher interface... so instead, the whole trait is flagged as unsafe to warn anyone implementing it that there are extra conditions, not stated or enforced in the code, that must be respected.
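To make that concrete, here is a minimal sketch of the same pattern. SplitPoint, FirstSpace and head are hypothetical names invented for illustration; this is not the real Searcher API, only the same idea of safe methods whose results unsafe code trusts without re-checking.

```rust
/// Hypothetical trait, loosely modeled on the idea behind `Searcher`.
///
/// # Safety
/// Implementors must return an index that lies on a UTF-8 char boundary
/// of `haystack` (and is therefore `<= haystack.len()`).
unsafe trait SplitPoint {
    // The method itself is safe: calling it cannot break anything.
    fn split_point(&self, haystack: &str) -> usize;
}

/// Splits at the first space, or at the end if there is none.
struct FirstSpace;

// SAFETY: `str::find` returns the byte index where a match starts, which is
// a char boundary, and `haystack.len()` is a char boundary as well.
unsafe impl SplitPoint for FirstSpace {
    fn split_point(&self, haystack: &str) -> usize {
        haystack.find(' ').unwrap_or(haystack.len())
    }
}

/// Safe to call, yet it skips the boundary check by trusting the contract.
fn head<'a, S: SplitPoint>(s: &S, haystack: &'a str) -> &'a str {
    let idx = s.split_point(haystack);
    // SAFETY: the `unsafe trait` contract promises `idx` is a valid boundary.
    unsafe { haystack.get_unchecked(..idx) }
}

fn main() {
    println!("{}", head(&FirstSpace, "héllo wörld"));
}
```

If SplitPoint were an ordinary trait, someone could write a fully safe impl returning an index in the middle of a multi-byte character, and head would produce an invalid &str with no unsafe anywhere in the offending code. Requiring unsafe impl puts the responsibility where the mistake can actually happen.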

There's also Send and Sync: implementing those when you shouldn't violates the expectations of (among other things) code that has to deal with threads. The code that lets you create threads is safe, but only so long as Send and Sync are only implemented on types for which they're appropriate.
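A small sketch of the Send case follows. OwnedBox and double_on_other_thread are hypothetical names made up for this example; the point is only that thread::spawn stays safe to call because the unsafe impl is where the promise is made.

```rust
use std::thread;

/// Hypothetical wrapper around a raw pointer. Raw pointers are neither
/// `Send` nor `Sync`, so the compiler will not implement them automatically.
struct OwnedBox(*mut u32);

// SAFETY (our assertion, not something the compiler checks): `OwnedBox`
// uniquely owns its heap allocation, so moving it to another thread is fine.
unsafe impl Send for OwnedBox {}

impl OwnedBox {
    fn new(value: u32) -> Self {
        OwnedBox(Box::into_raw(Box::new(value)))
    }

    fn get(&self) -> u32 {
        // SAFETY: the pointer came from `Box::into_raw` and is freed only in `drop`.
        unsafe { *self.0 }
    }
}

impl Drop for OwnedBox {
    fn drop(&mut self) {
        // SAFETY: reclaims the allocation created in `new`, exactly once.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

/// No `unsafe` here: `thread::spawn` only requires that the closure is `Send`.
fn double_on_other_thread(b: OwnedBox) -> u32 {
    thread::spawn(move || b.get() * 2).join().unwrap()
}

fn main() {
    println!("{}", double_on_other_thread(OwnedBox::new(21)));
}
```

Without the unsafe impl line, the call to thread::spawn is rejected at compile time. With it, the compiler takes the implementor's word, which is exactly why the trait has to be unsafe to implement.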

DK. answered Oct 07 '22 02:10



The rule of thumb is:

  • Use unsafe fn method() if callers of the method need to wrap the call in an unsafe block.

  • Use unsafe trait MyTrait if implementors of the trait need to write unsafe impl MyTrait.
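The two rules can be put side by side in a short sketch. RawRead, ExactLen and last_byte are hypothetical names chosen for illustration, and the safety contracts in the doc comments are assumptions of this example.

```rust
/// Hypothetical trait with an unsafe method: the *caller* carries the burden.
trait RawRead {
    /// # Safety
    /// `index` must be less than the length of the underlying data.
    unsafe fn read_unchecked(&self, index: usize) -> u8;
}

// A plain `impl` is enough; the implementor promises nothing extra.
impl RawRead for Vec<u8> {
    unsafe fn read_unchecked(&self, index: usize) -> u8 {
        // SAFETY: the caller promised `index < self.len()`.
        unsafe { *self.get_unchecked(index) }
    }
}

/// Hypothetical unsafe trait: the *implementor* carries the burden.
///
/// # Safety
/// `len_hint()` must never exceed `bytes().len()`.
unsafe trait ExactLen {
    fn bytes(&self) -> &[u8];
    fn len_hint(&self) -> usize;
}

// SAFETY: `len_hint` returns exactly the length of the slice from `bytes`.
unsafe impl ExactLen for Vec<u8> {
    fn bytes(&self) -> &[u8] {
        self
    }
    fn len_hint(&self) -> usize {
        self.len()
    }
}

/// Calls the trait's methods without `unsafe`, then relies on the contract.
fn last_byte<T: ExactLen>(t: &T) -> Option<u8> {
    let n = t.len_hint();
    if n == 0 {
        return None;
    }
    // SAFETY: the `ExactLen` contract guarantees `n - 1 < bytes().len()`.
    Some(unsafe { *t.bytes().get_unchecked(n - 1) })
}

fn main() {
    let data = vec![10u8, 20, 30];
    // Calling an unsafe method requires an unsafe block at the call site.
    let second = unsafe { data.read_unchecked(1) };
    // Calling code built on an unsafe trait requires nothing special.
    let last = last_byte(&data);
    println!("{second} {last:?}");
}
```

In the first half the word unsafe shows up at the call site; in the second half it shows up at the impl. That placement marks who has to read the safety documentation and uphold it.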

unsafe is a signal to the Rust user: this code must be written carefully. The key point is that unsafe comes in pairs: when an author declares a trait or function as unsafe, the implementor or caller must implement or call it with unsafe.

When a function is marked as unsafe, it means the caller needs to use the function carefully. The function's author is making assumptions that the caller must uphold.

When a trait is marked as unsafe, it means the implementor needs to implement it carefully. The trait requires the implementor to uphold certain assumptions. But users of the unsafe trait can insouciantly call the methods defined in the trait.

For a concrete example, unsafe trait Searcher requires every Searcher implementation to return indices on valid UTF-8 boundaries from next. Every implementation is therefore written as unsafe impl Searcher, indicating that the implementor vouches for that requirement. But a user of Searcher can call searcher.next() without wrapping it in an unsafe block.

Herrington Darkholme answered Oct 07 '22 00:10
