Is it idiomatic rust to accept arguments that `impl Borrow<T>` to abstract over references and values of T? [closed]

Tags:

rust

I find myself writing functions that accept arguments as Borrow<T> so that they take both values and references transparently.

Example:

use std::borrow::Borrow;

#[derive(Debug, Clone, Copy)]
struct Point {
    pub x: i32,
    pub y: i32,
}

pub fn manhattan<T, U>(p1: T, p2: U) -> i32
where
    T: Borrow<Point>,
    U: Borrow<Point>,
{
    let p1 = p1.borrow();
    let p2 = p2.borrow();
    (p1.x - p2.x).abs() + (p1.y - p2.y).abs()
}

This can be useful when implementing std::ops traits like Add, which would otherwise require a lot of repetition to support references transparently.
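
As a rough sketch of that idea (not code from the original post), two generic impls could cover all four combinations of Point and &Point operands. The PartialEq derive and the type parameter name R are assumptions added for illustration:

```rust
use std::borrow::Borrow;
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Covers `Point + Point` and `Point + &Point`.
impl<R: Borrow<Point>> Add<R> for Point {
    type Output = Point;

    fn add(self, rhs: R) -> Point {
        // The annotation pins the Borrow target to Point.
        let rhs: &Point = rhs.borrow();
        Point { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

// Covers `&Point + Point` and `&Point + &Point`.
impl<R: Borrow<Point>> Add<R> for &Point {
    type Output = Point;

    fn add(self, rhs: R) -> Point {
        let rhs: &Point = rhs.borrow();
        Point { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}
```

Without the Borrow bound, each of the four operand combinations would need its own impl block.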

Is this idiomatic? Are there drawbacks?

asked Nov 06 '22 00:11 by Penz


1 Answer

I think there are two parts to this question.

1. Is the Borrow trait the idiomatic way to abstract over ownership in Rust?

Yes. If what you intend is to write a function that either takes a Foo or a &Foo, F: Borrow<Foo> is the right bound to use. AsRef, on the other hand, is usually only implemented for things that are reference-like, and not for owned values.
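
A minimal sketch of the difference, assuming a hypothetical Foo with a single id field (the function names are also made up). Borrow works for Foo and &Foo out of the box because of the standard library's blanket impls, whereas AsRef<Foo> only works here because of the manual impl:

```rust
use std::borrow::Borrow;

// Hypothetical type, for illustration only.
pub struct Foo {
    pub id: u32,
}

// Accepts `Foo` or `&Foo`: the standard library provides
// `impl<T> Borrow<T> for T` and `impl<T> Borrow<T> for &T`.
pub fn id_of<F: Borrow<Foo>>(foo: F) -> u32 {
    let foo_ref: &Foo = foo.borrow();
    foo_ref.id
}

// There is no blanket `impl<T> AsRef<T> for T`, so `Foo: AsRef<Foo>`
// holds only because of this manual impl. `&Foo` then qualifies through
// the standard library's forwarding impl of AsRef for references.
impl AsRef<Foo> for Foo {
    fn as_ref(&self) -> &Foo {
        self
    }
}

pub fn id_of_as_ref<F: AsRef<Foo>>(foo: F) -> u32 {
    let foo_ref: &Foo = foo.as_ref();
    foo_ref.id
}
```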

2. Is it idiomatic in Rust to abstract over ownership at all?

Sometimes. This is an interesting question because there is a subtle but important distinction between a function like manhattan and how Borrow is idiomatically used.

In Rust, whether a function needs to own its arguments or merely borrow them is an important part of the function's interface. Rustaceans, as a rule, don't mind writing & in a function call because it's a syntactic marker of a relevant semantic fact about the function being called. A function that can accept either Point or &Point is no more generally useful than the one that can accept only &Point: if you have a Point, all you have to do is borrow it. So it's idiomatic to use the simpler signature that most accurately documents the type the function really needs: &Point.

But wait! There are other differences between those ways of accepting arguments. One difference is call overhead: a &Point will generally be passed in a single pointer-sized register, while a Point may be passed in multiple registers or on the stack, depending on the ABI. Another difference is code size: each unique instantiation of <T: Borrow<Point>> represents a monomorphization of the function, which bloats the binary. A third difference is drop order: if Point has a destructor, a function that accepts T: Borrow<Point> and is given an owned Point will run Point::drop internally, before it returns, while a function that accepts &Point will leave the object in place for the caller to deal with. Whether this is good or bad depends on the context; for performance, though, it's usually irrelevant (if you assume the Point will eventually be dropped anyway).
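
To make the drop-order point concrete, here is a minimal sketch. The Noisy type, the DROPS counter, and both function names are hypothetical and exist only to make the destructor observable:

```rust
use std::borrow::Borrow;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a Noisy value has been dropped (illustration only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

pub struct Noisy(pub i32);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Generic version: if the caller passes an owned Noisy, it is moved in
// and dropped when `n` goes out of scope at the end of this function.
pub fn read_generic<T: Borrow<Noisy>>(n: T) -> i32 {
    let inner: &Noisy = n.borrow();
    inner.0
}

// Reference version: the value always stays with the caller.
pub fn read_ref(n: &Noisy) -> i32 {
    n.0
}
```

Calling read_generic(&value) or read_ref(&value) leaves the counter unchanged; only read_generic(value) increments it before returning.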

A function accepting T: Borrow<Point> suggests that it's doing something with T internally for which a mere &Point might be suboptimal. Drop order is probably the best reason for doing this (I wrote more about this in this answer, although the puts function I used as an example isn't a particularly strong one).

In the case of manhattan, drop order is irrelevant, because Point is Copy (a Copy type cannot implement Drop, so it has no drop glue). So there is no performance advantage from accepting Point as well as &Point (and although a single function isn't likely to make much difference one way or another, if generics are used pervasively, the cost to code size may well be a disadvantage).

There is one more reason to avoid using generics unnecessarily: they interfere with type inference and can decrease the quality of error messages and suggestions from the compiler. For instance, imagine if Point only implemented Clone (not Copy) and you wrote manhattan(p, q) and then used p again later in the same function. The compiler would reject the later use because p was moved into the function, and would suggest adding a .clone(). In fact, the better solution is to borrow p, and if manhattan takes references the compiler will enforce that you do just that.
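
A small sketch of that scenario, assuming a Clone-only Point and a hypothetical perimeter helper that reuses each point. Because manhattan takes references here, the borrows are explicit and every point stays usable:

```rust
// Clone but deliberately not Copy, as in the scenario above.
#[derive(Debug, Clone)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

pub fn manhattan(p1: &Point, p2: &Point) -> i32 {
    (p1.x - p2.x).abs() + (p1.y - p2.y).abs()
}

// Hypothetical caller that needs each point twice.
pub fn perimeter(p: Point, q: Point, r: Point) -> i32 {
    // With a `T: Borrow<Point>` signature, `manhattan(p, q)` would also
    // compile, but it would move `p` and `q`, and the later uses would
    // then be rejected with error E0382.
    manhattan(&p, &q) + manhattan(&q, &r) + manhattan(&r, &p)
}
```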

The fact that Point is small (so the overhead of using it as a function argument is probably minimal) and Copy (so has no drop glue to worry about) raises another question: should manhattan simply accept Point and not use references at all? This is an opinion-based question and really it comes down to which better fits your mental model. Either accept &Point, and use & when a caller has an owned value, or accept Point, and use * when a caller has a reference - there is no hard-and-fast rule.
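
For comparison, a sketch of the by-value alternative, with a hypothetical by_value_demo function showing what call sites might look like:

```rust
#[derive(Debug, Clone, Copy)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Takes both points by value; cheap because Point is small and Copy.
pub fn manhattan(p1: Point, p2: Point) -> i32 {
    (p1.x - p2.x).abs() + (p1.y - p2.y).abs()
}

// Hypothetical call sites, for illustration only.
pub fn by_value_demo() -> (i32, i32) {
    let a = Point { x: 1, y: 2 };
    let b = Point { x: 4, y: 6 };
    let b_ref = &b;

    let first = manhattan(a, b);
    // `a` is still usable because it was copied, not moved,
    // and a reference is dereferenced explicitly with `*`.
    let second = manhattan(a, *b_ref);
    (first, second)
}
```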

What is an appropriate use of Borrow, then?

The argument above strongly depends on the fact that references are easy to take anywhere, so you may as well take them concretely in the caller as abstractly inside the generic function. One time this is not the case is when the borrowed-or-owned type is not passed directly to the function, but wrapped in another generic data structure. Consider sorting a slice of Point-like things by their distance from (0, 0):

fn sort_by_radius<T: Borrow<Point>>(points: &mut [T]) {
    points.sort_by_key(|p| {
        let Point { x, y } = p.borrow();
        x * x + y * y
    });
}

Here, a caller with a &mut [Point] cannot simply borrow it to get a &mut [&Point]. Yet we would like sort_by_radius to be able to accept both kinds of slices (without writing two functions), so Borrow<Point> comes to the rescue. The difference between sort_by_radius and your version of manhattan is that T is not being passed directly to the function to be immediately borrowed, but is a part of the type that sort_by_radius needs to treat like a Point in order to perform a task ultimately unrelated to borrowing (sorting a slice).
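
As a usage sketch, the same function can sort a Vec<Point> and a Vec<&Point>. The sort_both_kinds function, the PartialEq derive, and the sample coordinates are assumptions added for illustration, and the closure uses an explicit &Point annotation instead of destructuring:

```rust
use std::borrow::Borrow;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

pub fn sort_by_radius<T: Borrow<Point>>(points: &mut [T]) {
    points.sort_by_key(|p| {
        let point: &Point = p.borrow();
        // Squared distance from (0, 0); ordering by it matches ordering by radius.
        point.x * point.x + point.y * point.y
    });
}

// Hypothetical demo: sorts owned points and references with one function.
pub fn sort_both_kinds() -> (Vec<Point>, Vec<Point>) {
    let mut owned = vec![
        Point { x: 3, y: 4 },
        Point { x: 1, y: 0 },
        Point { x: -2, y: 2 },
    ];
    let backing = owned.clone();
    let mut refs: Vec<&Point> = backing.iter().collect();

    sort_by_radius(&mut owned); // T = Point
    sort_by_radius(&mut refs); // T = &Point

    (owned, refs.into_iter().copied().collect())
}
```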

answered Nov 25 '22 12:11 by trent