
What costs are incurred when using Cell<T> as opposed to just T?

Tags: rust

I ran across a comment on reddit that indicates that using Cell<T> prevents certain optimizations from occurring:

Cell works with no memory overhead (Cell is the same size as T) and little runtime overhead (it "just" inhibits optimisations, it doesn't introduce extra explicit operations)

This seems counter to other things I've read about Cell<T>, in particular that it's "zero-cost." The first place I encountered this categorization is here.

With all that said, I'd like to understand the actual cost of using Cell<T>, including whatever optimizations it may prevent.

w.brian asked Jan 02 '23 04:01


1 Answer

TL;DR: Cell is a Zero-Overhead Abstraction; that is, the same functionality implemented manually has the same cost.


The term Zero-Cost Abstractions is not English, it's jargon. The idea of Zero-Cost Abstractions is that the layer of abstraction itself does not add any cost compared to manually doing the same thing.
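To make "manually doing the same thing" concrete, here is a minimal sketch of a hand-rolled cell built directly on UnsafeCell. The name MyCell and its three methods are hypothetical, chosen for illustration only; this is not the standard library's source, just the kind of code one would write without the abstraction.

```rust
use std::cell::UnsafeCell;

/// Hypothetical hand-written stand-in for `Cell<T>` (illustration only).
#[repr(transparent)]
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    /// Copies the value out through a shared reference.
    fn get(&self) -> T {
        // SAFETY: `MyCell` is not `Sync` (because `UnsafeCell` is not), and
        // no reference to the interior is ever handed out, so no other
        // access can overlap with this read.
        unsafe { *self.value.get() }
    }

    /// Overwrites the value through a shared reference.
    fn set(&self, value: T) {
        // SAFETY: same reasoning as in `get`.
        unsafe { *self.value.get() = value }
    }
}

fn main() {
    let cell = MyCell::new(1_u32);
    cell.set(cell.get() + 1);
    assert_eq!(cell.get(), 2);
}
```

A manual version like this pays the same price as Cell<T> itself: it must go through UnsafeCell to be sound, so it copies values in and out and gives up the same guarantees.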

There are various misunderstandings that have sprung up: most notably, I have regularly seen zero-cost understood as "the operation is free", which is not the case.

To add to the confusion, the exception mechanism used by most C++ implementations, which Rust also uses for panic = unwind, is called Zero-Cost Exceptions and purports¹ to add no overhead on the non-throwing path. It's a different kind of Zero-Cost...

Lately, my recommendation is to switch to using the term Zero-Overhead Abstractions: first because it's a distinct term from Zero-Cost Exceptions, so less likely to be mistaken, and second because it emphasizes that the Abstraction does not add Overhead, which is what we are trying to convey in the first place.

¹ The objective is only partially achieved. While the same assembly executed with and without the possibility of throwing indeed has the same performance, the presence of potential exceptions may hinder the optimizer and cause it to generate sub-optimal assembly in the first place.


With all that said, I'd like to understand the actual cost of using Cell<T>, including whatever optimizations it may prevent.

On the memory side, there is no overhead:

  • size_of::<Cell<T>>() == size_of::<T>(),
  • given a cell of type Cell<T>, &cell and cell.as_ptr() point to the same address.

(You can peek at the source code)
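Both memory claims can be checked with a few assertions. The sketch below uses only std::mem::size_of and Cell::as_ptr; the helper same_address is a hypothetical name introduced for this example.

```rust
use std::cell::Cell;
use std::mem::size_of;

/// Hypothetical helper: true when the cell and its contents share an address.
fn same_address<T>(cell: &Cell<T>) -> bool {
    let cell_addr = cell as *const Cell<T> as usize;
    let value_addr = cell.as_ptr() as usize;
    cell_addr == value_addr
}

fn main() {
    // Wrapping a type in `Cell` does not change its size.
    assert_eq!(size_of::<Cell<u64>>(), size_of::<u64>());
    assert_eq!(size_of::<Cell<[u8; 3]>>(), size_of::<[u8; 3]>());

    // The cell starts exactly where its contents start.
    let cell = Cell::new(42_u64);
    assert!(same_address(&cell));
}
```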

On the access side, Cell<T> does incur a run-time cost compared to T: the cost of the extra functionality.

The most immediate cost is that manipulating the value through a &Cell<T> requires copying it back and forth¹. This is a bitwise copy, so the optimizer may elide it, if it can prove that it is safe to do so.
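As a rough illustration of that copying, the sketch below increments a counter in two ways: through a shared reference with get and set, and through an exclusive reference with get_mut, which modifies the value in place. The function names are hypothetical.

```rust
use std::cell::Cell;

/// Through `&Cell<T>`: copy the value out, change it, copy it back in.
fn increment_shared(counter: &Cell<u32>) {
    let current = counter.get(); // copy out
    counter.set(current + 1); // copy back in
}

/// Through `&mut Cell<T>`: `get_mut` gives direct access, no copies needed.
fn increment_exclusive(counter: &mut Cell<u32>) {
    *counter.get_mut() += 1;
}

fn main() {
    let mut counter = Cell::new(0);
    increment_shared(&counter);
    increment_exclusive(&mut counter);
    assert_eq!(counter.get(), 2);
}
```

For a small Copy type such as u32, the optimizer will typically turn both versions into the same in-place update; the copies matter more for large values.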

Another notable cost is that UnsafeCell<T>, on which Cell<T> is based, breaks the rule that &T means that T cannot be modified.

When a compiler can prove that a portion of memory cannot be modified, it can optimize out further reads: read t.foo into a register, then use the register value rather than reading t.foo again.

In traditional Rust code, a &T gives such a guarantee: no matter if there are opaque function calls, calls to C code, etc... between two reads of t.foo, the second read will return the same value as the first, guaranteed. With a &Cell<T>, there is no such guarantee any longer, and thus unless the optimizer can prove beyond doubt that the value is unmodified², it cannot apply such optimizations.
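The difference is visible at the language level, not only in the generated assembly. In the sketch below (function names are hypothetical), a callback stands in for the opaque call between two reads: with &u32 the second read must equal the first, while with &Cell<u32> the callback can legitimately change the value, so the second read has to be performed.

```rust
use std::cell::Cell;

/// With `&u32`, nothing `opaque` does can change `*value`, so the
/// compiler is free to reuse the first read for the second one.
fn read_twice_plain(value: &u32, opaque: impl Fn()) -> (u32, u32) {
    let first = *value;
    opaque();
    let second = *value;
    (first, second)
}

/// With `&Cell<u32>`, `opaque` may hold another `&Cell<u32>` to the same
/// cell and call `set`, so the value must be read again after the call.
fn read_twice_cell(value: &Cell<u32>, opaque: impl Fn()) -> (u32, u32) {
    let first = value.get();
    opaque();
    let second = value.get();
    (first, second)
}

fn main() {
    let plain = 1;
    assert_eq!(read_twice_plain(&plain, || {}), (1, 1));

    let cell = Cell::new(1);
    // The closure mutates the cell through a second shared reference.
    assert_eq!(read_twice_cell(&cell, || cell.set(2)), (1, 2));
}
```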

¹ You can manipulate the value at no cost through &mut Cell<T> or using unsafe code.

² For example, if the optimizer knows that the value resides on the stack, and that its address was never passed to anyone else, then it can reasonably conclude that no one else can modify the value. A stack-smashing attack still could, of course.

Matthieu M. answered Jan 05 '23 14:01