I'm coming from C and I wonder whether Rust's bounds checking affects performance. It probably needs some additional assembly instructions for every access, which could hurt when processing lots of data.
On the other hand, memory access is usually the expensive part of processor performance, so a few extra arithmetic instructions might not hurt. Then again, once a cache line is loaded, sequential access should be very fast, so the extra checks could matter there.
Has somebody benchmarked this?
The obvious advantage of array bounds checking approaches is that they completely eliminate buffer overflow vulnerabilities. However, they are also the most expensive solution, particularly for pointer- and array-intensive programs, since every pointer and array operation must be checked.
In other words, since the question is about "safety features": yes, Rust does perform run-time bounds checking in safe code, unless the compiler is able to prove that the access is in bounds.
For comparison, C++'s std::array provides similar benefits over built-in arrays: it does not decay to a pointer, it knows its own size, it provides bounds checking (via at()), and it supports standard container operations.
In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as an array index is within the bounds of the array (index checking).
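In Rust terms, index checking looks like this (a minimal sketch; the array and values are just illustrative):

```rust
fn main() {
    let data = [10, 20, 30];

    // Checked access that reports failure through an Option
    // instead of aborting the program.
    assert_eq!(data.get(1), Some(&20));
    assert_eq!(data.get(5), None);

    // Plain indexing performs the same bounds check, but panics
    // if the index is out of range.
    let idx = 2;
    assert_eq!(data[idx], 30);
}
```

The `get` form makes the check explicit in the types; the `data[idx]` form is what the question is about, since the check happens invisibly on every access.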
Unfortunately, the cost of a bounds check is not straightforward to estimate. It's certainly not "one cycle per check", or any similarly easy-to-guess figure. It will have a nonzero impact, but that impact may be insignificant.
In theory, it would be possible to measure the cost of bounds checking on basic types like Vec by modifying Rust to disable them and running a large-scale ecosystem test. That would give some kind of rule of thumb, but without doing it, it's quite hard to know whether the overhead will be closer to ten percent or a tenth of a percent.
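Short of patching the compiler, one way to get a rough feel for the cost in your own code is to compare safe indexing against the unsafe unchecked variant. This is only a sketch (the function names are mine, and Instant-based timings are noisy; a real measurement should use a benchmark harness such as criterion), but it shows the shape of the comparison:

```rust
use std::time::Instant;

fn sum_checked(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..v.len() {
        total += v[i]; // bounds-checked (though often optimised away here)
    }
    total
}

fn sum_unchecked(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..v.len() {
        // SAFETY: i is always < v.len() by the loop bound.
        total += unsafe { *v.get_unchecked(i) };
    }
    total
}

fn main() {
    let v: Vec<u64> = (0..1_000_000).collect();

    let t = Instant::now();
    let a = sum_checked(&v);
    let checked = t.elapsed();

    let t = Instant::now();
    let b = sum_unchecked(&v);
    let unchecked = t.elapsed();

    assert_eq!(a, b);
    println!("checked: {:?}, unchecked: {:?}", checked, unchecked);
}
```

Don't read too much into a single run of something like this; as the rest of the answer explains, the result depends heavily on what else the surrounding code is doing.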
There are some ways you can do better than timing and guessing, though. These rules of thumb apply mostly to desktop-class hardware; lower end hardware or something that targets a different niche will have different characteristics.
If your indices are derived from the container size, there is a good chance that the compiler will be able to eliminate the bounds checks entirely. At that point the only cost of bounds checking in a release build is that it occasionally gets in the way of other optimizations, which could hurt, but normally doesn't.
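A minimal sketch of that case (the function names are mine): when the loop bound is the container's own length, the optimiser can usually prove every index in range and drop the checks, and the idiomatic iterator form never materialises an index at all.

```rust
// Indexed loop: the bound is v.len(), so each check is provably
// redundant and is normally eliminated in a release build.
fn scale_indexed(v: &mut [f64], factor: f64) {
    for i in 0..v.len() {
        v[i] *= factor;
    }
}

// Iterator loop: no index exists, so there is nothing to check.
fn scale_with_iter(v: &mut [f64], factor: f64) {
    for x in v.iter_mut() {
        *x *= factor;
    }
}

fn main() {
    let mut v = vec![1.0, 2.0, 3.0];
    scale_indexed(&mut v, 2.0);
    assert_eq!(v, vec![2.0, 4.0, 6.0]);
    scale_with_iter(&mut v, 0.5);
    assert_eq!(v, vec![1.0, 2.0, 3.0]);
}
```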
If your code is branchy, memory-access-heavy, or otherwise hard to optimise, and the bounds to check are easy to access, there is a good chance that bounds checking will happen mostly in the CPU's spare bandwidth, with branch prediction helping out specifically, in which case the overall cost will be particularly small, especially compared to the cost of the rest of the code.
If your bounds to check are behind several layers of pointers, it is plausible that you will hit issues with memory latency, and will suffer correspondingly. However, it is also plausible that the CPU's speculation and prediction machinery will manage to hide this; it is very context-dependent. If you are taking references to the data inside, rather than dereferencing it at the same time as the bounds check, this risk is magnified.
If your bounds checks are in a tight arithmetic loop that doesn't saturate the core, you aren't likely to hurt throughput directly except by impeding other compiler optimisations. However, impeding other compiler optimisations can be arbitrarily bad, anywhere from no difference to preventing SIMD and causing a factor-10 slowdown.
If your bounds checks are in a tight arithmetic loop that does saturate the core, you take on the above risk and have a direct execution penalty of roughly half a cycle per bounds check.
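One common way to recover the lost optimisations in a tight loop is to hoist a single assertion before the loop so the compiler can prove all the per-iteration checks at once. This is a sketch of the technique (the function is mine, and whether SIMD is actually restored depends on the target and compiler version):

```rust
// A dot product over two slices. Without the assert, the compiler
// must keep a check on b[i], since b might be shorter than a, and
// that check can block autovectorisation. The single up-front
// assertion makes every per-iteration check provably redundant.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len()); // one check instead of one per element
    let mut total = 0.0;
    for i in 0..a.len() {
        total += a[i] * b[i];
    }
    total
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    assert_eq!(dot(&a, &b), 32.0);
}
```

The trade-off is that the function now panics on mismatched lengths instead of silently stopping at the shorter slice, which is usually what you wanted anyway.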
If your code is large enough to stress the instruction cache, then you need to worry about the impact on code size. This is normally modest, but its runtime impact is particularly hard to measure.
Peter Cordes adds some further points in comments. First, bounds checks imply loads and stores, so you're going to be running a mixed load, which is most likely to bottleneck on issue/rename. Second, even correctly predicted branches executed in parallel take resources from the predictor, which can cause other branches to be predicted worse.
This might seem intimidating, and it is. That is why it's important to measure and understand your performance at the level that is relevant for you and your code.
It is also the case that since Rust was "born" with bounds checking, the language has developed means to reduce their cost, such as pervasive zero-cost references, iterators (which absorb bounds checks into loop control, rather than actually removing them), and an unusually rich set of utility functions. If you do find yourself hitting a pathological case, Rust also offers unsafe escape hatches.