
Why are functions like memchr bound to C implementations rather than being written in pure Rust?

Tags: c, rust

Functions like memchr seem quite simple, but Rust projects bind to the C implementation and keep only a Rust fallback, rather than implementing them fully in Rust. Can't memchr be implemented efficiently in Rust?

Peter Hall asked Sep 29 '16 08:09


3 Answers

They can. If you look at glibc's implementation, it will look somewhat like fallback::memchr. However, that's only part of the story. The generic implementation is used only when a more appropriate one isn't available.

For example, x86-64 has a variant written in assembly. So do many other architectures that provide sophisticated instructions.

So to reach the same speed, Rust would have to provide something similar at the instruction level, which essentially boils down to the same (or better) assembly. At that point you're just duplicating work. It's already there, so there is no need to create everything anew.
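To make "generic implementation" a bit more concrete, here is a minimal sketch of the word-at-a-time idea that portable fallbacks typically rely on. It is not the code from glibc or from any Rust crate; the names `memchr_swar` and `contains_zero_byte` are made up for illustration, and the sketch favors readability over speed (it copies each 8-byte chunk instead of doing aligned loads).

```rust
// Constants with the lowest / highest bit of every byte set.
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

/// Returns true if at least one of the eight bytes in `x` is zero.
fn contains_zero_byte(x: u64) -> bool {
    (x.wrapping_sub(LO) & !x & HI) != 0
}

/// Hypothetical word-at-a-time search: returns the index of the first
/// occurrence of `needle` in `haystack`, or `None` if it is absent.
pub fn memchr_swar(needle: u8, haystack: &[u8]) -> Option<usize> {
    // Broadcast the needle into every byte of a 64-bit word.
    let pattern = LO * needle as u64;
    let mut chunks = haystack.chunks_exact(8);
    let mut offset = 0;

    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        let word = u64::from_ne_bytes(buf);
        // XOR turns every byte equal to the needle into zero, so one
        // test rules out eight bytes at once.
        if contains_zero_byte(word ^ pattern) {
            if let Some(i) = chunk.iter().position(|&b| b == needle) {
                return Some(offset + i);
            }
        }
        offset += 8;
    }

    // Fewer than eight bytes are left: check them one by one.
    chunks
        .remainder()
        .iter()
        .position(|&b| b == needle)
        .map(|i| offset + i)
}
```

Calling `memchr_swar(b'z', b"the lazy dog")` scans one full 8-byte word and then the 4-byte tail. The hand-written assembly variants go further by using vector instructions to compare many more bytes per step, which is the part that is hard to match from portable code.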

Zeta answered Nov 04 '22 23:11


There is an assumption here that the only reason to link to a C library is for efficiency.

I am afraid that you are forgetting convenience here. Just because a function could be implemented as efficiently in Rust as in C (possibly leveraging unsafe code and assembly) does not mean that it is convenient to do so.

Rather than attempt to produce an optimized implementation for each and every platform under the sun, it is simply more convenient to fall back to the already provided C function to start with, and then gradually tune for the platforms you care about if necessary.

Producing an implementation that is tailored specifically to the hardware/OS is a lengthy job, and if someone has already poured in the effort it might make sense to just reuse their result!
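As a rough illustration of that approach, the sketch below calls the C library's memchr where one can be assumed to exist and uses a plain Rust loop elsewhere. The wrapper name `find_byte`, the helper `memchr_fallback`, and the choice of `cfg(unix)` as the switch are assumptions made for this example, not the layout of any particular crate. It also assumes a toolchain recent enough to accept `unsafe extern` blocks.

```rust
#[cfg(unix)]
use std::os::raw::{c_int, c_void};

// Declaration of the platform C library's memchr. On Unix targets the
// Rust standard library already links libc, so no extra crate is needed.
#[cfg(unix)]
unsafe extern "C" {
    fn memchr(s: *const c_void, c: c_int, n: usize) -> *const c_void;
}

/// Portable pure-Rust fallback (hypothetical name).
pub fn memchr_fallback(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Hypothetical wrapper: delegates to the C function on Unix.
#[cfg(unix)]
pub fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    if haystack.is_empty() {
        return None;
    }
    // SAFETY: the pointer and length describe a valid, live slice.
    let p = unsafe {
        memchr(
            haystack.as_ptr() as *const c_void,
            needle as c_int,
            haystack.len(),
        )
    };
    if p.is_null() {
        None
    } else {
        // Convert the returned pointer back into an index.
        Some(p as usize - haystack.as_ptr() as usize)
    }
}

/// Same wrapper on every other target: use the Rust fallback.
#[cfg(not(unix))]
pub fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    memchr_fallback(needle, haystack)
}
```

Callers only ever see `find_byte`, so a tuned Rust version for a specific platform can later replace either branch without touching the call sites.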

Matthieu M. answered Nov 04 '22 22:11


They can, but according to the documentation on Rust's GitHub, this is for performance reasons:

memchr reduces to super-optimized machine code at around an order of magnitude faster than haystack.iter().position(|&b| b == needle). (See benchmarks.)

Since the benchmarks indicate that memchr() from the C bindings performs far better, that version is the preferred one. Optimizing a Rust variant further would probably mean including some assembly, so it would boil down to the same thing.

ljedrz answered Nov 04 '22 23:11