I recently realized that I could create local functions in Rust (a function within a function). Seems like a good way to clean up my code without polluting the function space of a file. Small sample of what I mean below by local function vs an 'external' function:
```rust
fn main() {
    fn local_plus(x: i64, y: i64) -> i64 {
        x + y
    }

    let x = 2i64;
    let y = 5i64;
    let local_res = local_plus(x, y);
    let external_res = external_plus(x, y);
    assert_eq!(local_res, external_res);
}

fn external_plus(x: i64, y: i64) -> i64 {
    x + y
}
```
I was wondering if there is any negative performance implication of doing this? Like does Rust re-declare the function or take up some undesired amount of function space each time the containing function runs? Or does it have literally no performance implication?
As a bit of an aside, any tips on how I could have found out the answer for myself (either through reading any specific set of documents, or tooling I could use) would be welcome.
There is no impact; I checked the assembly generated for both variants and it is identical.
The two versions I compared:
"external":
fn main() {
let x = 2i64;
let y = 5i64;
let external_res = external_plus(x,y);
}
fn external_plus(x: i64, y: i64) -> i64 {
x + y
}
"local":
fn main() {
fn local_plus(x: i64, y: i64) -> i64 {
x + y
}
let x = 2i64;
let y = 5i64;
let local_res = local_plus(x, y);
}
And both yield the same asm result (release mode in today's nightly):
```asm
	.text
	.file	"rust_out.cgu-0.rs"
	.section	.text._ZN8rust_out4main17hb497928495d48c40E,"ax",@progbits
	.p2align	4, 0x90
	.type	_ZN8rust_out4main17hb497928495d48c40E,@function
_ZN8rust_out4main17hb497928495d48c40E:
	.cfi_startproc
	retq
.Lfunc_end0:
	.size	_ZN8rust_out4main17hb497928495d48c40E, .Lfunc_end0-_ZN8rust_out4main17hb497928495d48c40E
	.cfi_endproc

	.section	.text.main,"ax",@progbits
	.globl	main
	.p2align	4, 0x90
	.type	main,@function
main:
	.cfi_startproc
	movq	%rsi, %rax
	movq	%rdi, %rcx
	leaq	_ZN8rust_out4main17hb497928495d48c40E(%rip), %rdi
	movq	%rcx, %rsi
	movq	%rax, %rdx
	jmp	_ZN3std2rt10lang_start17h14cbded5fe3cd915E@PLT
.Lfunc_end1:
	.size	main, .Lfunc_end1-main
	.cfi_endproc

	.section	".note.GNU-stack","",@progbits
```
Which means there will be zero difference (not only performance-wise) in the generated binary.
What is more, it doesn't even matter if you use a function; the following approach:
```rust
fn main() {
    let x = 2i64;
    let y = 5i64;
    let res = x + y;
}
```
Also yields the same assembly.
The bottom line is that, in general, such functions get inlined regardless of whether you declare them in main() or outside it.
Edit: as Shepmaster pointed out, this program has no side effects, so the generated assembly for both variants is actually the same as for:

```rust
fn main() {}
```

However, the MIR output for both variants is also identical (and different from that of a blank main()), so there shouldn't be any difference arising from the function's location even if side effects were present.
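To see this for yourself, you can introduce an observable side effect, such as printing the result, so the optimizer cannot discard the whole computation. A minimal sketch of such a variant, reusing the question's helper names:

```rust
fn external_plus(x: i64, y: i64) -> i64 {
    x + y
}

fn main() {
    // Same helper declared locally; only its scope differs.
    fn local_plus(x: i64, y: i64) -> i64 {
        x + y
    }

    let local_res = local_plus(2, 5);
    let external_res = external_plus(2, 5);
    assert_eq!(local_res, external_res);
    println!("{}", local_res); // side effect keeps the computation observable
}
```

In release mode both calls are still inlined; the println! merely prevents the program from collapsing to an empty main().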
> As a bit of an aside, any tips on how I could have found out the answer for myself (either through reading any specific set of documents, or tooling I could use) would be welcome.
Do you know of the Rust playground?
Enter your code, click on "LLVM IR", "Assembly" or "MIR" instead of "Run", and you get to see what is the low-level representation emitted for said code.
I personally prefer LLVM IR (I'm used to reading it from C++): it is higher-level than assembly, while still being past the language front-end.
> I was wondering if there is any negative performance implication of doing this?
That is actually a rather complicated question.
The only difference between declaring a function locally or externally in Rust is one of scope. Declaring it locally simply reduces its scope. Nothing else.
However... scope, and usage, can have drastic effects on compilation.
A function that is used only once, for example, is much more likely to be inlined than a function that is used 10 times. A compiler cannot easily estimate the number of uses of a pub function (potentially unbounded), but has perfect knowledge of the uses of a local or non-pub function. And whether a function is inlined or not can drastically affect the performance profile (for better or worse).
So, by reducing the scope, and thereby limiting the usage, you are encouraging the compiler to consider your function for inlining (unless you mark it #[cold]).
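For reference, the attributes involved look like this. The function names are made up for illustration, and the attributes are only hints: the compiler's heuristics still make the final call.

```rust
#[inline] // hint: consider inlining this at call sites
fn hot_path(x: i64) -> i64 {
    x + 1
}

#[cold] // hint: this function is rarely called; deprioritize it
fn cold_path(x: i64) -> i64 {
    x - 1
}

fn main() {
    assert_eq!(hot_path(1), 2);
    assert_eq!(cold_path(1), 0);
}
```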
On the other hand, since the scope is reduced, it cannot be shared (obviously).
So... what?
Follow the usage: define an item in the tightest scope possible.
This is encapsulation: now, the next time you need to modify this piece, you will know exactly the impacted scope.
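The tightest-scope advice can be sketched as follows; `outer`, `compute`, and `double` are hypothetical names:

```rust
mod outer {
    pub fn compute() -> i64 {
        // Local helper: visible only inside `compute`, so any change to it
        // can only affect this one function.
        fn double(x: i64) -> i64 {
            x * 2
        }
        double(3)
    }
    // Calling `double(3)` here would be a compile-time error: not in scope.
}

fn main() {
    assert_eq!(outer::compute(), 6);
}
```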
Have some trust in Rust, it won't be introducing overhead if it can avoid it.