Can I force Rust to not optimize a single function?

Tags: optimization, rust, llvm-codegen

I have a function where Rust's/LLVM's optimization fails and leads to a panic (in the release version), while the unoptimized code (debug version) works fine. When I compare the generated assembly code, I cannot even grasp what the optimizer is trying to accomplish. (A reason might be that this very function uses inline assembler.)

Is there any way to tell Rust to leave certain functions alone during the optimisation, or do I have to switch off all optimizations?

Here is the specific function:

#[naked]
pub extern "C" fn dispatch_svc(){
    Cpu::save_context();
    let mut nr: u32 = 0;
    unsafe {
        asm!("ldr r0, [lr, #-4]
              bic $0, r0, #0xff000000":"=r"(nr)::"r0":"volatile")
    };
    swi_service_routine(nr);
    Cpu::restore_context_and_return();
}


asked May 23 '17 14:05

Matthias

1 Answer

No, you cannot.
Rust's compilation unit (the smallest unit the compiler, and thus the optimizer, operates on) is the entire crate.

Your only workaround would be to move this function into its own crate, compile that crate separately, and then include it as a pre-compiled dependency. (Normally, Rust dependencies are compiled at the optimisation level of the depender, although Cargo profile overrides can set opt-level per package.)
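With Cargo's per-package profile overrides, the separate crate does not have to be pre-compiled by hand. A minimal sketch, assuming the function has been moved into a dependency crate with the hypothetical name svc-dispatch (the name is an illustration, not something from the question):

```toml
# Cargo.toml of the top-level crate or workspace.
# "svc-dispatch" is a hypothetical package name for the crate
# that contains dispatch_svc.
[profile.release.package.svc-dispatch]
opt-level = 0
```

This only changes how that one package is optimised in release builds; it does not make the code inside it any more correct.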

However: specifying a different optimisation level for this single function will not solve your problem! Sure, it may work today, but it can break again each time the compiler (or the optimisation flags) changes. Given the function names in your example (Cpu::save_context/restore_context_and_return), the underlying problem you seem to be solving requires adding a proper calling convention to rustc.

TL;DR: naked functions are deeply unsafe (My respect, you're a braver person than I am!). The only reliable way to use them is to write only one single asm!() block as the entire function body, nothing else.

Mixing asm!, normal Rust and function calls like you are doing is effectively Undefined Behaviour (in the scary C/Nasal-Demon sense of the term). No amount of optimisation-tweaking will change this.

2022-04 update: since I originally answered this, a lot has happened around naked functions.
A minimal "constrained" subset of naked functions (see RFC #2972) is slated for stabilisation in 1.60. There are also compiler errors to "reject unsupported naked functions", which would trigger for the example provided here.

Naked functions remain unstable until the Rust authors "get it right". As you have discovered, there are many subtle problems with this.
The original tracking issue for stabilisation was superseded in 2022 by the more limited tracker for "constrained" naked functions.

In the naked-fn RFC, under "Motivation", we find:

Because the compiler depends on a function prologue and epilogue to maintain storage for local variable bindings, it is generally unsafe to write anything but inline assembly inside a naked function. The LLVM language reference describes this feature as having "very system-specific consequences", which the programmer must be aware of.

(emphasis mine)

A little bit lower in the RFC, under unresolved questions, we learn that this is not just a problem for Rust. Other languages also experience problems with this feature:

.. most compilers supporting similar features either require or strongly recommend that authors write only inline assembly inside naked functions to ensure no code is generated that assumes a particular stack layout.

The reason is that all compilers make a LOT of assumptions about how functions are called (keywords: "Caller-saved registers", "Callee-saved registers", "Calling convention", "Red zone"). Naked functions don't obey these assumptions, and thus any code a compiler generates for them is highly likely to be wrong. The "solution" is to not let the compiler generate anything, i.e. to write the entire function by hand in assembly.

As such, the way you are mixing 'normal' code (let mut nr: u32 = 0;), function calls (swi_service_routine(nr);) and raw assembler in a naked function is unspecified behaviour. (Yes, such a thing exists in Rust, but only in Unstable.)

Naked functions cause enough problems that they deserve their own label in the Rust bugtracker. In one of the A-naked issues, we find this comment by knowledgeable user Tari (among other things, author of llvm-sys). He explains:

The actual correctness of non-asm code in naked functions depends on the optimizer and code generator, which in general we cannot make any guarantees about what it will do.

There is also talk about requiring unsafe for naked functions, as they break many of Rust's normal assumptions. ~~The fact that they don't require this yet in all cases is an open bug.~~
2022 update: that bug was closed on 2022-01-21 by the new deny-by-default lints to "Reject unsupported naked functions" (#93153).

So, the proper solution to your "optimisation problem" is to stop relying on optimisation at all. Instead, write only a single asm!() block.

For your Cpu::save_context() / Cpu::restore_context_and_return() pair: I can understand the desire for code-reuse. To get it, change those into a macro that inserts the relevant asm!(...). A concatenation of asm!(...); asm!(...); asm!(...); should be equivalent to a single asm!().
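A rough sketch of that macro idea, under several assumptions: the register save/restore sequences below are the classic ARM SWI-handler pattern and are only placeholders (the real bodies of Cpu::save_context and Cpu::restore_context_and_return are not shown in the question), and the macro names are made up for illustration. To keep the example runnable on any host, it only builds and prints the assembly template; it does not define a naked function.

```rust
// Hypothetical fragment standing in for Cpu::save_context().
// Braces are doubled because asm! templates treat single braces
// as operand placeholders.
macro_rules! save_context_asm {
    () => {
        "stmfd sp!, {{r0-r12, lr}}\n"
    };
}

// Hypothetical fragment standing in for Cpu::restore_context_and_return().
macro_rules! restore_context_and_return_asm {
    () => {
        "ldmfd sp!, {{r0-r12, pc}}^\n"
    };
}

// The whole handler as ONE template string: no Rust locals, no Rust calls.
// The SVC number is extracted into r0, which is also the first argument
// register, before branching to the service routine. The `bl` assumes
// swi_service_routine is exported with an unmangled C symbol name.
macro_rules! dispatch_svc_asm {
    () => {
        concat!(
            save_context_asm!(),
            "ldr r0, [lr, #-4]\n",
            "bic r0, r0, #0xff000000\n",
            "bl swi_service_routine\n",
            restore_context_and_return_asm!(),
        )
    };
}

// Exposed as a constant so the composed template can be inspected.
const DISPATCH_SVC_TEMPLATE: &str = dispatch_svc_asm!();

fn main() {
    // Prints the template exactly as the asm macro would receive it
    // (still with doubled braces).
    print!("{}", DISPATCH_SVC_TEMPLATE);
}
```

In an actual naked function, dispatch_svc_asm!() would be the template of the single assembly block that makes up the entire body. As far as I know, the asm macros accept macros that expand to string literals in that position, but treat this as an assumption and check it against your toolchain.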

answered Sep 29 '22 20:09

Jules Kerssemakers
