Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Rust allow calling functions via null pointers?

Tags:

rust

I was experimenting with function pointer magic in Rust and ended up with a code snippet which I have absolutely no explanation for why it compiles and even more, why it runs.

fn foo() {
    println!("This is really weird...");
}

fn caller<F>() where F: FnMut() {
    let closure_ptr = 0 as *mut F;
    let closure = unsafe { &mut *closure_ptr };
    closure();
}

fn create<F>(_: F) where F: FnMut() {
    caller::<F>();
}

fn main() {
    create(foo);
    
    create(|| println!("Okay..."));
    
    let val = 42;
    create(|| println!("This will seg fault: {}", val));
}

I cannot explain why foo is being invoked by casting a null pointer in caller(...) to an instance of type F. I would have thought that functions may only be called through corresponding function pointers, but that clearly can't be the case given that the pointer itself is null. With that being said, it seems that I clearly misunderstand an important piece of Rust's type system.

Example on Playground

like image 432
Basic Coder Avatar asked Jul 30 '20 01:07

Basic Coder


People also ask

Does Rust have NULL pointers?

Rust proves that you are not using invalid pointers. Rust proves that you never have null-pointers. Rust proves that you never have two pointers to the same object, execept if all these pointers are non-mutable (const).

How does Rust handle null?

Note: NULL is a value that is currently invalid or absent. Although, Rust does not have NULL, but provides an enum Option which can encode the concept of a value being present or absent. Here, is a part of Rust Generics, whatever the datatype we will mention in T, Some(T) will have the value of the same datatype.

Why do we set pointers to null?

A null pointer is a pointer which points nothing. Some uses of the null pointer are: a) To initialize a pointer variable when that pointer variable isn't assigned any valid memory address yet. b) To pass a null pointer to a function argument when we don't want to pass any valid memory address.

What happens if you call a NULL function pointer?

Initializing a pointer to a function, or a pointer in general to NULL helps some developers to make sure their pointer is uninitialized and not equal to a random value, thereby preventing them of dereferencing it by accident.

Can a raw pointer be null in rust?

See also the std::ptr module. Working with raw pointers in Rust is uncommon, typically limited to a few patterns. Raw pointers can be unaligned or null. However, when a raw pointer is dereferenced (using the * operator), it must be non-null and aligned.

Why do we need managed pointers in rust?

Using a managed pointer will activate Rust’s garbage collection mechanism. Borrowed: You’re writing a function, and you need a pointer, but you don’t care about its ownership. If you make the argument a borrowed pointer, callers can send in whatever kind they want. Five exceptions. That’s it. Otherwise, you shouldn’t need them.

Is it possible to pass objects by reference in rust?

If you’re coming to Rust from a language like C or C++, you may be used to passing things by reference, or passing things by pointer. In some langauges, like Java, you can’t even have objects without a pointer to them.

Why does this function return false when a pointer is null?

When this function is used during const evaluation, it may return false for pointers that turn out to be null at runtime. Specifically, when a pointer to some memory is offset beyond its bounds in such a way that the resulting pointer is null, the function will still return false.


5 Answers

This program never actually constructs a function pointer at all- it always invokes foo and those two closures directly.

Every Rust function, whether it's a closure or a fn item, has a unique, anonymous type. This type implements the Fn/FnMut/FnOnce traits, as appropriate. The anonymous type of a fn item is zero-sized, just like the type of a closure with no captures.

Thus, the expression create(foo) instantiates create's parameter F with foo's type- this is not the function pointer type fn(), but an anonymous, zero-sized type just for foo. In error messages, rustc calls this type fn() {foo}, as you can see this error message.

Inside create::<fn() {foo}> (using the name from the error message), the expression caller::<F>() forwards this type to caller without giving it a value of that type.

Finally, in caller::<fn() {foo}> the expression closure() desugars to FnMut::call_mut(closure). Because closure has type &mut F where F is just the zero-sized type fn() {foo}, the 0 value of closure itself is simply never used1, and the program calls foo directly.

The same logic applies to the closure || println!("Okay..."), which like foo has an anonymous zero-sized type, this time called something like [closure@src/main.rs:2:14: 2:36].

The second closure is not so lucky- its type is not zero-sized, because it must contain a reference to the variable val. This time, FnMut::call_mut(closure) actually needs to dereference closure to do its job. So it crashes2.


1 Constructing a null reference like this is technically undefined behavior, so the compiler makes no promises about this program's overall behavior. However, replacing 0 with some other "address" with the alignment of F avoids that problem for zero-sized types like fn() {foo}, and gives the same behavior!)

2 Again, constructing a null (or dangling) reference is the operation that actually takes the blame here- after that, anything goes. A segfault is just one possibility- a future version of rustc, or the same version when run on a slightly different program, might do something else entirely!

like image 171
rpjohnst Avatar answered Oct 21 '22 14:10

rpjohnst


The type of fn foo() {...} is not a function pointer fn(), it's actually a unique type specific to foo. As long as you carry that type along (here as F), the compiler knows how to call it without needing any extra pointers (a value of such a type carries no data). A closure that doesn't capture anything works the same way. It only gets dicey when the last closure tries to look up val because you put a 0 where (presumably) the pointer to val was supposed to be.

You can observe this with size_of, in the first two calls, the size of closure is zero, but in the last call with something captured in the closure, the size is 8 (at least on the playground). If the size is 0, the program doesn't have to load anything from the NULL pointer.

The effective cast of a NULL pointer to a reference is still undefined behavior, but because of type shenanigans and not because of memory access shenanigans: having references that are really NULL is in itself illegal, because memory layout of types like Option<&T> relies on the assumption that the value of a reference is never NULL. Here's an example of how it can go wrong:

unsafe fn null<T>(_: T) -> &'static mut T {
    &mut *(0 as *mut T)
}

fn foo() {
    println!("Hello, world!");
}

fn main() {
    unsafe {
        let x = null(foo);
        x(); // prints "Hello, world!"
        let y = Some(x);
        println!("{:?}", y.is_some()); // prints "false", y is None!
    }
}
like image 38
ben Avatar answered Oct 21 '22 15:10

ben


Given that rust is built on top of LLVM, and that what you're doing is guaranteed UB, you're likely hitting something similar to https://kristerw.blogspot.com/2017/09/why-undefined-behavior-may-call-never.html. This is one of many reasons why safe rust works to eliminate all UB.

like image 4
S Malis Avatar answered Oct 21 '22 14:10

S Malis


Although this is entirely up to UB, here's what I assume might be happening in the two cases:

  1. The type F is a closure with no data. This is equivalent to a function, which means that F is a function item. What this means is that the compiler can optimize any call to an F into a call to whatever function produced F (without ever making a function pointer). See this for an example of the different names for these things.

  2. The compiler sees that val is always 42, and hence it can optimize it into a constant. If that's the case, then the closure passed into create is again a closure with no captured items, and hence we can follow the ideas in #1.

Additionally, I say this is UB, however please note something critical about UB: If you invoke UB and the compiler takes advantage of it in an unexpected way, it is not trying to mess you up, it is trying to optimize your code. UB after all, is about the compiler mis-optimizing things because you've broken some expectations it has. It is hence, completely logical that the compiler optimizes this way. It would also be completely logical that the compiler doesn't optimize this way and instead takes advantage of the UB.

like image 3
Optimistic Peach Avatar answered Oct 21 '22 14:10

Optimistic Peach


This is "working" because fn() {foo} and the first closure are zero-sized types. Extended answer:

If this program ends up executed in Miri (Undefined behaviour checker), it ends up failing because NULL pointer is dereferenced. NULL pointer cannot ever be dereferenced, even for zero-sized types. However, undefined behaviour can do anything, so compiler makes no promises about the behavior, and this means it can break in the future release of Rust.

error: Undefined Behavior: memory access failed: 0x0 is not a valid pointer
  --> src/main.rs:7:28
   |
7  |     let closure = unsafe { &mut *closure_ptr };
   |                            ^^^^^^^^^^^^^^^^^ memory access failed: 0x0 is not a valid pointer
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `caller::<fn() {foo}>` at src/main.rs:7:28
note: inside `create::<fn() {foo}>` at src/main.rs:13:5
  --> src/main.rs:13:5
   |
13 |     func_ptr();
   |     ^^^^^^^^^^
note: inside `main` at src/main.rs:17:5
  --> src/main.rs:17:5
   |
17 |     create(foo);
   |     ^^^^^^^^^^^

This issue can be easily fixed by writing let closure_ptr = 1 as *mut F;, then it will only fail on line 22 with the second closure that will segfault.

error: Undefined Behavior: inbounds test failed: 0x1 is not a valid pointer
  --> src/main.rs:7:28
   |
7  |     let closure = unsafe { &mut *closure_ptr };
   |                            ^^^^^^^^^^^^^^^^^ inbounds test failed: 0x1 is not a valid pointer
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `caller::<[closure@src/main.rs:22:12: 22:55 val:&i32]>` at src/main.rs:7:28
note: inside `create::<[closure@src/main.rs:22:12: 22:55 val:&i32]>` at src/main.rs:13:5
  --> src/main.rs:13:5
   |
13 |     func_ptr();
   |     ^^^^^^^^^^
note: inside `main` at src/main.rs:22:5
  --> src/main.rs:22:5
   |
22 |     create(|| println!("This will seg fault: {}", val));
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Why it didn't complain about foo or || println!("Okay...")? Well, because they don't store any data. When referring to a function, you don't get a function pointer but rather a zero-sized type representing that specific function - this helps with monomorphization, as each function is distinct. A structure not storing any data can be created from aligned dangling pointer.

However, if you explicitly say the function is a function pointer by saying create::<fn()>(foo) then the program will stop working.

like image 2
Konrad Borowski Avatar answered Oct 21 '22 13:10

Konrad Borowski