Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rust Uninitialized Vec Causes Double Free Error

This is a problem of potential undefined behavior of uninitialized memory in unsafe rust.

  • first, allocating vector by usual safe way (type Vec<Vec<usize>>, or must be vector of something on heap instead of stack; in other words, Vec<usize> type will not panic here);
  • second, clone the vector in a new scope;
  • third, allocating a new vector by unsafe uninitialized way (without using MaybeUninit); then a double free occurs.

The code (playground) listed as follows:

#[deny(clippy::uninit_vec)]
unsafe fn uninitialized_vec<T>(size: usize) -> Vec<T> {
    let mut v: Vec<T> = Vec::with_capacity(size);
    unsafe { v.set_len(size) };
    v
}

fn case_1() {
    println!("=== Case 1 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    let _ = vec_a.clone();
    let _: Vec<Vec<usize>> = unsafe { uninitialized_vec(4) };
}

fn case_2() {
    println!("=== Case 2 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<Vec<usize>> = unsafe { uninitialized_vec(4) };
}

fn main() {
    case_1();
    case_2();
}
  • debug mode: signal 6 (SIGABRT): abort program, double free detected in tcache 2; case_2 will stuck here;
  • release mode: signal 4 (SIGILL): illegal instruction; even case_1 will stuck here.

From my common sense, the code itself does not intended to double free any variable. We just declared an uninitialized variable, but not using that variable. If this is not a bug, compiler optimization is probably the only reason that can explain this problem.

What I'm curious is that: whether this code actually triggers undefined behavior in unsafe rust, that compiler optimization may cause double free or other problems. And

  • if it is actually a UB, I feel this code can really confuses rust newcomers (that is saying myself).
  • if it is not a UB, is that a bug of unsafe rust?

Additionally, it is known that by using MaybeUninit properly may avoid double-free runtime error like this:

use std::mem::MaybeUninit;

unsafe fn uninitialized_vec<T>(size: usize) -> Vec<MaybeUninit<T>> {
    let mut v: Vec<MaybeUninit<T>> = Vec::with_capacity(size);
    unsafe { v.set_len(size) };
    v
}

fn main() {
    println!("=== This will not cause error ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 12];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<MaybeUninit<Vec<usize>>> = unsafe { uninitialized_vec(12) };
}

Still, MaybeUninit can be inconvenient in many circumstances. Subjectively, I prefer using Vec<T> instead of Vec<MaybeUninit<T>>, especially when I can make sure values of this uninitialized vector will be correctly filled later.

like image 539
ajz34 Avatar asked Dec 02 '25 01:12

ajz34


2 Answers

We just declared an uninitialized variable, but not using that variable

This is incorrect. We are using this variable - namely, we are dropping the outer Vec, therefore dropping all the Vecs in it, and since they are uninitialized, we're trying to drop arbitrary memory.

whether this code actually triggers undefined behavior in unsafe rust

It does, according to documentation for set_len:

Safety
<...>
The elements at old_len..new_len must be initialized.

This requirement is explicitly violated.

undefined behavior in unsafe rust, that compiler optimization may cause double free or other problems

If you trigger undefined behavior, compilation process may do to your program anything at all, by definition. It's not "this code may be misoptimized in this way" - it's "you lied to the compiler, all bets are off".

like image 153
Cerberus Avatar answered Dec 04 '25 06:12

Cerberus


Disclaimer: Cerberus answered the main question, I'll focus on best practices/implicit questions.

Correct usage of set_len

Subjectively, I prefer using Vec<T> instead of Vec<MaybeUninit<T>>, especially when I can make sure values of this uninitialized vector will be correctly filled later.

As mentioned in @Cerberus answer, set_len should only be used after the elements have been initialized.

That is, the correct way to use set_len is:

fn main() {
    const SIZE: usize = 5;

    let mut vec: Vec<Vec<usize>> = Vec::with_capacity(SIZE);

    {
        let spare = vec.spare_capacity_mut();

        (0..SIZE).for_each(|i| { spare[i].write(Vec::new()); });
    }

    //  Safety:
    //  - All elements in 0..SIZE have been initialized.
    unsafe { vec.set_len(SIZE); }
}

Pre-pooping your pants with Rust

Gankra1, one of the lead designer of unsafe Rust in the early days, wrote an article called Pre-pooping your pants with Rust, which is very a-propos here.

The gist of the article, is that one key difficulty for writing sound unsafe code is anticipating all the ways that something could potentially go wrong: any early return, any panic, etc... which could occur before you "fulfill" a promise.

Hint: it's basically impossible, in the face of ever-changing code.

The solution that Grankra proposed is thus to "pre-poop" your pants. For example, when you call vec.drain(start..end), all the elements in start..end should be removed, and only the elements in 0..start and end.. (catenated together) remain. But who knows what could occur while doing all that?

Thus, the code for drain will:

  1. Set the length of vec to start. It will not leave it at the current length, nor set it at the final length.
  2. Poke a hole in the vec, until all elements in start..end have been moved out.
  3. Move the elements in end.. to start.. (filling the hole, partially).
  4. And ONLY THEN set the length of vec to the final number of elements that actually remain.

If for whatever reason the Drain iterator is leaked in the middle of all this, leaving a hole with uninitialized elements... it's all sound, because all the elements in 0..start are (still) initialized, and that's the only promise that vec is making until (4), and at (4) all is good again.

Or put another way: you never promise that you will do something with unsafe, instead you first do it, and then announce it's done.

1 You may know Gankra from Learn Rust With Entirely Too Many Linked Lists.

MaybeUninit is inconvenient

Yes. Yes it is.

In fact, properly written unsafe code, with all the // Safety annotations, is tedious to write. And verbose. That's what it takes to write verifiably sound unsafe code.

If you can't be bothered, then don't use unsafe!

I would note that there's no benefit to using unsafe in the scenarios you've showcased here:

fn defaulted_vec<T>(size: usize) -> Vec<T>
where
   T: Default,
{
    (0..size).map(T::default).collect()
}

fn case_1() {
    println!("=== Case 1 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    let _ = vec_a.clone();
    let _: Vec<Vec<usize>> = defaulted_vec(4);
}

fn case_2() {
    println!("=== Case 2 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<Vec<usize>> = defaulted_vec(4);
}

fn main() {
    case_1();
    case_2();
}

defaulted_vec will create a Vec of 4 default values, which are cheap -- because Vec::new() doesn't allocate anything -- and you can then use them/overwrite them at leisure.

like image 43
Matthieu M. Avatar answered Dec 04 '25 07:12

Matthieu M.



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!