I don't understand the difference between a slice and a reference. What is the difference between &String and &str? I read some stuff online that said a reference was a thin pointer and slice is a fat pointer, but I don't know and can't seem to find what those two mean. I know that a slice can coerce into a reference, but how does it do that? What is the Deref trait?
A slice is a variable-length sequence that stores elements of a single type; you cannot store elements of different types in the same slice. Like an array, it has indices and a length, but unlike an array its size is not fixed.
The basic difference between a slice and an array is that a slice is a reference to a contiguous segment of an array. Unlike an array, which is a value type, a slice is a reference type. A slice can cover a complete array or a part of an array, indicated by the start and end index.
A slice is a pointer to a block of memory. Slices can be used to access portions of data stored in contiguous memory blocks. It can be used with data structures like arrays, vectors and strings. Slices use index numbers to access portions of data. The size of a slice is determined at runtime.
In Go, func make([]T, len, cap) []T can be used to create a slice by passing the type, length and capacity. The capacity parameter is optional and defaults to the length. The make function creates an array and returns a slice that refers to it. The values are zeroed by default when a slice is created using make.
In Rust, a slice is a contiguous block of homogeneously typed data of varying length.
What does this mean?
[u8] is a slice. In memory, this is a block of u8s. The slice itself is the data. Many times though, people refer to &[u8] as a slice. A &[u8] is a pointer to that block of data. That pointer contains two things: a pointer to the data itself, and the length of the data. Since it contains two things, it is therefore called a fat pointer. A &u8 is also a reference (can also be thought of as a pointer in this case *), but we already know that whatever it points to will be a single u8. Therefore, it is a thin pointer since it only has one element.
You are guaranteed that all the data in a [u8] is of type u8.
Since your [u8] is just defined as a contiguous block of memory of type u8, there's no compile time definition as to how large it is. Hence, we need to store its length in a pointer to it. We also can't put it on the stack (This translates to: we can't have a local variable that is just a [u8] **).
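To make the thin versus fat distinction concrete, here is a minimal sketch that prints the size of each kind of reference. The helper names thin_size and fat_size are made up for this illustration, and the "one word versus two words" result is what you should see on typical targets.

```rust
use std::mem::size_of;

/// Size in bytes of a thin reference to a single `u8` (hypothetical helper).
fn thin_size() -> usize {
    size_of::<&u8>()
}

/// Size in bytes of a fat reference to a slice of `u8` (hypothetical helper).
fn fat_size() -> usize {
    size_of::<&[u8]>()
}

fn main() {
    let data = [10u8, 20, 30, 40];
    let one: &u8 = &data[0]; // thin: just an address
    let many: &[u8] = &data[1..]; // fat: an address plus a length

    println!("&u8 is {} bytes, &[u8] is {} bytes", thin_size(), fat_size());
    println!("one = {}, many = {:?} (len {})", one, many, many.len());
}
```

The length that many.len() returns is read out of the reference itself, which is why the slice reference is twice as large.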
Expanding:
[T] is a slice of Ts. For any given T, as long as T is itself a sized type ***, we can imagine a type [T].
str is a slice of a string. It is guaranteed to be valid UTF-8 text, and that's everything that separates it from a [u8]. Rust could have dumped the valid UTF-8 guarantee and just defined everything else in str as part of [u8].
Well, since you can't own a slice locally ****, you might be wondering how we create slices.
The answer is that we put the data in something with the size already known, and then borrow slices from that.
Take for example:
let my_array: [u32; 3] = [1, 2, 3];
We can slice my_array into a [u32] like so:
let my_slice: [u32] = my_array[..]; // does not compile: the size of [u32] is not known at compile time
But since we can't own a local variable whose size isn't already known, we must put it under a reference:
let my_slice: &[u32] = &my_array[..];
The point of a slice is that it's a very flexible (barring lifetimes) method of working with contiguous blocks of data, no matter where the data comes from. I could just as easily have made my_array a Vec<u32>, which is heap-allocated, and it would still have worked.
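As a rough sketch of that flexibility, the function below accepts a slice and neither knows nor cares whether the data lives in a stack array or a heap-allocated Vec. The function largest is a hypothetical helper invented for this example.

```rust
/// Hypothetical helper: works on any contiguous run of `u32` values.
fn largest(values: &[u32]) -> Option<u32> {
    values.iter().copied().max()
}

fn main() {
    let my_array: [u32; 3] = [1, 2, 3]; // lives on the stack
    let my_vec: Vec<u32> = vec![4, 5, 6, 7]; // elements live on the heap

    let from_array: &[u32] = &my_array[..];
    let from_vec: &[u32] = &my_vec[..];
    let part_of_vec: &[u32] = &my_vec[1..3]; // only the elements 5 and 6

    println!("{:?}", largest(from_array));
    println!("{:?}", largest(from_vec));
    println!("{:?}", largest(part_of_vec));
}
```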
What is the difference between &String and &str?
&String is a reference to the whole string object. The string object in Rust is essentially a Vec<u8>. A Vec contains a pointer to the data it "contains", so your &String could be thought of as a &&str. And that is why we could do either of the following:
let my_string: String = "Abc".to_string();
let my_str: &str = &my_string[..]; // As explained previously
// OR
let my_str: &str = &*my_string;
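A small sketch, assuming nothing beyond the standard library, that checks these conversions all yield the same &str and compares the sizes of the two reference types. The function views is a made-up helper for this illustration; it also uses as_str(), a String method that does the same conversion by name.

```rust
use std::mem::size_of;

/// Hypothetical helper: three ways to view the same `String` as a `&str`.
fn views(s: &String) -> (&str, &str, &str) {
    // `*s` is the `String`, `**s` is the `str` it derefs to.
    (&s[..], &**s, s.as_str())
}

fn main() {
    let my_string: String = "Abc".to_string();
    let (a, b, c) = views(&my_string);

    // All three point at the same bytes inside `my_string`.
    assert_eq!(a.as_ptr(), b.as_ptr());
    assert_eq!(b.as_ptr(), c.as_ptr());

    // `&String` is a thin pointer to the String object;
    // `&str` is a fat pointer (address and length) to the text itself.
    println!("&String: {} bytes", size_of::<&String>());
    println!("&str:    {} bytes", size_of::<&str>());
}
```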
The explanation of this brings me to your last question:
What is the deref trait?
The Deref trait is a trait which describes the dereference (*) operator. As you saw above, I was able to do *my_string. That's because String implements Deref, which allows you to dereference the String. Similarly, I can dereference a Vec<T> into a [T].
Note, however, that the Deref trait is used in more places than just where * is used:
let my_string: String = "Abc".to_string();
let my_str: &str = &my_string;
If I try to assign a value of type &T into a place of type &U, then Rust will try to dereference my T as many times as it takes to get a U, while still keeping at least one reference. Similarly, if I had a &&&&....&&&&T, and I tried to assign it to a &&&&....&&&&U, it would still work.
This is called deref coercion: automatically turning a &T into a &U, where some amount of *T would result in a U.
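Deref coercion also works through user-defined types. The following is a minimal sketch: Name is a hypothetical wrapper type and shout a hypothetical function, both invented here to show a chain of coercions ending in &str.

```rust
use std::ops::Deref;

/// Hypothetical wrapper type, used only to illustrate `Deref`.
struct Name(String);

impl Deref for Name {
    type Target = String;

    fn deref(&self) -> &String {
        &self.0
    }
}

/// Hypothetical function that takes a plain string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let name = Name("abc".to_string());

    // &Name -> &String -> &str: both steps are inserted by the compiler.
    println!("{}", shout(&name));

    // Extra layers of references are peeled off in the same way.
    let nested: &&&Name = &&&name;
    println!("{}", shout(nested));

    // Method calls auto-dereference too: `len` is found on `str`.
    println!("{}", name.len());
}
```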
* *const T and *mut T are the same size as references, but are treated as opaque by the compiler. The compiler doesn't make any guarantees about what is behind a raw pointer, or that they are even correctly aligned. Hence, they are unsafe to dereference. But since the Deref trait defines a deref method which is safe, dereferencing a raw pointer is special, and will not be done automatically either.
** This also applies to other dynamically sized types, such as extern types. It includes structs which contain a dynamically sized type as their last member as well, although these are very difficult to correctly construct, but will become easier in the future with the CoerceUnsized trait. It is possible to invalidate all of this (except for extern types) with the use of the unsized_locals nightly feature, which allows some use of dynamically sized locals.
*** For a type T, T's size is known at compile time if T: Sized. If T: ?Sized, then its size may not be known at compile time (T: ?Sized is the most flexible requirement for callers since it accepts anything). Since a slice requires the data inside to be contiguous, and homogeneous in size and type, dynamically sized types (or !Sized) aren't possible to contain within a slice, or an array, or a Vec<T>, and maintain O(1) indexing. While Rust could probably write special code for indexing into a group of dynamically sized types, it currently doesn't.
**** A slice can be owned when it sits behind an owning pointer, such as a Box<[T]> or an Rc<[T]>. These will deallocate the slice on their own (a Box when dropped, and an Rc when all strong and weak references of an Rc are dropped (the value's destructor is called when all strong references are dropped, but the memory isn't freed until all weak references are gone, too)).
A reference is like a pointer from C (which represents a memory location), but references are never invalid* (i.e. null) and you can't do pointer arithmetic on references. Rust's references are pretty similar to C++'s references. One important motivating reason to use references is to avoid moving or cloning variables. Let's say you have a function that calculates the sum of a vector (note: this is a toy example, the right way to get the sum of a vector is nums.iter().sum()):
fn sum(nums: Vec<u32>) -> Option<u32> {
    if nums.is_empty() {
        return None;
    }
    let mut sum = 0;
    for num in nums {
        sum += num;
    }
    Some(sum) // no trailing semicolon, so this is the return value
}
This function moves the vector, so it is unusable afterward.
let nums = vec![1, 2, 3, 4, 5];
assert_eq!(sum(nums), Some(15));
assert_eq!(nums[0], 1); // <-- error, nums was moved when we calculated sum
The solution is to pass a reference to a vector:
fn sum(nums: &Vec<u32>) -> Option<u32> {
...
}
let nums = vec![1, 2, 3, 4, 5];
assert_eq!(sum(&nums), Some(15));
assert_eq!(nums[0], 1); // <-- it works!
A slice is "a view into a [contiguous] block of memory represented as a pointer and a length." It can be thought of as a reference to an array (or array-like thing). Part of Rust's safety guarantee is ensuring you don't access elements past the end of your array. To accomplish this, slices are represented internally as a pointer and a length. This is fat compared to pointers, which contain no length information. Similar to the sum example above, if nums were an array, rather than a Vec, you would pass a slice to sum(), rather than the array itself.
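One way to write that, as a sketch rather than the only option, is to have the toy sum function take a slice. The same function then accepts both an array and a Vec, and neither is moved.

```rust
/// The toy `sum` again, this time borrowing a slice.
fn sum(nums: &[u32]) -> Option<u32> {
    if nums.is_empty() {
        return None;
    }
    Some(nums.iter().sum())
}

fn main() {
    let array: [u32; 5] = [1, 2, 3, 4, 5];
    let vector: Vec<u32> = vec![1, 2, 3, 4, 5];

    // &[u32; 5] and &Vec<u32> both coerce to &[u32].
    println!("{:?}", sum(&array));
    println!("{:?}", sum(&vector));

    // Both are still usable because they were only borrowed.
    println!("{} {}", array[0], vector[0]);
}
```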
A str is an unsized sequence of UTF-8 encoded characters, and an &str is a slice of UTF-8 encoded characters. String is a Vec of UTF-8 encoded characters, and String implements Deref<Target=str>, which means that an &String behaves a lot like (coerces to) an &str. This is similar to how &Vec<u32> behaves like &[u32] (Vec<T> implements Deref<Target=[T]>).
* Unless made invalid with unsafe Rust.