
Why is Vec::len a method instead of a public property?

I noticed that Rust's Vec::len method just returns the vector's len field. Why isn't len simply a public field, rather than being wrapped in a method?

I assume this is so that in case the implementation changes in the future, nothing will break because Vec::len can change the way it gets the length without any users of Vec knowing, but I don't know if there are any other reasons.

The second part of my question is about when I'm designing an API. If I am building my own API, and I have a struct with a len property, should I make len private and create a public len() method? Is it bad practice to make fields public in Rust? I wouldn't think so, but I don't notice this being done often in Rust. For example, I have the following struct:

pub struct Segment {
    pub dol_offset: u64,
    pub len: usize,
    pub loading_address: u64,
    pub seg_type: SegmentType,
    pub seg_num: u64,
}

Should any of those fields be private and instead have a wrapper function like Vec does? If so, then why? Is there a good guideline to follow for this in Rust?

Addison asked Jun 02 '18 22:06



2 Answers

One reason is to provide the same interface for all types that have some notion of length. For example, std::iter::ExactSizeIterator exposes len() as a trait method, and a trait can require methods but not fields.
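As a rough illustration of that shared interface, here is a minimal sketch. The names remaining and demo_lengths are hypothetical helpers, not part of the standard library; the only real API used is ExactSizeIterator::len.

```rust
// Hypothetical helper: reports how many items are left in any iterator
// that knows its exact length up front.
fn remaining<I: ExactSizeIterator>(iter: &I) -> usize {
    // `len()` is a trait method, so the call is the same for every implementor.
    iter.len()
}

// Hypothetical demo: one helper, three unrelated iterator types.
fn demo_lengths() -> (usize, usize, usize) {
    let v = vec![10, 20, 30];
    let range = 0..5_u32;
    let chars = ['a', 'b'];
    (
        remaining(&v.iter()),     // slice iterator over a Vec
        remaining(&range),        // integer range
        remaining(&chars.iter()), // slice iterator over an array
    )
}
```

Because the length is reached through a method, generic code like remaining never needs to know how each type stores it.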

In the case of Vec, len() is acting like a getter:

impl<T> Vec<T> {
    pub fn len(&self) -> usize {
        self.len
    }
}

While this ensures consistency across the standard library, there is another reason underlying this design choice...

This getter protects len from external modification. If the invariant Vec::len <= Vec::buf::cap is ever violated, Vec's methods may try to access memory illegally. For instance, if len were larger than the capacity, the implementation of Vec::push:

pub fn push(&mut self, value: T) {
    if self.len == self.buf.cap() {
        self.buf.double();
    }
    unsafe {
        let end = self.as_mut_ptr().offset(self.len as isize);
        ptr::write(end, value);
        self.len += 1;
    }
}

will attempt to write to memory past the actual end of the memory owned by the container. Because of this critical requirement, safe code is not allowed to modify len directly; the only way to set it from outside is the unsafe method Vec::set_len.
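The invariant can be observed from the outside through the public getters. The following is only a sketch: push_and_check is a hypothetical helper, and the starting capacity of 2 is an arbitrary choice made to force a few reallocations.

```rust
// Hypothetical helper: pushes `n` values and checks the invariant
// `len <= capacity` after every push.
fn push_and_check(n: usize) -> Vec<usize> {
    let mut v = Vec::with_capacity(2);
    for i in 0..n {
        // `push` may reallocate; Vec keeps its length and capacity consistent.
        v.push(i);
        assert!(v.len() <= v.capacity());
    }
    // v.len = 100; // would not compile: the field `len` of `Vec` is private
    v
}
```

Since callers can only read the length, there is no safe way for them to make the assertion fail.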


Philosophy

It's definitely good to use a getter like this in library code (crazy people out there might try to modify it!).

However, one should design their code in a manner that minimizes the need for getters/setters. A type should act on its own fields as much as possible. These actions should be made available to the public through methods. And here I mean methods that do useful things -- not just a plain ol' getter/setter that returns/sets a variable. Setters in particular can be made redundant through the use of constructors or methods. Vec shows us some of these "setters":

push
insert
pop
reserve
...

Thus, Vec implements algorithms that provide access to the outside world. But it manages its innards by itself.
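The same approach can be applied to your own types. Here is a minimal sketch, assuming a made-up SortedList type (it is not from the standard library or from the question): the field is private, and the only way to change it is a method that does useful work while preserving the type's invariant.

```rust
/// Hypothetical type: a list of integers that is always kept sorted.
pub struct SortedList {
    items: Vec<i32>, // private, so callers cannot break the ordering
}

impl SortedList {
    pub fn new() -> Self {
        SortedList { items: Vec::new() }
    }

    /// Inserts `value` at the position that keeps `items` sorted.
    pub fn insert(&mut self, value: i32) {
        let pos = self.items.partition_point(|&x| x < value);
        self.items.insert(pos, value);
    }

    /// Read-only access to the number of elements.
    pub fn len(&self) -> usize {
        self.items.len()
    }

    /// Smallest element; correct only because the list is always sorted.
    pub fn min(&self) -> Option<i32> {
        self.items.first().copied()
    }

    /// Read-only view of the elements in sorted order.
    pub fn as_slice(&self) -> &[i32] {
        &self.items
    }
}
```

If items were public, any caller could push an out-of-order value and min would silently return the wrong answer. A struct whose fields have no such invariant to protect does not have this problem.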

Mateen Ulhaq answered Sep 30 '22 17:09



The Vec struct looks something like this[1]:

pub struct Vec<T> {
    ptr: *mut T,
    capacity: usize,
    len: usize,
}

The idea is that ptr points at a block of allocated memory with room for capacity elements. If the Vec needs to hold more elements than its capacity allows, a new, larger block is allocated. The unused portion of the allocated memory is uninitialised and could contain arbitrary data.

When you call mutating methods on Vec like push or pop, they carefully manage the Vec's internal state, increase capacity when necessary, and ensure that items that are removed are properly dropped.

If len were a public field, any code with an owned Vec, or a mutable reference to one, could set len to any value. Set it higher than it should be and you'll be able to read from uninitialised memory, causing Undefined Behaviour. Set it lower and you'll be effectively removing elements without properly dropping them, so their destructors never run.
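The "set it lower" case can be demonstrated without Undefined Behaviour by using the real but unsafe method Vec::set_len. This is a sketch under stated assumptions: Noisy and compare_shrinking are hypothetical names introduced only to count destructor calls.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical element type that counts how many times it is dropped.
struct Noisy(Rc<Cell<usize>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns (drops seen after `truncate`, drops seen after `set_len`).
fn compare_shrinking() -> (usize, usize) {
    let truncated = Rc::new(Cell::new(0));
    let mut a: Vec<Noisy> = (0..3).map(|_| Noisy(Rc::clone(&truncated))).collect();
    // The safe API drops every element it removes.
    a.truncate(0);

    let forced = Rc::new(Cell::new(0));
    let mut b: Vec<Noisy> = (0..3).map(|_| Noisy(Rc::clone(&forced))).collect();
    // Lowering the length directly does not run any destructors,
    // so the three elements are leaked.
    unsafe { b.set_len(0) };

    (truncated.get(), forced.get())
}
```

Raising the length instead would expose uninitialised memory, which is why that direction is not shown here.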

In some other programming languages (e.g. JavaScript) the API for arrays or vectors specifically lets you change the size by setting a length property. It's not unreasonable to think that a programmer who is used to that approach could do this accidentally in Rust.

Keeping all the fields private and exposing the length through a getter method, len(), allows Vec to protect its internals from outside mutation, make strong memory guarantees and prevent users from accidentally doing bad things to themselves.


[1] In practice, there are abstraction layers built over this data structure, so it looks a little different.

Peter Hall answered Sep 30 '22 17:09
