
Should I take `self` by value or mutable reference when using the Builder pattern?

So far, I've seen two builder patterns in official Rust code and other crates:

// Pattern 1: setters borrow the builder mutably
impl DataBuilder {
    pub fn new() -> DataBuilder { ... }
    pub fn arg1(&mut self, arg1: Arg1Type) -> &mut DataBuilder { ... }
    pub fn arg2(&mut self, arg2: Arg2Type) -> &mut DataBuilder { ... }
    ...
    pub fn build(&self) -> Data { ... }
}

// Pattern 2: setters take the builder by value
impl DataBuilder {
    pub fn new() -> DataBuilder { ... }
    pub fn arg1(self, arg1: Arg1Type) -> DataBuilder { ... }
    pub fn arg2(self, arg2: Arg2Type) -> DataBuilder { ... }
    ...
    pub fn build(self) -> Data { ... }
}

I'm writing a new crate and I'm a bit confused about which pattern I should choose. I know it will be painful to change the API later, so I want to make the decision now.

I understand the semantic difference between them, but which one should we prefer in practical situations? Or how should we choose between them? Why?

Sprite asked Dec 19 '21 01:12


1 Answer

Is it beneficial to build multiple values from the same builder?

  • If yes, use &mut self
  • If no, use self

Consider std::thread::Builder, which is a builder for spawning a thread (std::thread::Thread). It uses Option fields internally to configure how to build the thread:

pub struct Builder {
    name: Option<String>,
    stack_size: Option<usize>,
}

It uses self to .spawn() the thread because it needs ownership of the name. It could theoretically use &mut self and .take() the name out of the field, but then subsequent calls to .spawn() wouldn't create identical results, which is kinda bad design. It could choose to .clone() the name, but then there's an additional and often unneeded cost to spawn a thread. Using &mut self would be a detriment.
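
As a rough sketch of what that looks like from the caller's side (the helper name spawn_named and the 256 KiB stack size are made up for illustration), the builder is moved into .spawn() and cannot be used again afterwards:

```rust
use std::thread;

/// Spawns a thread with the given name and returns the name the thread
/// observed for itself. `spawn_named` is a hypothetical helper.
fn spawn_named(name: &str) -> Option<String> {
    // The setters take `self` and return the builder by value.
    let builder = thread::Builder::new()
        .name(name.to_string())
        .stack_size(256 * 1024); // arbitrary size chosen for the example

    // `spawn` takes `self`: the builder, including its name String,
    // is moved here without any clone.
    let handle = builder
        .spawn(|| thread::current().name().map(str::to_owned))
        .expect("failed to spawn thread");

    // builder.spawn(|| ()); // would not compile: `builder` was moved above

    handle.join().expect("thread panicked")
}

fn main() {
    println!("{:?}", spawn_named("worker"));
}
```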

Consider std::process::Command which serves as a builder for a std::process::Child. It has fields containing the program, args, environment, and pipe configuration:

pub struct Command {
    program: CString,
    args: Vec<CString>,
    env: CommandEnv,
    stdin: Option<Stdio>,
    stdout: Option<Stdio>,
    stderr: Option<Stdio>,
    // ...
}

It uses &mut self to .spawn() because it does not take ownership of these fields to create the Child. It has to internally copy all that data over to the OS anyway, so there's no reason to consume self. There's also a tangible benefit and use-case to spawning multiple child processes with the same configuration.
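
A minimal sketch of that reuse, assuming an echo executable is available on PATH (the program, argument, and environment variable here are placeholders, not anything from the question):

```rust
use std::process::Command;

/// Builds a command once through the `&mut self` setters.
/// Assumes an `echo` executable exists on PATH; swap in any program you have.
fn configured() -> Command {
    let mut cmd = Command::new("echo");
    cmd.arg("hello").env("DEMO_VAR", "1"); // each setter returns `&mut Command`
    cmd
}

fn main() {
    let mut cmd = configured();

    // `spawn` borrows the builder mutably, so the same configuration can
    // start several children.
    for _ in 0..2 {
        match cmd.spawn() {
            Ok(mut child) => {
                let status = child.wait().expect("failed to wait on child");
                println!("child exited with {}", status);
            }
            Err(e) => eprintln!("could not spawn: {}", e),
        }
    }

    // The configuration is still available after spawning.
    println!("program: {:?}", cmd.get_program());
}
```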

Consider std::fs::OpenOptions which serves as a builder for std::fs::File. It only stores basic configuration:

pub struct OpenOptions {
    read: bool,
    write: bool,
    append: bool,
    truncate: bool,
    create: bool,
    create_new: bool,
    // ...
}

Its setters take &mut self, and .open() only needs &self, because it does not need ownership of anything to work. It is somewhat similar to the thread builder, since there is a path associated with a file just as there is a name associated with a thread; however, the file path is only passed to .open() and is not stored in the builder. There's a use-case for opening multiple files with the same configuration.
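
For instance, one set of options can open several files. This is only a sketch: the helper name and the file names under the system temp directory are invented for the example.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};

/// Writes two files using one shared `OpenOptions`, then reads them back.
/// `write_two_files` and the file names are hypothetical.
fn write_two_files() -> io::Result<(String, String)> {
    let dir = std::env::temp_dir();
    let a = dir.join("builder_demo_a.txt");
    let b = dir.join("builder_demo_b.txt");

    // Configure once; the setters borrow the builder mutably.
    let mut opts = OpenOptions::new();
    opts.write(true).create(true).truncate(true);

    // The path is an argument to `open`, not part of the builder,
    // so the same options work for both files.
    opts.open(&a)?.write_all(b"first")?;
    opts.open(&b)?.write_all(b"second")?;

    let contents = (fs::read_to_string(&a)?, fs::read_to_string(&b)?);

    // Clean up the demo files.
    fs::remove_file(&a)?;
    fs::remove_file(&b)?;
    Ok(contents)
}

fn main() -> io::Result<()> {
    let (first, second) = write_two_files()?;
    println!("{} / {}", first, second);
    Ok(())
}
```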


The considerations above really only cover the semantics of self in the .build() method, but there are good reasons to use the same receiver for the intermediate methods as well:

  • API consistency
  • chaining (&mut self) -> &mut Self into build(self) wouldn't compile, since build(self) cannot move the builder out from behind a &mut reference
  • using (self) -> Self into build(&mut self) would limit the flexibility of the builder to be reused long-term
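
To make the trade-off concrete, here is a small sketch of both styles side by side. Data, DataBuilder, ConsumingBuilder, and their fields are all hypothetical stand-ins for the types in the question, not code from any real crate.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Data {
    name: String,
    retries: u32,
}

/// Reusable style: setters and `build` borrow the builder.
#[derive(Default)]
struct DataBuilder {
    name: String,
    retries: u32,
}

impl DataBuilder {
    fn new() -> Self {
        Self::default()
    }
    fn name(&mut self, name: &str) -> &mut Self {
        self.name = name.to_string();
        self
    }
    fn retries(&mut self, retries: u32) -> &mut Self {
        self.retries = retries;
        self
    }
    /// Borrowing means the String has to be cloned on every build.
    /// If this took `self` instead, the chains in `main` would not compile.
    fn build(&self) -> Data {
        Data { name: self.name.clone(), retries: self.retries }
    }
}

/// Consuming style: every method takes and returns the builder by value.
#[derive(Default)]
struct ConsumingBuilder {
    name: String,
    retries: u32,
}

impl ConsumingBuilder {
    fn new() -> Self {
        Self::default()
    }
    fn name(mut self, name: String) -> Self {
        self.name = name;
        self
    }
    fn retries(mut self, retries: u32) -> Self {
        self.retries = retries;
        self
    }
    /// Owning the builder lets the String move into the result, no clone.
    fn build(self) -> Data {
        Data { name: self.name, retries: self.retries }
    }
}

fn main() {
    // Reusable: one builder, several values.
    let mut builder = DataBuilder::new();
    builder.name("job").retries(3);
    let first = builder.build();
    let second = builder.retries(5).build();
    println!("{:?} {:?}", first, second);

    // Consuming: a single chain, after which the builder is gone.
    let third = ConsumingBuilder::new().name("job".to_string()).retries(3).build();
    println!("{:?}", third);
}
```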

kmdreko answered Oct 24 '22 15:10