Why do we need a Builder class when implementing a Builder pattern?

https://softwareengineering.stackexchange.com/questions/380397

14-02-2021

Question

I have seen many implementations of the Builder pattern (mainly in Java). All of them have an entity class (let's say a Person class), and a builder class PersonBuilder. The builder "stacks" a variety of fields and returns a new Person with the arguments passed. Why do we explicitly need a builder class, instead of putting all the builder methods in the Person class itself?

For example:

class Person {

  private String name;
  private Integer age;

  public Person() {
  }

  Person withName(String name) {
    this.name = name;
    return this;
  }

  Person withAge(int age) {
    this.age = age;
    return this;
  }
}

I can simply say Person john = new Person().withName("John");

Why the need for a PersonBuilder class?

The only benefit I see is that we can declare the Person fields as final, thus ensuring immutability.

Solution

It's so you can be immutable AND simulate named parameters at the same time.

Person p = personBuilder
    .name("Arthur Dent")
    .age(42)
    .build()
;

That keeps your mitts off the person until its state is set and, once set, won't let you change it, yet every field is clearly labeled. You can't do this with just one class in Java.
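For reference, here is a minimal sketch of the two classes that could sit behind that call chain. The accessor names, the nested Builder, and the Person.builder() factory are my assumptions for illustration (the snippet above only shows a personBuilder variable), so treat it as one possible layout rather than the canonical one.

```java
// Immutable Person plus a mutable Builder: all mutation happens before a Person exists.
final class Person {
    private final String name;
    private final int age;

    // Private: the only way to obtain a Person is through the Builder.
    private Person(Builder builder) {
        this.name = builder.name;
        this.age = builder.age;
    }

    String name() { return name; }
    int age() { return age; }

    // Hypothetical factory; "personBuilder" above could be the result of this call.
    static Builder builder() { return new Builder(); }

    static final class Builder {
        private String name;
        private int age;

        // Each method names the value it sets, which simulates named parameters.
        Builder name(String name) { this.name = name; return this; }
        Builder age(int age) { this.age = age; return this; }

        // The finished, immutable object only appears here.
        Person build() { return new Person(this); }
    }
}
```

Person has no setters and only final fields, so nothing can change it after build() returns.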

It looks like you're talking about Josh Bloch's Builder Pattern. This should not be confused with the Gang of Four Builder Pattern. These are different beasts. They both solve construction problems, but in fairly different ways.

Of course you can construct your object without using another class. But then you have to choose. You lose either the ability to simulate named parameters in languages that don't have them (like Java) or you lose the ability to remain immutable throughout the object's lifetime.

Immutable example, has no names for parameters

Person p = new Person("Arthur Dent", 42);

Here you're building everything with a single simple constructor. This will let you stay immutable but you lose the simulation of named parameters. That gets hard to read with many parameters. Computers don't care but it's hard on the humans.

Simulated named parameter example with traditional setters. Not immutable.

Person p = new Person();
p.name("Arthur Dent");
p.age(42);

Here you're building everything with setters and are simulating named parameters but you're no longer immutable. Each use of a setter changes object state.

So what you get by adding the class is you can do both.

Validation can be performed in the build() if a runtime error for a missing age field is enough for you. You can upgrade that and enforce that age() is called with a compiler error. Just not with the Josh Bloch builder pattern.

For that you need an internal Domain Specific Language (iDSL).

This lets you demand that they call name() and age() before calling build(). But you can't do it just by returning this each time. Each method has to return a different type, one that only offers the next call you want to force.

Use might look like this:

Person p = personBuilder
    .name("Arthur Dent")
    .age(42)
    .build()
;

But this:

Person p = personBuilder
    .age(42)
    .build()
;

causes a compiler error because age() is only valid to call on the type returned by name().
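A rough sketch of how such a staged builder could be wired up. The step interface names (NameStep, AgeStep, BuildStep) and the Person.builder() entry point are made up for this example; the point is only that each step's return type exposes nothing but the next legal call.

```java
final class Person {
    private final String name;
    private final int age;

    private Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String name() { return name; }
    int age() { return age; }

    // Hypothetical step interfaces: each one offers exactly one method.
    interface NameStep { AgeStep name(String name); }
    interface AgeStep { BuildStep age(int age); }
    interface BuildStep { Person build(); }

    // The entry point hands out only the first step.
    static NameStep builder() { return new Steps(); }

    // One private class implements every step; callers only ever see the interfaces.
    private static final class Steps implements NameStep, AgeStep, BuildStep {
        private String name;
        private int age;

        @Override public AgeStep name(String name) { this.name = name; return this; }
        @Override public BuildStep age(int age) { this.age = age; return this; }
        @Override public Person build() { return new Person(name, age); }
    }
}
```

With this layout, Person.builder().age(42) does not compile, because NameStep has no age() method.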

These iDSLs are extremely powerful (jOOQ or Java 8 Streams for example) and are very nice to use, especially if you use an IDE with code completion, but they are a fair bit of work to set up. I'd recommend saving them for things that will have a fair bit of source code written against them.

OTHER TIPS

Why use/provide a builder class:

To make immutable objects, which is the benefit you've identified already. Useful if the construction takes multiple steps. FWIW, immutability should be seen as a significant tool in our quest to write maintainable and bug-free programs.
If the runtime representation of the final (possibly immutable) object is optimized for reading and/or space usage, but not for update. String and StringBuilder are good examples here. Repeatedly concatenating strings is not very efficient, so the StringBuilder uses a different internal representation that is good for appending, but not as good on space usage, and not as good for reading and using as the regular String class.
To clearly separate constructed objects from objects under construction. This approach requires a clear transition from under-construction to constructed. For the consumer, there is no way to confuse an under-construction object with a constructed object: the type system will enforce this. That means sometimes we can use this approach to "fall into the pit of success", as it were, and, when making abstractions for others (or ourselves) to use (like an API or a layer), this can be a very good thing.
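To make the String/StringBuilder point concrete, here is a small sketch. The CsvLine class and its join method are hypothetical names for this example: the mutable StringBuilder is the "under construction" object, and toString() is the transition to the immutable result.

```java
final class CsvLine {
    private CsvLine() {}

    // Accumulate in a mutable StringBuilder, then freeze into an immutable String.
    static String join(String... cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                sb.append(',');   // separator between cells, not before the first
            }
            sb.append(cells[i]);
        }
        return sb.toString();     // from "under construction" to "constructed"
    }
}
```

A caller can't mistake the two: anything typed StringBuilder is still changing, anything typed String is done.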

One reason would be to ensure that all of the passed-in data follows business rules.

Your example doesn't take this into consideration, but let's say that someone passed in an empty string, or a string consisting of special characters. You would want to do some sort of logic based around making sure that their name is actually a valid name (which is actually a very difficult task).

You could put that all in your Person class, especially if the logic is very small (for example, just making sure that an age is non-negative) but as the logic grows it makes sense to separate it.
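As a sketch of what that separation might look like, here the rules live in the builder and Person stays a plain value holder. The specific rules (no blank names, no negative ages) and the exception types are my own choices for illustration, and real name validation would be far more involved.

```java
final class Person {
    private final String name;
    private final int age;

    private Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String name() { return name; }
    int age() { return age; }

    static final class Builder {
        private String name;
        private Integer age; // null means "not set yet"

        Builder name(String name) {
            // Deliberately simplistic rule, assumed for this example only.
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("name must not be blank");
            }
            this.name = name.trim();
            return this;
        }

        Builder age(int age) {
            if (age < 0) {
                throw new IllegalArgumentException("age must not be negative");
            }
            this.age = age;
            return this;
        }

        Person build() {
            // Required-field check happens once, right before the object exists.
            if (name == null || age == null) {
                throw new IllegalStateException("name and age are both required");
            }
            return new Person(name, age);
        }
    }
}
```

As the rules grow, they pile up in the Builder while Person itself stays small.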

A little different angle on this from what I see in other answers.

The withFoo approach here is problematic because the methods behave like setters but are defined in a way that makes it appear the class supports immutability. In Java classes, if a method modifies a property, it's customary to start the method name with 'set'. I've never loved it as a standard but if you do something else it's going to surprise people and that's not good. There's another way that you can support immutability with the basic API you have here. For example:

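class Person {
  private final String name;
  private final Integer age;

  // Private: only the with* methods below supply both fields at once.
  private Person(String name, Integer age) {
    this.name = name;
    this.age = age;
  }

  // Starting point: an "empty" person with nothing set yet.
  public Person() {
    this(null, null);
  }

  // Each with* method leaves this object untouched and returns a modified copy.
  Person withName(String name) {
    return new Person(name, this.age);
  }

  Person withAge(int age) {
    return new Person(this.name, age);
  }
}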

It doesn't provide much in the way of preventing improperly or partially constructed objects but does prevent changes to any existing objects. It's probably silly for this kind of thing (and so is the JB Builder). Yes you will create more objects but this is not as expensive as you might think.

You'll mostly see this kind of approach used with concurrent data structures such as CopyOnWriteArrayList. And this hints at why immutability is important. If you want to make your code thread-safe, immutability should almost always be considered. In Java, each thread is allowed to keep a local cache of variable state. In order for one thread to see the changes made in other threads, a synchronized block or other concurrency feature needs to be employed. Any of these will add some overhead to the code. But if your variables are final, there's nothing to do. The value will always be what it was initialized to and therefore all threads see the same thing no matter what.

Re-use builder object

As others have mentioned, immutability and verifying the business logic of all fields to validate the object are the major reasons for a separate builder object.

However, reusability is another benefit. If I want to instantiate many objects that are very similar I can make small changes to the builder object and continue instantiating. No need to recreate the builder object. This reuse allows the builder to act as a template for creating many immutable objects. It is a small benefit, but it could be a useful one.
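A minimal sketch of that template idea, assuming a plain Person.Builder with name and age (the exact shape of the classes is my assumption). One builder instance is kept around, and only the field that differs is changed between build() calls.

```java
final class Person {
    private final String name;
    private final int age;

    private Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String name() { return name; }
    int age() { return age; }

    static final class Builder {
        private String name;
        private int age;

        Builder name(String name) { this.name = name; return this; }
        Builder age(int age) { this.age = age; return this; }

        // Copies the current values, so later builder changes never touch built objects.
        Person build() { return new Person(name, age); }
    }
}
```

Usage might look like this:

```java
Person.Builder template = new Person.Builder().age(42); // shared settings
Person arthur = template.name("Arthur Dent").build();
Person ford = template.name("Ford Prefect").build();    // same age, different name
```

This is only safe because build() copies the values into the new Person; if it handed out references to mutable state inside the builder, the earlier objects would change along with it.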

The Builder pattern is used to create an object step by step: you set the properties and, once all the required fields are set, the build method returns the final object. The newly created object is immutable. The main point to note here is that the object is only returned when the final build method is invoked. This ensures that all the properties have been set and thus the object is not in an inconsistent state when it is returned by the builder class.

If we don't use a builder class and instead put all the builder methods directly in the Person class itself, then we have to create the object first and then invoke the setter methods on it, which leaves the object in an inconsistent state between its creation and the setting of its properties.

Thus, by using a builder class (i.e. some external entity other than the Person class itself), we ensure that the object will never be in an inconsistent state.

Actually, you can have the builder methods on your class itself, and still have immutability. It just means the builder methods will return new objects, instead of modifying the existing one.

This only works if there is a way of getting an initial (valid/useful) object (e.g. from a constructor which sets all the required fields, or a factory method which sets default values), and the additional builder methods then return modified objects based on the existing one. Those builder methods need to make sure you don't get invalid/inconsistent objects on the way.

Of course, this means you'll have a lot of new objects, and you should not do this if your objects are expensive to create.

I used that in test code for creating Hamcrest matchers for one of my business objects. I don't recall the exact code, but it looked something like this (simplified):

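import java.time.LocalDate;

import org.hamcrest.Description;
import org.hamcrest.Matcher;
import org.hamcrest.Matchers;
import org.hamcrest.TypeSafeMatcher;

public class CustomerMatcher extends TypeSafeMatcher<Customer> {
    private final Matcher<? super String> nameMatcher;
    private final Matcher<? super LocalDate> birthdayMatcher;

    @Override
    protected boolean matchesSafely(Customer c) {
        return nameMatcher.matches(c.getName()) &&
               birthdayMatcher.matches(c.getBirthday());
    }

    // TypeSafeMatcher also requires a description, used in failure messages.
    @Override
    public void describeTo(Description description) {
        description.appendText("a customer with name ")
                   .appendDescriptionOf(nameMatcher)
                   .appendText(" and birthday ")
                   .appendDescriptionOf(birthdayMatcher);
    }

    private CustomerMatcher(Matcher<? super String> nameMatcher,
                            Matcher<? super LocalDate> birthdayMatcher) {
        this.nameMatcher = nameMatcher;
        this.birthdayMatcher = birthdayMatcher;
    }

    // builder methods from here on

    public static CustomerMatcher isCustomer() {
        // I could return a static instance here instead
        return new CustomerMatcher(Matchers.anything(), Matchers.anything());
    }

    // Each with* method returns a new matcher and leaves this one unchanged.
    public CustomerMatcher withBirthday(Matcher<? super LocalDate> birthdayMatcher) {
        return new CustomerMatcher(this.nameMatcher, birthdayMatcher);
    }

    public CustomerMatcher withName(Matcher<? super String> nameMatcher) {
        return new CustomerMatcher(nameMatcher, this.birthdayMatcher);
    }
}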

I would then use it like this in my unit tests (with suitable static imports):

assertThat(result, isCustomer().withName(startsWith("Paŭlo")));

Another reason that hasn't been explicitly mentioned here already is that the build() method can verify that all fields contain valid values (either set directly, or derived from the values of other fields). Invalid field values are probably the most likely failure mode that would otherwise occur.

Another benefit is that your Person object would end up having a simpler lifetime and a simple set of invariants. You know that once you have a Person p, you have a p.name and a valid p.age. None of your methods have to be designed to handle situations like "well what if age is set but not name, or what if name is set but not age?" This reduces the overall complexity of the class.
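To illustrate a check that spans several fields, here is a sketch using a made-up DateRange class (not from the question). No single setter can validate that the end is not before the start, but build() sees both values and can also compute the derived length.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

final class DateRange {
    private final LocalDate start;
    private final LocalDate end;
    private final long days; // derived from the other two fields

    private DateRange(LocalDate start, LocalDate end, long days) {
        this.start = start;
        this.end = end;
        this.days = days;
    }

    LocalDate start() { return start; }
    LocalDate end() { return end; }
    long days() { return days; }

    static final class Builder {
        private LocalDate start;
        private LocalDate end;

        Builder start(LocalDate start) { this.start = start; return this; }
        Builder end(LocalDate end) { this.end = end; return this; }

        DateRange build() {
            // Cross-field checks are only possible once every value is known.
            if (start == null || end == null) {
                throw new IllegalStateException("start and end are both required");
            }
            if (end.isBefore(start)) {
                throw new IllegalStateException("end must not be before start");
            }
            return new DateRange(start, end, ChronoUnit.DAYS.between(start, end));
        }
    }
}
```

Because the setters can be called in any order, build() is the natural place for the comparison.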

A builder can also be defined to return an interface or an abstract class. You can use the builder to define the object, and the builder can determine what concrete subclass to return based on what properties are set, or what they are set to, for example.
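A small sketch of that idea. Shape, Square, Rectangle and ShapeBuilder are all hypothetical names for this example: the caller only ever sees the Shape interface, and build() picks the concrete class from the values that were set.

```java
interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side;

    Square(double side) { this.side = side; }

    @Override public double area() { return side * side; }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override public double area() { return width * height; }
}

final class ShapeBuilder {
    private double width;
    private double height;

    ShapeBuilder width(double width) { this.width = width; return this; }
    ShapeBuilder height(double height) { this.height = height; return this; }

    // The return type is the interface; the concrete class is the builder's decision.
    Shape build() {
        return width == height ? new Square(width) : new Rectangle(width, height);
    }
}
```

The calling code never names Square or Rectangle, so the builder is free to change which classes it returns later.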

Licensed under: CC-BY-SA with attribution
