Create new object or reset every property?

https://softwareengineering.stackexchange.com/questions/287816

09-10-2020

Question

public class MyClass
{
    public object Prop1 { get; set; }

    public object Prop2 { get; set; }

    public object Prop3 { get; set; }
}

Suppose I have an object myObject of type MyClass and I need to reset its properties. Is it better to create a new object or to reassign each property? Assume I have no further use for the old instance.

// Option 1: create a new object
myObject = new MyClass();

// Option 2: reassign each property
myObject.Prop1 = null;
myObject.Prop2 = null;
myObject.Prop3 = null;

Solution

Instantiating a new object is almost always better: you then have one place to initialise the properties (the constructor) and can easily update it.

Imagine you add a new property to the class. You would rather update the constructor than add a new method that also re-initialises all the properties.

Now, there are cases where you might want to re-use an object, such as when a property is very expensive to re-initialise and you want to keep it. That is a more specialised situation, though, and you would have dedicated methods to re-initialise all the other properties. Even then, you would sometimes still want to create a new object.

OTHER TIPS

You should definitely prefer creating a new object in the vast majority of cases. Problems with reassigning all properties:

Requires public setters on all properties, which drastically limits the level of encapsulation you can provide
Knowing whether you have any additional use for the old instance means you need to know everywhere that the old instance is being used. So if class A and class B are both passed instances of class C then they have to know whether they'll ever be passed the same instance, and if so whether the other one is still using it. This tightly couples classes which otherwise have no reason to be.
Leads to repeated code. As gbjbaanb indicated, if you add a parameter to the constructor, every call to the constructor will fail to compile, so there is no danger of missing a spot. If you just add a public property, you have to manually find and update every place where objects are "reset".
Increases complexity. Imagine you're creating and using an instance of the class in a loop. If you use your method, then you now have to do separate initialization the first time through the loop, or before the loop starts. Either way is additional code you have to write to support these two ways of initializing.
Means your class can't protect itself from being in an invalid state. Imagine you wrote a Fraction class with a numerator and denominator, and wanted to enforce that it was always reduced (i.e. the gcd of the numerator and denominator was 1). This is impossible to do nicely if you want to allow people to set the numerator and denominator publicly, as they may transition through an invalid state to get from one valid state to another. E.g. 1/2 (valid) -> 2/2 (invalid) -> 2/3 (valid).
Isn't at all idiomatic for the language you're working in, increasing the cognitive friction for anybody maintaining the code.
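
The Fraction point above can be sketched roughly as follows. This is only an illustration, not code from any of the answers: the member names (Numerator, Denominator, Gcd) are assumptions. Because the values are set only in the constructor, no caller can ever observe an unreduced fraction, and "changing" a fraction means constructing a new one.

```csharp
using System;

// Hypothetical immutable Fraction: the invariant (always reduced, positive
// denominator) is established once in the constructor and cannot be broken.
public sealed class Fraction
{
    public long Numerator { get; }
    public long Denominator { get; }

    public Fraction(long numerator, long denominator)
    {
        if (denominator == 0)
            throw new ArgumentException("Denominator must not be zero.", nameof(denominator));

        // Reduce by the greatest common divisor and normalise the sign.
        long gcd = Gcd(Math.Abs(numerator), Math.Abs(denominator));
        long sign = denominator < 0 ? -1 : 1;
        Numerator = sign * numerator / gcd;
        Denominator = sign * denominator / gcd;
    }

    // Euclid's algorithm on non-negative inputs.
    private static long Gcd(long a, long b)
    {
        while (b != 0)
        {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    public override string ToString() => $"{Numerator}/{Denominator}";
}
```

With this shape, going from 1/2 to 2/3 is simply `new Fraction(2, 3)`, so the intermediate 2/2 state from the example never exists.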

These are all pretty significant problems. And what you get in return for the extra work you create is... nothing. Creating instances of objects is, in general, incredibly cheap, so the performance benefit will almost always be totally negligible.

As the other answer mentioned, the only time performance might be a relevant concern is if your class does some significantly expensive work on construction. But even in that case, for this technique to work you'd need to be able to separate out the expensive part from the properties you're resetting, so you'd be able to use the flyweight pattern or similar instead.
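
One way to picture "separating out the expensive part" is sketched below. It is a simplified variant of the idea (a single shared instance rather than a full keyed flyweight factory), and SharedTable, Squares and SquareOf are hypothetical names invented for the illustration. The costly state is built once and shared, so creating a fresh MyClass stays cheap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical expensive, immutable state that is built once and shared.
public sealed class SharedTable
{
    private static readonly Lazy<SharedTable> instance =
        new Lazy<SharedTable>(() => new SharedTable());

    public static SharedTable Instance => instance.Value;

    public IReadOnlyList<int> Squares { get; }

    private SharedTable()
    {
        // Stand-in for costly construction work.
        var squares = new int[1000];
        for (int i = 0; i < squares.Length; i++) squares[i] = i * i;
        Squares = squares;
    }
}

// Cheap per-use object: "resetting" it is just creating another one,
// because the expensive part is referenced, not rebuilt.
public class MyClass
{
    private readonly SharedTable table = SharedTable.Instance;

    public object Prop1 { get; set; }
    public object Prop2 { get; set; }
    public object Prop3 { get; set; }

    public int SquareOf(int n) => table.Squares[n];
}
```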

As a side note, some of the problems above could be mitigated somewhat by not using setters and instead having a public Reset method on your class which takes the same parameters as the constructor. If for some reason you did want to go down this reset route, that would probably be a much better way of doing it.
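
A minimal sketch of that Reset variant, assuming the class is constructed from its three property values (the parameter names are made up for the example):

```csharp
// Hypothetical sketch: Reset takes the same parameters as the constructor,
// and the constructor delegates to it, so initialisation lives in one place.
public class MyClass
{
    // Private setters: callers can no longer change properties one by one.
    public object Prop1 { get; private set; }
    public object Prop2 { get; private set; }
    public object Prop3 { get; private set; }

    public MyClass(object prop1, object prop2, object prop3)
    {
        Reset(prop1, prop2, prop3);
    }

    public void Reset(object prop1, object prop2, object prop3)
    {
        Prop1 = prop1;
        Prop2 = prop2;
        Prop3 = prop3;
    }
}
```

Adding a parameter here breaks every constructor and Reset call at compile time, which removes the "missed a spot" risk. It does nothing for the problem of other code still holding a reference to the instance being reset.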

Still, the additional complexity and repetition that this adds, along with the points above that it doesn't address, make a very persuasive argument against doing it, especially when weighed against the nonexistent benefits.

Given the very generic example, it's hard to tell. If "resetting the properties" makes semantic sense in the domain, it will make more sense to the consumer of your class to call

MyObject.Reset(); // Sets all necessary properties to null

than

MyObject = new MyClass();

I would NEVER require the consumer of your class to call

MyObject.Prop1 = null;
MyObject.Prop2 = null; // and so on

If the class represents something that can be reset, it should expose that functionality through a Reset() method, rather than relying on calling the constructor or manually setting its properties.

As Harrison Paine and Brandin suggest, I would re-use the same object and factor the initialization of the properties into a Reset method:

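public class MyClass
{
    // The constructor delegates to Reset so initialization lives in one place.
    public MyClass() { this.Reset(); }

    public void Reset()
    {
        // Assign whatever initial values the class needs; null is a placeholder.
        this.Prop1 = null;
        this.Prop2 = null;
        this.Prop3 = null;
    }

    public object Prop1 { get; set; }

    public object Prop2 { get; set; }

    public object Prop3 { get; set; }
}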

Suppose the intended usage pattern for a class is that a single owner keeps the only reference to each instance, and that owners commonly have loops which must, many times over, "fill in" a blank instance, use it temporarily, and then never need it again (StringBuilder is a common class meeting these criteria). In that case it may be useful, from a performance standpoint, for the class to include a method that resets an instance to like-new condition. Such an optimization isn't likely to be worth much if the alternative would only require creating a few hundred instances, but the cost of creating millions or billions of object instances can add up.
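
The StringBuilder case might look like the sketch below. LineFormatter and FormatRows are hypothetical names; StringBuilder.Clear() is the real framework method that empties the builder so it can be filled again.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LineFormatter
{
    // Formats each row as a comma-separated line, reusing one StringBuilder.
    public static List<string> FormatRows(IEnumerable<int[]> rows)
    {
        var lines = new List<string>();
        var sb = new StringBuilder();   // single owner, never shared

        foreach (int[] row in rows)
        {
            sb.Clear();                 // reset to empty instead of allocating a new builder
            for (int i = 0; i < row.Length; i++)
            {
                if (i > 0) sb.Append(',');
                sb.Append(row[i]);
            }
            lines.Add(sb.ToString());
        }
        return lines;
    }
}
```

The reuse is safe here only because the builder never escapes the method; each result is copied out with ToString() before the next Clear().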

On a related note, there are a few patterns that can be used for methods that need to return data in an object:

Method creates new object; returns reference.
Method accepts a reference to a mutable object, and fills it in.
Method accepts a reference-type variable as a ref parameter and either uses the existing object if suitable, or changes the variable to identify a new object.

The first approach is often the easiest semantically. The second is a little awkward on the calling side, but may offer better performance if a caller will frequently be able to create one object and use it thousands of times. The third approach is a little awkward semantically, but can be helpful if a method will need to return data in an array and the caller won't know the required array size. If the calling code holds the only reference to the array, rewriting that reference with a reference to a larger array will be semantically equivalent to simply making the array bigger (which is the desired behavior semantically). While using List<T> may be nicer than using a manually-resized array in many cases, arrays of structures offer better semantics and better performance than lists of structures.
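
The three patterns can be put side by side in a small sketch. Reading, Summarize, SummarizeInto and CopyPositives are hypothetical names chosen for the illustration.

```csharp
using System;

public class Reading            // hypothetical mutable result type
{
    public int Count;
    public double Total;
}

public static class Patterns
{
    // 1. Method creates a new object and returns a reference to it.
    public static Reading Summarize(double[] samples)
    {
        var result = new Reading();
        SummarizeInto(samples, result);
        return result;
    }

    // 2. Method accepts a reference to a mutable object and fills it in.
    public static void SummarizeInto(double[] samples, Reading result)
    {
        result.Count = samples.Length;
        result.Total = 0;
        foreach (double s in samples) result.Total += s;
    }

    // 3. Method takes a ref parameter: it reuses the caller's array if it is
    //    big enough, otherwise it replaces the variable with a larger array.
    //    Returns the number of elements written.
    public static int CopyPositives(double[] samples, ref double[] buffer)
    {
        if (buffer == null || buffer.Length < samples.Length)
            buffer = new double[samples.Length];

        int n = 0;
        foreach (double s in samples)
            if (s > 0) buffer[n++] = s;
        return n;
    }
}
```

In the third pattern the caller passes `ref buffer` on every call and keeps using whatever array the variable refers to afterwards, which is what makes the replacement look like "the array got bigger".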

I think most of the answers favoring creating a new object are missing a critical scenario: garbage collection (GC). GC can cause a real performance hit in applications that create a lot of objects (think games or scientific applications).

Let's say I have an expression tree that represents a mathematical expression, where the inner nodes are function nodes (ADD, SUB, MUL, DIV) and the leaf nodes are terminal nodes (X, e, PI). If my node class has an Evaluate() method, it will likely call itself recursively on the children and collect data via a Data object. At the very least, all the leaf nodes will need to create a Data object, and those can then be re-used on the way up the tree until the final value is evaluated.

Now let's say I have thousands of these trees and I'm evaluating them in a loop. All those Data objects are going to trigger GCs and cause a performance hit, and a big one (up to 40% CPU utilization loss for some runs in my app; I ran a profiler to get the data).

A possible solution? Re-use those Data objects and just call some .Reset() on them after you're done using them. The leaf nodes will no longer call 'new Data()'; they'll call some factory method that handles the lifecycle of the object.
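
A minimal sketch of such a factory, assuming a single-threaded evaluation loop. The shape of Data and the names DataPool, Rent and Return are assumptions for the illustration, not the code from the application described above.

```csharp
using System.Collections.Generic;

// Hypothetical Data object that can be wiped and reused.
public sealed class Data
{
    public double Value;

    public void Reset() { Value = 0; }
}

// Hypothetical factory/pool that owns the lifecycle of Data objects.
// Not thread-safe; a sketch for single-threaded evaluation loops.
public sealed class DataPool
{
    private readonly Stack<Data> free = new Stack<Data>();

    // Hands out a recycled instance if one is available, else a new one.
    public Data Rent()
    {
        return free.Count > 0 ? free.Pop() : new Data();
    }

    // Resets the instance and makes it available again. The caller must
    // not touch the object after returning it.
    public void Return(Data data)
    {
        data.Reset();
        free.Push(data);
    }
}
```

A leaf node would call `pool.Rent()` instead of `new Data()`, and whoever consumes the final value calls `pool.Return(...)` once it is done. The cost is exactly the coupling discussed earlier: every user of a Data object has to agree on when it is no longer in use.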

I know this because I have an application that has hit this issue and this resolved it.

Licensed under: CC-BY-SA with attribution
