Do we apply DRY principle only to the state of the object or also to its behavior?

StackOverflow https://stackoverflow.com/questions/20477525

  •  30-08-2022

Question

I'm struggling to understand the DRY principle.

1) I know the purpose of the DRY principle is to avoid duplication/repetition of information, but what does the term "information" refer to in the context of DRY? Does it refer only to the state of the object (i.e. a Person entity should have only a single property representing its birth date), or does the term also refer to the behavior of the object (i.e. a Dog entity should have only one method representing barking behavior)?

2) This question assumes the term "information" also refers to behavior:

DRY is all about not repeating the same information, which I interpret as meaning we shouldn't have two or more methods/code snippets doing exactly the same thing. But I assume the term "repetition" is used more loosely in the context of DRY? Namely, I've seen examples where DRY was also applied to methods/code snippets that have similar behavior (so these methods/code snippets didn't do exactly the same thing), and yet they were still replaced with a single method/code snippet?!

Thank you

Solution

1) "Information" refers to any piece of code:

  • from a large code snippet (manually calculating the square root of a number all over the place should be replaced with a Sqrt method)
  • to a simple value (using the string "Monday" all over the place, should be replaced with an enumeration of the days in a week).
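
As a rough illustration of the second bullet, here is a minimal Java sketch. The `Day` enum, the `Schedule` class, and both `isWeekend...` methods are hypothetical names made up for this example; they are not part of the original answer.

```java
// Hypothetical example: every name below is illustrative only.
enum Day {
    MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY
}

class Schedule {
    // Before: day names are repeated as raw strings wherever they are needed.
    // A typo such as "Munday" still compiles and simply never matches.
    static boolean isWeekendByString(String day) {
        return day.equals("Saturday") || day.equals("Sunday");
    }

    // After: the set of valid days is written down once, in the enum,
    // and the compiler rejects anything that is not one of its constants.
    static boolean isWeekend(Day day) {
        return day == Day.SATURDAY || day == Day.SUNDAY;
    }
}
```

In real Java code the standard library already offers such an enumeration, `java.time.DayOfWeek`, so a hand-written `Day` would itself be a repetition.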

2) If two methods are similar, then one can argue that some parts of them do the exact same thing. If that's the case, and if it's feasible to generalize the logic in those two methods to solve one generic problem instead of two specific problems, then they should be refactored to comply with the DRY principle.

If it's not feasible to generalize the whole algorithm, then consider at least refactoring the parts that do the exact same thing.
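
A small sketch of what such a generalization might look like, assuming a made-up pricing scenario. The `Pricing` class, its methods, and the discount percentages are all hypothetical and only serve to show two similar methods collapsing into one parameterized method.

```java
// Hypothetical example: names and discount rates are illustrative only.
class Pricing {
    // Before: two methods that are similar but not identical.
    static long studentPriceCents(long cents) {
        return cents - cents * 10 / 100;
    }

    static long seniorPriceCents(long cents) {
        return cents - cents * 20 / 100;
    }

    // After: the shared calculation is written once, and the only part
    // that differed (the percentage) becomes a parameter.
    static long discountedPriceCents(long cents, int percentOff) {
        return cents - cents * percentOff / 100;
    }
}
```

If the two specific methods are still useful as named entry points, they can stay and simply delegate to `discountedPriceCents`, so the formula itself exists in only one place.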

Other tips

The DRY principle exists primarily to make enhancement and maintenance easier. The idea is that every time duplication is introduced, maintainability and extensibility go down. Here are two examples:

Duplicated Code

public class Dog {

    int barkCount = 0;

    public void bark(){
        System.out.println("bark");
        barkCount++;
    }

    public void defendHouse(){
        // Same two statements as bark(), copied instead of reused
        System.out.println("bark");
        barkCount++;
        System.out.println("run in circles");
    }

}

While this example is somewhat primitive, you'll see that the logic in bark is duplicated in defendHouse. This is undesirable for several reasons:

  • The code is harder to read (longer, more parts, developer has to notice duplication)
  • The code is harder to maintain (a developer may forget to update defendHouse() logic when they update bark() logic.)

Both of these bullet points are big considerations in long-lived software (hint: all software is long-lived) because they are recurring costs, incurred every time the code is read or changed. It gets even worse when the duplication is spread over greater distances; the duplicated logic may live in different files or packages, for example.

Duplicated Data

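import java.time.LocalDate;

// LocalDate is used here because java.util.Date has no getDayOfWeek() method
public class Person {
    String birthDay = null;      // repeats information already contained in birthDate
    LocalDate birthDate = null;

    public void setBirthDate(LocalDate newDate){
        birthDate = newDate;
        birthDay = newDate.getDayOfWeek().toString();
    }

    public void clearBirthDate(){
        birthDate = null;
        birthDay = null;
    }

    public String getBirthDay(){
        if(birthDate == null){
            return null;
        } else {
            return birthDate.getDayOfWeek().toString();
        }
    }
}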

The problem is that birthDay can always be derived from birthDate, so storing both duplicates information. The biggest issues here are:

  • Data integrity: a developer may fail to update one field when another field changes. It can be difficult to guarantee consistency (for example, if newDate.getDayOfWeek() throws an exception then the fields may get out of sync).
  • Readability: This code is harder to read because a developer has to notice that birthDay and birthDate are associated (but only by convention).
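
The data-integrity risk can be reproduced with a short sketch. It assumes `java.time.LocalDate` stands in for the date type, and the class names `PersonWithDuplicatedData` and `OutOfSyncDemo` are hypothetical. Passing `null` to the setter makes `getDayOfWeek()` throw after the first field has already been overwritten.

```java
import java.time.LocalDate;

// Hypothetical example: a trimmed-down version of the class with two fields
// that are supposed to stay in sync.
class PersonWithDuplicatedData {
    String birthDay = null;
    LocalDate birthDate = null;

    void setBirthDate(LocalDate newDate) {
        birthDate = newDate;
        // Throws NullPointerException when newDate is null, but only
        // after birthDate has already been overwritten on the line above.
        birthDay = newDate.getDayOfWeek().toString();
    }
}

class OutOfSyncDemo {
    public static void main(String[] args) {
        PersonWithDuplicatedData p = new PersonWithDuplicatedData();
        p.setBirthDate(LocalDate.of(2000, 1, 1)); // a Saturday

        try {
            p.setBirthDate(null);
        } catch (NullPointerException e) {
            // The caller recovers, but the object is now inconsistent.
        }

        // birthDate is null while birthDay still holds the old value.
        System.out.println(p.birthDate + " / " + p.birthDay); // prints "null / SATURDAY"
    }
}
```

With a single `birthDate` field and a computed day of week, there is no second field that could be left stale.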

For the sake of completeness, here are the two examples improved, followed by my thoughts on when to violate the DRY principle.

Cleaned up: Duplicated Code

public class Dog {

    int barkCount = 0;

    public void bark(){
        System.out.println("bark");
        barkCount++;
    }

    public void defendHouse(){
        // Reuse bark() so the barking logic lives in one place
        bark();
        System.out.println("run in circles");
    }

}

Cleaned up: Duplicated Data

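import java.time.LocalDate;

// LocalDate is used here because java.util.Date has no getDayOfWeek() method
public class Person {
    LocalDate birthDate = null;

    public void setBirthDate(LocalDate newDate){
        birthDate = newDate;
    }

    public void clearBirthDate(){
        birthDate = null;
    }

    // The day of the week is computed on demand instead of being stored
    public String getBirthDay(){
        if(birthDate == null){
            return null;
        } else {
            return birthDate.getDayOfWeek().toString();
        }
    }

}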

Additional Thoughts

So when is it okay to duplicate code/data? This section is going to be heavily based on my experiences/opinions, so be ready to disagree.

  • For very simple code (like simple expressions) duplication may be acceptable. This is only true if the expression is trivial to read, hard to get wrong, and not easily grouped with some logical entity nearby.
  • When the language doesn't support abstractions to remove the duplication. For example, before Java 8 added lambdas, Java had no closures, so it could be wearying to remove duplication from comparators and other 'function-objects'. Certain kinds of duplication were common as a result.
  • Once a performance issue has actually been experienced, data duplication may be needed to speed things up.
  • You don't have enough time to get it 'just right'. This point is really more about picking your battles. Some kinds of duplication are more hazardous than others. Often, a change in requirements can force duplication into a well-designed system, and the only fix may be expensive. In these circumstances it makes sense to talk to your team/managers and decide how important the fix is.
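
To make the comparator bullet concrete, here is a hedged sketch contrasting the anonymous-class style with the lambda-based factory methods available since Java 8. The `Employee` and `EmployeeSorting` classes are hypothetical; `Comparator.comparing` and `Comparator.comparingInt` are standard library methods.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical example class, used only for illustration.
class Employee {
    final String name;
    final int age;

    Employee(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class EmployeeSorting {
    // Anonymous-class style: every new ordering repeats this boilerplate,
    // with only the compared field changing.
    static final Comparator<Employee> BY_NAME_VERBOSE = new Comparator<Employee>() {
        @Override
        public int compare(Employee a, Employee b) {
            return a.name.compareTo(b.name);
        }
    };

    // Java 8+ style: only the part that differs (which key to compare) is written.
    static final Comparator<Employee> BY_NAME = Comparator.comparing((Employee e) -> e.name);
    static final Comparator<Employee> BY_AGE = Comparator.comparingInt((Employee e) -> e.age);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Employee> sorted(List<Employee> input, Comparator<Employee> order) {
        List<Employee> copy = new ArrayList<>(input);
        copy.sort(order);
        return copy;
    }
}
```

Whether the shorter form is worth adopting in an existing codebase is still the kind of judgment call described in the last bullet.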
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow