Question

I have a function that processes a given vector, but may also create such a vector itself if it is not given.

I see two design choices for such a case, where a function parameter is optional:

Make it a pointer and make it NULL by default:

void foo(int i, std::vector<int>* optional = NULL) {
  std::vector<int> defaults;
  if(optional == NULL){
    // fill defaults with data
    optional = &defaults;  // point at the locally created vector instead of leaking a new-ed one
  }
  // process vector
}

Or have two functions with an overloaded name, one of which leaves out the argument:

void foo(int i) {
   std::vector<int> vec;
   // fill vec with data
   foo(i, vec);
}

void foo(int i, const std::vector<int>& optional) {
  // process vector
}

Are there reasons to prefer one solution over the other?

I slightly prefer the second one because I can make the vector a const reference, since it is, when provided, only read, not written. Also, the interface looks cleaner (isn't NULL just a hack?). And the performance difference resulting from the indirect function call is probably optimized away.

Yet, I often see the first solution in code. Are there compelling reasons to prefer it, apart from programmer laziness?

Solution

I would definitely favour the 2nd approach of overloaded methods.

The first approach (optional parameters) blurs the definition of the method as it no longer has a single well-defined purpose. This in turn increases the complexity of the code, making it more difficult for someone not familiar with it to understand it.

With the second approach (overloaded methods), each method has a clear purpose. Each method is well-structured and cohesive. Some additional notes:

  • If there's code that would otherwise be duplicated in both methods, it can be extracted into a separate method that each overload calls (see the sketch after this list).
  • I would go a step further and name each method differently to indicate the differences between the methods. This will make the code more self-documenting.
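
As a rough illustration of that first point, the shared work can be extracted into one place. The helper name processVector and the fill logic are made up here, not part of the original example:

#include <vector>

void processVector(int i, const std::vector<int>& data) {
  // the common processing logic lives here, called by both overloads
}

void foo(int i, const std::vector<int>& data) {
  processVector(i, data);
}

void foo(int i) {
  std::vector<int> defaults;
  // fill defaults with data
  processVector(i, defaults);
}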

OTHER TIPS

I would not use either approach.

In this context, the purpose of foo() seems to be to process a vector. That is, foo()'s job is to process the vector.

But in the one-parameter version of foo(), it is implicitly given a second job: to create the vector. The semantics of the two foo() functions are therefore not the same.

Instead of doing this, I would consider having just one foo() function to process a vector, and another function which creates the vector, if you need such a thing.

For example:

void foo(int i, const std::vector<int>& optional) {
  // process vector
}

std::vector<int>* makeVector() {
   return new std::vector<int>;
}

Obviously these functions are trivial, and if all makeVector() needs to do to get its job done is literally just call new, then there may be no point in having the makeVector() function. But I'm sure that in your actual situation these functions do much more than what is being shown here, and my code above illustrates a fundamental approach to semantic design: give one function one job to do.

The design I have above for the foo() function also illustrates another fundamental approach that I personally use in my code when it comes to designing interfaces -- which includes function signatures, classes, etc. It is this: I believe that a good interface is 1) easy and intuitive to use correctly, and 2) difficult or impossible to use incorrectly. In the case of the foo() function we are implicitly saying that, with my design, the vector is required to already exist and be 'ready'. By designing foo() to take a reference instead of a pointer, it is both intuitive that the caller must already have a vector, and they are going to have a hard time passing in something that isn't a ready-to-go vector.

While I do understand the complaints of many people regarding default parameters and overloads, there seems to be a lack of understanding of the benefits that these features provide.

Default Parameter Values:
First I want to point out that upon the initial design of a project, there should be little to no use for defaults if it is well designed. However, where the greatest asset of defaults comes into play is with existing projects and well-established APIs. I work on projects that consist of millions of existing lines of code and do not have the luxury of re-coding them all. So when you wish to add a new feature which requires an extra parameter, a default is needed for the new parameter; otherwise you will break everyone who uses your project. That would be fine with me personally, but I doubt your company or the users of your product/API would appreciate having to re-code their projects on every update. Simply put, defaults are great for backwards compatibility! This is usually the reason you will see defaults in big APIs or existing projects.
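
A minimal sketch of that situation (saveRecord and the compress flag are invented purely for illustration): the extra parameter gets a default value, so every existing call site keeps compiling and behaving as before.

#include <string>

// Existing API, already called as saveRecord(id, name) in thousands of places.
// The new 'compress' option is added with a default, so no existing caller breaks.
void saveRecord(int id, const std::string& name, bool compress = false) {
  // original behaviour when compress == false,
  // new behaviour when a caller opts in with compress == true
}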

Function Overloads: The benefit of function overloads is that they allow for the sharing of a functionality concept, but with different options/parameters. However, many times I see function overloads lazily used to provide starkly different functionality, with just slightly different parameters. In that case they should each be separately named functions, named for their specific functionality (as with the OP's example).

These features of C/C++ are good and work well when used properly, which can be said of most any programming feature. It is when they are abused/misused that they cause problems.

Disclaimer:
I know that this question is a few years old, but since these answers came up in my search results today (2012), I felt this needed further addressing for future readers.

A reference can't be NULL in C++, so a really good solution would be to use a Nullable template. This would let you do things like ref.isNull().

Here you can use this:

template<class T>
class Nullable {
public:
    Nullable() {
        m_set = false;
    }
    explicit
    Nullable(T value) {
        m_value = value;
        m_set = true;
    }
    Nullable(const Nullable &src) {
        m_set = src.m_set;
        if(m_set)
            m_value = src.m_value;
    }
    Nullable & operator =(const Nullable &RHS) {
        m_set = RHS.m_set;
        if(m_set)
            m_value = RHS.m_value;
        return *this;
    }
    bool operator ==(const Nullable &RHS) const {
        if(!m_set && !RHS.m_set)
            return true;
        if(m_set != RHS.m_set)
            return false;
        return m_value == RHS.m_value;
    }
    bool operator !=(const Nullable &RHS) const {
        return !operator==(RHS);
    }

    bool GetSet() const {
        return m_set;
    }

    bool isNull() const {
        return !m_set;
    }

    const T &GetValue() const {
        return m_value;
    }

    T GetValueDefault(const T &defaultValue) const {
        if(m_set)
            return m_value;
        return defaultValue;
    }
    void SetValue(const T &value) {
        m_value = value;
        m_set = true;
    }
    void Clear()
    {
        m_set = false;
    }

private:
    T m_value;
    bool m_set;
};

Now you can have

void foo(int i, const Nullable<AnyClass> &optional = Nullable<AnyClass>()) {
   // you can do
   if(optional.isNull()) {
       // no value was supplied, so fall back to some default behaviour
   }
}
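
For completeness, a possible call side, assuming AnyClass is some value type as above; because the constructor is explicit, the wrapper has to be spelled out:

AnyClass value;
foo(1);                            // no second argument: optional.isNull() is true
foo(2, Nullable<AnyClass>(value)); // explicitly wrapped value: isNull() is false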

I agree, I would use two functions. Basically, you have two different use cases, so it makes sense to have two different implementations.

I find that the more C++ code I write, the fewer parameter defaults I have - I wouldn't really shed any tears if the feature was deprecated, though I would have to re-write a shed load of old code!

I usually avoid the first case. Note that those two functions are different in what they do. One of them fills a vector with some data. The other doesn't (it just accepts the data from the caller). I tend to give different names to functions that actually do different things. In fact, even as you write them, they are two functions:

  • foo_default (or just foo)
  • foo_with_values

At least I find this distinction cleaner in the long term, and for the occasional library/function user.
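
A minimal sketch of that naming split, reusing the question's example (the fill step is a placeholder):

#include <vector>

void foo_with_values(int i, const std::vector<int>& values) {
  // process vector
}

void foo_default(int i) {
  std::vector<int> values;
  // fill values with the default data
  foo_with_values(i, values);
}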

I, too, prefer the second one. While there is not much difference between the two, you are basically reusing the functionality of the primary method in the foo(int i) overload, and the primary overload works perfectly well without caring about the existence or absence of the other one, so there is better separation of concerns in the overload version.

In C++ you should avoid allowing valid NULL parameters whenever possible. The reason is that NULL arguments substantially reduce how self-documenting a call site is. I know this sounds extreme, but I work with APIs that take upwards of 10-20 parameters, half of which can validly be NULL. The resulting code is almost unreadable:

SomeFunction(NULL, pName, NULL, pDestination);

If you were to switch it to force const references, the code is simply forced to be more readable:

SomeFunction(
  Location::Hidden(),
  pName,
  SomeOtherValue::Empty(),
  pDestination);
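
Location, SomeOtherValue, and their Hidden()/Empty() factories are the answerer's invented names; as a sketch, one plausible shape for such a named placeholder is a small factory function on the type itself:

class Location {
public:
  Location(double latitude, double longitude)
    : m_hidden(false), m_latitude(latitude), m_longitude(longitude) { }

  // Named factory for the "no location" case, used instead of passing NULL.
  static Location Hidden() { return Location(); }

  bool isHidden() const { return m_hidden; }

private:
  Location() : m_hidden(true), m_latitude(0), m_longitude(0) { }

  bool   m_hidden;
  double m_latitude;
  double m_longitude;
};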

I'm squarely in the "overload" camp. Others have added specifics about your actual code example but I wanted to add what I feel are the benefits of using overloads versus defaults for the general case.

  • Any parameter can be "defaulted"
  • No gotcha if an overriding function uses a different value for its default.
  • It's not necessary to add "hacky" constructors to existing types in order to allow them to have a default.
  • Output parameters can be defaulted without needing to use pointers or hacky global objects.

To put some code examples on each:

Any parameter can be defaulted:

class A {}; class B {}; class C {};

void foo (A const &, B const &, C const &);

inline void foo (A const & a, C const & c)
{
  foo (a, B (), c);    // 'B' defaulted
}

No danger of overriding functions having different values for the default:

class A {
public:
  virtual void foo (int i = 0);
};

class B : public A {
public:
  virtual void foo (int i = 100);
};


void bar (A & a)
{
  a.foo ();           // Always uses '0', regardless of the dynamic type of 'a'
}

It's not necessary to add "hacky" constructors to existing types in order to allow them to be defaulted:

struct POD {
  int i;
  int j;
};

void foo (POD p);     // Adding default (other than {0, 0})
                      // would require constructor to be added
inline void foo ()
{
  POD p = { 1, 2 };
  foo (p);
}

Output parameters can be defaulted without needing to use pointers or hacky global objects:

void foo (int i, int & j);  // Default requires global "dummy" 
                            // or 'j' should be pointer.
inline void foo (int i)
{
  int j;
  foo (i, j);
}

The only exception to the rule regarding overloading versus defaults is for constructors, where it's currently not possible for a constructor to forward to another. (I believe C++0x will solve that, though.)
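
For reference, C++11 (what C++0x became) did add delegating constructors, so one constructor can now forward to another; a minimal sketch:

class Widget {
public:
  Widget(int value, bool verbose) : m_value(value), m_verbose(verbose) { }

  // C++11 delegating constructor: forwards to the two-argument one.
  Widget(int value) : Widget(value, false) { }

private:
  int  m_value;
  bool m_verbose;
};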

I would favour a third option: Separate into two functions, but do not overload.

Overloads, by nature, are less usable. They require the user to become aware of two options and figure out what the difference between them is, and if they're so inclined, to also check the documentation or the code to ensure which is which.

I would have one function that takes the parameter, and one that is called "createVectorAndFoo" or something like that (obviously naming becomes easier with real problems).

While this gives the function two responsibilities (and a long name), violating the single-responsibility rule, I believe it is preferable when your function really does do two things (create a vector and foo it).

Generally I agree with others' suggestion to use a two-function approach. However, if the vector created when the 1-parameter form is used is always the same, you could simplify things by making it static and using a default const& parameter:

// Either at global scope, or (better) inside a class
static std::vector<int> default_vector = populate_default_vector();

void foo(int i, std::vector<int> const& optional = default_vector) {
    ...
}

The first way is poorer because you cannot tell whether NULL was passed in accidentally or on purpose; if it was an accident, then you have likely caused a bug.

With the second one you can test (assert, whatever) for NULL and handle it appropriately.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow