Domanda

As noted by in the comments by @benjamin-gruenbaum this is called the Boolean trap:

Say I have a function like this

UpdateRow(var item, bool externalCall);

and in my controller, that value for externalCall will always be TRUE. What is the best way to call this function? I usually write

UpdateRow(item, true);

But I ask my self, should I declare a boolean, just to indicate what that 'true' value stands for? You can know that by looking the declaration of the function, but it's obviously faster and clearer if you just saw something like

bool externalCall = true;
UpdateRow(item, externalCall);

PD: Not sure if this question really fits here, if it doesn't, where could I get more info about this?

PD2: I didn't tag any language 'cause I thought it was a very generic problem. Anyway, I work with c# and the accepted answer works for c#

È stato utile?

Soluzione

There isn't always a perfect solution, but you have many alternatives to choose from:

  • Use named arguments, if available in your language. This works very well and has no particular drawbacks. In some languages, any argument can be passed as a named argument, e.g. updateRow(item, externalCall: true) (C#) or update_row(item, external_call=True) (Python).

    Your suggestion to use a separate variable is one way to simulate named arguments, but does not have the associated safety benefits (there's no guarantee that you used the correct variable name for that argument).

  • Use distinct functions for your public interface, with better names. This is another way of simulating named parameters, by putting the paremeter values in the name.

    This is very readable, but leads to a lot of boilerplate for you, who is writing these functions. It also can't deal well with combinatorial explosion when there are multiple boolean arguments. A significant drawback is that clients can't set this value dynamically, but must use if/else to call the correct function.

  • Use an enum. The problem with booleans is that they are called "true" and "false". So, instead introduce a type with better names (e.g. enum CallType { INTERNAL, EXTERNAL }). As an added benefit, this increases the type safety of your program (if your language implements enums as distinct types). The drawback of enums is that they add a type to your publicly visible API. For purely internal functions, this doesn't matter and enums have no significant drawbacks.

    In languages without enums, short strings are sometimes used instead. This works, and may even be better than raw booleans, but is very susceptible to typos. The function should then immediately assert that the argument matches a set of possible values.

None of these solutions has a prohibitive performance impact. Named parameters and enums can be resolved completely at compile time (for a compiled language). Using strings may involve a string comparison, but the cost of that is negligible for small string literals and most kinds of applications.

Altri suggerimenti

The correct solution is to do what you suggest, but package it into a mini-facade:

void updateRowExternally() {
  bool externalCall = true;
  UpdateRow(item, externalCall);
}

Readability trumps micro-optimization. You can afford the extra function call, certainly better than you can afford the developer effort of having to look up the semantics of the Boolean flag even once.

Say I have a function like UpdateRow(var item, bool externalCall);

Why do you have a function like this?

Under what circumstances would you call it with the externalCall argument set to different values?

If one is from, say, an external, client application and the other is within the same program (i.e. different code modules), then I'd argue that you should have two different methods, one for each case, possibly even defined on distinct Interfaces.

If, however, you were making the call based on some datum taken from an non-program source (say, a configuration file or database read), then the Boolean-passing method would make more sense.

While I agree it's ideal to use a language feature to enforce both readability and value-safety, you can also opt for a practical approach: call-time comments. Like:

UpdateRow(item, true /* row is an external call */);

or:

UpdateRow(item, true); // true means call is external

or (rightly, as suggested by Frax):

UpdateRow(item, /* externalCall */true);
  1. You can 'name' your bools. Below is an example for an OO language (where it can be expressed in the class providing UpdateRow()), however the concept itself can be applied in any language:

    class Table
    {
    public:
        static const bool WithExternalCall = true;
        static const bool WithoutExternalCall = false;
    

    and in the call site:

    UpdateRow(item, Table::WithExternalCall);
    
  2. I believe Item #1 is better but it doesn't force the user use new variables when using the function. If type safety is important for you, you can create an enum type and make UpdateRow() accept this instead of a bool:

    UpdateRow(var item, ExternalCallAvailability ec);

  3. You can modify the name of the function somehow that it reflects the meaning of the bool parameter better. Not very sure but maybe:

    UpdateRowWithExternalCall(var item, bool externalCall)

Another option I haven’t read here yet is: Use a modern IDE.

For example IntelliJ IDEA prints the variable name of the variables in the method you’re calling if you’re passing a literal such as true or null or username + “@company.com. This is done in a small font so it doesn’t take up too much space on your screen and looks very different from actual code.

I'm still not saying it's a good idea to throw in booleans everywhere. The argument saying you read code far more often than you write is often very strong, but for this particular case it does heavily depend on the technology you (and your colleagues!) use to support your job. With an IDE it's far less of a problem than with for example vim.

Modified example of part of my test setup: enter image description here

2 days and no ones mentioned polymorphism?

target.UpdateRow(item);

When I'm a client wanting to update a row with an item I don't want to think about the name of the database, the URL of the micro-service, the protocol used to communicate, or whether or not an external call is going to need to be used to make it happen. Stop pushing these details on me. Just take my item and go update a row somewhere already.

Do that and it becomes part of the construction problem. That can be solved many ways. Here's one:

Target xyzTargetFactory(TargetBuilder targetBuilder) {
    return targetBuilder
        .connectionString("some connection string")
        .table("table_name")
        .external()
        .build()
    ; 
}

If you're looking at that and thinking "But my language has named arguments, I don't need this". Fine, great, go use them. Just keep this nonsense out of the calling client that shouldn't even know what it's talking to.

It should be noted that this is more than a semantic problem. It's also a design problem. Pass in a boolean and you are forced to use branching to deal with it. Not very object oriented. Have two or more booleans to pass in? Wishing your language had multiple dispatch? Look into what collections can do when you nest them. The right data structure can make your life so much easier.

If you can modify the updateRow function, perhaps you can refactor it into two functions. Given that the function takes a boolean parameter, I suspect it looks like this:

function updateRow(var item, bool externalCall) {
  Database.update(item);

  if (externalCall) {
    Service.call();
  }
}

That can be a bit of a code smell. The function might have dramatically different behavior depending on what the externalCall variable is set to, in which case it has two different responsibilities. Refactoring it into two functions that only have one responsibility can improve readability:

function updateRowAndService(var item) {
  updateRow(item);
  Service.call();
}

function updateRow(var item) {
  Database.update(item);
}

Now, anywhere you call these functions, you can tell at a glance whether the external service is being called.

This isn't always the case, of course. It is situational and a matter of taste. Refactoring a function that takes a boolean parameter into two functions is usually worth considering, but isn't always the best choice.

If UpdateRow is in a controlled codebase, I would consider strategy pattern:

public delegate void Request(string query);    
public void UpdateRow(Item item, Request request);

Where Request represents some kind of DAO (a callback in trivial case).

True case:

UpdateRow(item, query =>  queryDatabase(query) ); // perform remote call

False case:

UpdateRow(item, query => readLocalCache(query) ); // skip remote call

Usually, endpoint implementation would be configured on a higher level of abstraction and I would just use it here. As a useful free side-effect, this gives me an option to implement another way to access remote data, without any change to the code:

UpdateRow(item, query => {
  var data = readLocalCache(query);
  if (data == null) {
    data = queryDatabase(query);
  }
  return data;
} );

In general, such inversion of control ensures lower coupling between your data storage and model.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
scroll top