Temporary Variables or Not: Choose CPU Time or RAM? [duplicate]

https://softwareengineering.stackexchange.com/questions/302813

08-12-2020
|

Question

In a case where multiple variables are set equal to an operation using the result of an other operation, is it best to create a temporary variable for the result of the second operation (this would save memory) or is it best to create a temporary variable for the result of the second operation (this would save time)?

For example:

// Option 1 (memory)
totalRevenue += numSold * price;
totalProfit += numSold * price - numSold * cost;
return numSold * price;

// Option 2 (time)
saleRevenue = numSold * price;
totalRevenue += saleRevenue;
totalProfit += saleRevenue - numSold * cost;
return saleRevenue;

I understand that this choice, especially in the example above, is rather trivial. My suspicion is that it makes most sense to create the temporary variable, but I can't help feeling that it is a waste. No, memory isn't very scarce these days, but I feel like Option 1 wastes less time than Option 2 does memory. Am I right?

Why choose one way over the other?

Solution

This optimization is called common subexpression elimination and is built into nearly all optimizers.

The actual draw back of too many eliminations is register pressure, a.k.a. running out of registers to hold all the temporary values. This could mean spilling the temporary to the stack which may incur a cache miss when reading it back in later and cause a massive slowdown in what should be a function that only does a bit of math. Even if simply recalculating would be faster.

Optimizers can convert in both directions as needed. So don't worry about it too much until profiling indicates that the function is a hotspot.

OTHER TIPS

You need to put on more thing in consideration: readability. Option 2 seems much more readable than option 1.

Apart of readability, in this example, I would go to Option 2 for a reason that I'll explain below.

Here is a representation of your code in Java and the bytecode produced:

public class Main {

    public static int totalRevenue;
    public static int totalProfit;

    public static int option1(int numSold, int price, int cost) {
        // Option 1 (memory)
        totalRevenue += numSold * price;
        totalProfit += numSold * price - numSold * cost;
        return numSold * price;
    }

    public static int option2(int numSold, int price, int cost) {
        // Option 2 (time)
        int saleRevenue = numSold * price;
        totalRevenue += saleRevenue;
        totalProfit += saleRevenue - numSold * cost;
        return saleRevenue;
    }
}

Bytecode:

public class Main {
  public static int totalRevenue;

  public static int totalProfit;

  public Main();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static int option1(int, int, int);
    Code:
       0: getstatic     #2                  // Field totalRevenue:I
       3: iload_0
       4: iload_1
       5: imul
       6: iadd
       7: putstatic     #2                  // Field totalRevenue:I
      10: getstatic     #3                  // Field totalProfit:I
      13: iload_0
      14: iload_1
      15: imul
      16: iload_0
      17: iload_2
      18: imul
      19: isub
      20: iadd
      21: putstatic     #3                  // Field totalProfit:I
      24: iload_0
      25: iload_1
      26: imul
      27: ireturn

  public static int option2(int, int, int);
    Code:
       0: iload_0
       1: iload_1
       2: imul
       3: istore_3
       4: getstatic     #2                  // Field totalRevenue:I
       7: iload_3
       8: iadd
       9: putstatic     #2                  // Field totalRevenue:I
      12: getstatic     #3                  // Field totalProfit:I
      15: iload_3
      16: iload_0
      17: iload_2
      18: imul
      19: isub
      20: iadd
      21: putstatic     #3                  // Field totalProfit:I
      24: iload_3
      25: ireturn
}

For the sake of understandability, iload_* load data from stack and istore_* store data into stack.

So you can notice that you really saved one memory allocation, but you spend precious CPU cycles.

But here is the reason: Memory allocations work in blocks, not at byte level. Each time a thread runs, it will allocate a certain amount of memory to use in stack operations. So unless you are lucky enough to have these 4 bytes overflow memory allocation for this block (and requiring you to setup a larger stack size), you will be ending up spending the same memory amount to run both examples. But in the second one, you will spend more cycles doing math calculations.

The only situation where that style matters is if one or more conditions are true:

The "operation" involves a non-trivial, non-purely-arithmetic function call (such as a database lookup, or a linear/binary search for an element inside an array, etc.), or if you have good reasons to believe that the operation takes more than a fraction of a second,
Or, the result of the operation takes up tens or hundreds of megabytes or more,
Or, if you are using a very inefficient, non-optimizing compiler and targeting a very slow computing platform.

(The list is non-exhaustive and may be amended anytime. Anyone with edit-answer privilege are welcome to ~~edit this list without asking. Thanks for your contributions.~~ check the duplicate question first.)

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange