They're the same. In the specific case of Matlab, don't worry about it: putting length() or any other function in the for initialization clause is always just as fast as evaluating it outside the loop, because for will only call it once either way. Your intuition is probably based on for loops in other languages, like C and Java, which behave differently.
By definition, the Matlab for loop evaluates its argument expressions only once, at the beginning of the loop, to pre-compute a range or array of values for the loop index variable (i) to take on inside the loop passes. Unlike many other languages, Matlab's for does not re-evaluate any of the loop control expressions each time through the loop. (This is also why assigning to the loop index variable inside the body of a Matlab for loop has no effect on later iterations, whereas in C or Java it lets you "jump around" and alter the control flow.)
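As a small check of the point above, here is a sketch that modifies both the index and the upper bound from inside the loop body. The names n and visited are illustrative only.

```matlab
% Sketch: changes made inside the body do not steer a Matlab for loop.
n = 3;
visited = [];
for i = 1:n
    visited(end + 1) = i;   % record the value the loop assigned to i
    i = i + 10;             % overwritten when the next pass starts
    n = 100;                % the range 1:n was already fixed at 1:3
end
disp(visited)
```

The loop still makes exactly three passes and visited ends up as 1 2 3, because the range was computed before the first pass.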
Have a read through the Matlab for documentation. It could stand to be more explicit about it, but you'll notice it's defined in terms of the values that the expressions resolve to, and not the expressions themselves.
Syntax equivalence
A C for loop is defined to have this behavior.
/* C-style for loop */
for ( A; B; C ) {
    ...
}
/* basically equivalent to: */
{
    A;
    while ( B ) {
        ...
        C;
    }
}
Functionally, Matlab's for loop syntax equivalence is more like this.
% Matlab for loop
for i = A:B
    ...
end
% basically functionally equivalent to:
tmp_X = A:B; % A and B only get evaluated outside the loop!
tmp_i = 1; % tmp_* = an implicit variable you have no access to
while tmp_i <= size(tmp_X,2) % <= so the last column is visited too
    i = tmp_X(:,tmp_i);
    ...
    tmp_i = tmp_i + 1;
end
And in practice, Matlab can optimize away the creation of the concrete array tmp_X in the case of primitive values. This decoupling of the loop body from the control expressions also helps support the parallel parfor loop used with the Parallel Computing Toolbox, because the value of the loop index variable for every loop iteration is known before the loop starts, and independent of the execution of any of the loop passes.
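The tmp_X(:,tmp_i) indexing in the equivalence above implies that a for loop walks the columns of whatever its expression evaluates to. A minimal sketch of that, with M and cols as illustrative names:

```matlab
% Sketch: iterating over a matrix hands the loop one column per pass.
M = [1 2 3; 4 5 6];
cols = {};
for c = M
    cols{end + 1} = c;      % c is a 2x1 column vector on each pass
end
fprintf('%d passes, first column = [%d; %d]\n', numel(cols), cols{1}(1), cols{1}(2));
```

For a row vector such as A:B, each column is a single scalar, which is the familiar case.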
Demonstration
You can confirm this behavior yourself by using a function that has an observable side effect in the loop control clause.
function show_for_behavior
for i = 1:three(NaN)
    disp(i);
end

function out = three(x)
disp('three() got called');
out = 3;
You can see there was only one invocation for the whole loop.
>> show_for_behavior
three() got called
1
2
3
Language reasons
Here's where I speculate a bit.
Beyond convenience, I suspect one of the reasons Matlab defines its for loop the way it does, instead of providing you the C style syntactic sugar over a regular while loop, is that it's tricky to get the index variable right due to floating-point roundoff. By default, the numeric loop variables you're working with are doubles, and for large values of x (around 10^16, from 2^53 upward), x + 1 == x, because the spacing between adjacent doubles at x (eps(x)) is greater than 1.
So if you do the naive while-loop transformation of for i = A:B ... end like this, you'll have an infinite loop, because at each step, i = i + 1 will result in the same value of i due to rounding.
i = A;
while (i <= B)
    ...
    i = i + 1;
end
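A quick way to see the rounding problem is to try it at 2^53, the first double where consecutive integers are no longer representable. This is only a sketch: the cap of 5 passes and the names spacing, stuck, and passes are illustrative additions so the loop terminates.

```matlab
% Sketch: at 2^53, adding 1 to a double no longer changes it.
x = 2^53;
spacing = eps(x);          % gap to the next larger double at x
stuck = (x + 1 == x);      % x + 1 rounds back to x

% Naive increment loop, capped at 5 passes so it terminates.
i = x;
passes = 0;
while i <= x + 4 && passes < 5
    i = i + 1;             % rounds back to the same value every time
    passes = passes + 1;
end
fprintf('eps(x) = %g, x + 1 == x is %d, i moved: %d\n', spacing, stuck, i ~= x);
```

Without the cap on passes, the loop would never reach x + 4, since i never changes.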
To be able to perform loops over sequences of large values, you can compute the value range and number of steps, keep track of the loop index using a separate integer value, and construct the i value for each step using that counter and the step size, instead of incrementing a temporary variable on each pass. Something like this.
% original
for x = A:S:B; ...; end
% equivalent
% floor() is needed because int64() rounds to nearest rather than truncating
nSteps = int64(floor((B - A) / S)) + int64(1);
i = int64(0);
while i < nSteps
    x = A + (S * double(i));
    ...
    i = i + int64(1);
end
You can only do this if the range min, max, and step are defined ahead of time for all passes, which isn't guaranteed with the more flexible while-loop form.
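Here is a runnable sketch of that counter-based loop with concrete numbers. The values of A, S, and B are hypothetical, picked so that A + 1 == A in double arithmetic, and the counter is named k to keep it apart from the loop variable. It illustrates the idea and makes no claim about what Matlab actually does internally.

```matlab
% Hypothetical values chosen for illustration (not from the original answer).
A = 2^53;        % large enough that A + 1 == A in double arithmetic
S = 1;
B = A + 10;      % exactly representable, since it is even

nSteps = int64(floor((B - A) / S)) + int64(1);   % pass count fixed up front
xs = zeros(1, double(nSteps));
k = int64(0);
while k < nSteps
    x = A + S * double(k);     % rebuilt from the counter, not accumulated
    xs(double(k) + 1) = x;
    k = k + int64(1);
end
fprintf('passes = %d, distinct x values = %d\n', numel(xs), numel(unique(xs)));
```

The loop terminates after the planned number of passes and ends exactly at B, even though an increment-by-one loop starting at A would stall.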
Note that in this case, for large A and B, x may have the exact same value for multiple iteration passes, but will eventually progress, and you'll get about as many loop iterations as you would expect if you were using infinite-precision values instead of approximate floating-point values. I suspect this is about what Matlab does internally in these cases.
Here's an example showing this behavior.
function big_for_loop(a)
if nargin < 1; a = 1e20; end
b = a + 4 * eps(a);
step = 15;
fprintf('b - a = %f\n', b - a);
fprintf('1 + (b - a) / step = %f\n', 1 + (b - a) / step);
last_i = a;
n = 0;
for i = a : step : b
    n = n + 1;
    if (i ~= last_i); disp('i advanced'); end
    last_i = i;
end
fprintf('niters = %d\n', n);
When I run this, i changes about as you would expect based on eps if this is how Matlab is doing the loop.
>> big_for_loop
b - a = 65536.000000
1 + (b - a) / step = 4370.066667
i advanced
i advanced
i advanced
i advanced
niters = 4370