Should array length be stored into a local variable in C#? / Хабр

I often see people write loops like this:

var length = array.Length;
for (int i = 0; i < length; i++) {
    //do smth
}

They assume that reading Array.Length on every iteration makes the CLR spend more time executing the code, so they store the length in a local variable to avoid it.
Let’s find out (once and for all!) whether this is worthwhile or whether the temporary variable is a waste of time.

To start, let’s examine these C# methods:

public int WithoutVariable() {
    int sum = 0;
    for (int i = 0; i < array.Length; i++) {
        sum += array[i];
    }
    return sum;
}
public int WithVariable() {
    int sum = 0;
    int length = array.Length;
    for (int i = 0; i < length; i++) {
        sum += array[i];
    }
    return sum;
}

Here is how they look after being processed by the JIT compiler (.NET Framework 4.7.2, LegacyJIT-x86):

WithoutVariable()
;int sum = 0;
    xor  edi, edi
;int i = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  edx, dword ptr [ecx+4]
;int arrayLength = localRefToArray.Length;
    mov  ecx, dword ptr [edx+4]
;if (arrayLength == 0) return sum;
    test ecx, ecx
    jle  exit
;int arrayLength2 = localRefToArray.Length;
    mov  eax, dword ptr [edx+4]
;if (i >= arrayLength2)
;  throw new IndexOutOfRangeException();
  loop:
    cmp  esi, eax
    jae  056e2d31
;sum += localRefToArray[i];
    add  edi, dword ptr [edx+esi*4+8]
;i++;
    inc  esi
;if (i < arrayLength) goto loop
    cmp  ecx, esi
    jg  loop
;return sum;
  exit:
    mov  eax, edi

WithVariable()
;int sum = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  edx, dword ptr [ecx+4]
;int arrayLength = localRefToArray.Length;
    mov  edi, dword ptr [edx+4]
;int i = 0;
    xor  eax, eax
;if (arrayLength == 0) return sum;
    test edi, edi
    jle  exit
;int arrayLength2 = localRefToArray.Length;
    mov  ecx, dword ptr [edx+4]
;if (i >= arrayLength2)
;  throw new IndexOutOfRangeException();
  loop:
    cmp  eax, ecx
    jae  05902d31
;sum += localRefToArray[i];
    add  esi, dword ptr [edx+eax*4+8]
;i++;
    inc  eax
;if (i < arrayLength) goto loop
    cmp  eax, edi
    jl   loop
;return sum;
  exit:
    mov  eax, esi

Comparison in Meld:

It’s easy to see that both methods have exactly the same number of assembler instructions (14 in the listings above). Even the logic of these instructions is almost the same. There’s a slight difference in the order in which variables are initialized and in the comparison that decides whether the loop should continue. Note that in both cases the array length is loaded twice before the loop:

Into arrayLength, used for the empty-array check and for the loop condition.
Into arrayLength2, used for the bounds check on each iteration.

It turns out that both methods compile to practically the same code. The first one is quicker to write, and the second one gains nothing in execution time.
The assembler code above gave me some ideas, so I decided to check a couple more methods:

public int WithoutVariable() {
    int sum = 0;
    for(int i = 0; i < array.Length; i++) {
        sum += array[i] + array.Length;
    }
    return sum;
}

public int WithVariable() {
    int sum = 0;
    int length = array.Length;
    for(int i = 0; i < length; i++) {
        sum += array[i] + length;
    }
    return sum;
}

Now the current element and the array length are added together. In the first case the array length is requested on every iteration; in the second it is saved once into a local variable. Let’s look at the assembler code of these methods:

WithoutVariable()
;int sum = 0;
    xor  edi, edi
;int i = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  edx, dword ptr [ecx+4]
;int arrayLength = localRefToArray.Length;
    mov  ebx, dword ptr [edx+4]
;if (arrayLength == 0) return sum;
    test ebx, ebx
    jle  exit
;int arrayLength2 = localRefToArray.Length;
    mov  ecx, dword ptr [edx+4]
;if (i >= arrayLength2)
;  throw new IndexOutOfRangeException();
  loop:
    cmp  esi, ecx
    jae  05562d39
;int t = array[i];
    mov  eax, dword ptr [edx+esi*4+8]
;t += sum;
    add  eax, edi
;t += arrayLength;
    add  eax, ebx
;sum = t;
    mov  edi, eax
;i++;
    inc  esi
;if (i < arrayLength) goto loop
    cmp  ebx, esi
    jg   loop
;return sum;
  exit:
    mov  eax, edi

WithVariable()
;int sum = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  edx, dword ptr [ecx+4]
;int arrayLength = localRefToArray.Length;
    mov  ebx, dword ptr [edx+4]
;int i = 0;
    xor  ecx, ecx
;if (arrayLength == 0) return sum;
    test ebx, ebx
    jle  exit
;int arrayLength2 = localRefToArray.Length;
    mov  edi, dword ptr [edx+4]
;if (i >= arrayLength2)
;  throw new IndexOutOfRangeException();
  loop:
    cmp  ecx, edi
    jae  04b12d39
;int t = array[i];
    mov  eax, dword ptr [edx+ecx*4+8]
;t += sum;
    add  eax, esi
;t += arrayLength;
    add  eax, ebx
;sum = t;
    mov  esi, eax
;i++;
    inc  ecx
;if (i < arrayLength) goto loop
    cmp  ecx, ebx
    jl   loop
;return sum;
  exit:
    mov  eax, esi

Comparison in Meld:

Once again, the number of instructions is the same, and the instructions themselves are (almost) identical. The only differences are the order in which variables are initialized and the comparison that checks whether the loop should continue. Note that the calculation of sum uses only the first copy of the array length (arrayLength). It’s obvious that this:

;int arrayLength2 = localRefToArray.Length;
    mov  edi, dword ptr [edx+4]
;if (i >= arrayLength2) throw new IndexOutOfRangeException();
    cmp  ecx, edi
    jae  04b12d39

is an inlined array bounds check, present in all four methods. The length is loaded once before the loop, and the comparison is executed for each element of the array.
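To make the check more tangible, here is a minimal C# sketch of what the inlined `cmp`/`jae` pair does. The class and method names (`BoundsCheckSketch`, `CheckedGet`) are made up for illustration and do not appear in the runtime. Because `jae` is an unsigned comparison, a single test rejects both negative indices and indices past the end, which the sketch mimics with casts to `uint`.

```csharp
using System;

// Hypothetical helper, written only to illustrate the inlined bounds check.
public static class BoundsCheckSketch
{
    public static int CheckedGet(int[] array, int i)
    {
        // Unsigned compare, like `jae`: a negative i wraps to a huge
        // unsigned value, so one comparison covers both ends of the range.
        if (unchecked((uint)i) >= (uint)array.Length)
            throw new IndexOutOfRangeException();

        return array[i];
    }
}
```

Calling `BoundsCheckSketch.CheckedGet(data, -1)` or passing an index equal to `data.Length` throws, while any index in range returns the element.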

We can already draw the first conclusion: using an extra variable to try to speed up the loop is a waste of time, since the compiler will do it for you anyway. The only reason to store the array length in a variable is to make the code more readable.

The foreach loop is another situation entirely. Consider the following three methods:

public int ForEachWithoutLength() {
    int sum = 0;
    foreach (int i in array) {
        sum += i;
    }
    return sum;
}

public int ForEachWithLengthWithoutLocalVariable() {
    int sum = 0;
    foreach (int i in array) {
        sum += i + array.Length;
    }
    return sum;
}

public int ForEachWithLengthWithLocalVariable() {
    int sum = 0;
    int length = array.Length;
    foreach (int i in array) {
        sum += i + length;
    }
    return sum;
}

And here’s the code after JIT:

ForEachWithoutLength()

;int sum = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  ecx, dword ptr [ecx+4]
;int i = 0;
    xor  edx, edx
;int arrayLength = localRefToArray.Length;
    mov  edi, dword ptr [ecx+4]
;if (arrayLength == 0) goto exit;
    test  edi, edi
    jle  exit
;int t = array[i];
  loop:
    mov  eax, dword ptr [ecx+edx*4+8]
;sum+=i;
    add  esi, eax
;i++;
    inc  edx
;if (i < arrayLength) goto loop
    cmp  edi, edx
    jg  loop
;return sum;
  exit:
    mov  eax, esi

ForEachWithLengthWithoutLocalVariable()

;int sum = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  ecx, dword ptr [ecx+4]
;int i = 0;
    xor  edx, edx
;int arrayLength = localRefToArray.Length;
    mov  edi, dword ptr [ecx+4]
;if (arrayLength == 0) goto exit
    test  edi, edi
    jle  exit
;int t = array[i];
  loop:
    mov  eax, dword ptr [ecx+edx*4+8]
;sum+=i;
    add  esi, eax
;sum+=localRefToArray.Length;
    add  esi, dword ptr [ecx+4]
;i++;
    inc  edx
;if (i < arrayLength) goto loop
    cmp  edi, edx
    jg  loop
;return sum;
  exit:
    mov  eax, esi

ForEachWithLengthWithLocalVariable()

;int sum = 0;
    xor  esi, esi
;int[] localRefToArray = this.array;
    mov  edx, dword ptr [ecx+4]
;int length = localRefToArray.Length;
    mov  ebx, dword ptr [edx+4]
;int i = 0;
    xor  ecx, ecx
;int arrayLength = localRefToArray.Length;
    mov  edi, dword ptr [edx+4]
;if (arrayLength == 0) goto exit;
    test  edi, edi
    jle  exit
;int t = array[i];
  loop:
    mov  eax, dword ptr [edx+ecx*4+8]
;sum+=i;
    add  esi, eax
;sum+=length ;
    add  esi, ebx
;i++;
    inc  ecx
;if (i < arrayLength) goto loop
    cmp  edi, ecx
    jg  loop
;return sum;
  exit:
    mov  eax, esi

The first thing that stands out is that foreach takes fewer assembler instructions than the for loop (for example, simple element summation takes 12 instructions with foreach but 14 with for in the listings above).

Comparison

Overall, here are the results of a for vs foreach benchmark on arrays of 1 million elements:

sum+=array[i];

Method	ItemsCount	Mean	Error	StdDev	Median	Ratio	RatioSD
ForEach	1000000	1.401 ms	0.2691 ms	0.7935 ms	1.694 ms	1.00	0.00
For	1000000	1.586 ms	0.3204 ms	0.9447 ms	1.740 ms	1.23	0.65

And for

sum+=array[i] + array.Length;

Method	ItemsCount	Mean	Error	StdDev	Median	Ratio	RatioSD
ForEach	1000000	1.703 ms	0.3010 ms	0.8874 ms	1.726 ms	1.00	0.00
For	1000000	1.715 ms	0.2859 ms	0.8430 ms	1.956 ms	1.13	0.56
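For readers who want a rough comparison of their own without a benchmarking library, a minimal Stopwatch-based sketch might look like the following. It is not the harness used for the tables above. The names (`LoopTimingSketch`, `SumFor`, `SumForEach`, `BestMilliseconds`) are hypothetical, and the methods take the array as a parameter rather than reading a field, which is a simplification compared with the methods in this article.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper; a rough sketch, not a rigorous benchmark.
public static class LoopTimingSketch
{
    public static int SumFor(int[] array)
    {
        int sum = 0;
        for (int i = 0; i < array.Length; i++)
        {
            sum += array[i];
        }
        return sum;
    }

    public static int SumForEach(int[] array)
    {
        int sum = 0;
        foreach (int item in array)
        {
            sum += item;
        }
        return sum;
    }

    // Runs the method once to warm it up (so JIT compilation is not timed),
    // then returns the best wall-clock time in milliseconds over `rounds` runs.
    public static double BestMilliseconds(Func<int[], int> method, int[] data, int rounds)
    {
        method(data);

        double best = double.MaxValue;
        for (int r = 0; r < rounds; r++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            method(data);
            watch.Stop();
            best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
        }
        return best;
    }
}
```

For example, `LoopTimingSketch.BestMilliseconds(LoopTimingSketch.SumFor, new int[1000000], 20)` times the for variant on a 1 million-element array. Taking the best of several runs reduces noise but does not replace proper statistics such as the Error and StdDev columns above.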

In these measurements foreach walks through the array faster than for on average, although the spread between runs is large. Why? To find out, we need to compare the code after JIT:

Comparison of all three foreach options

Let’s look at ForEachWithoutLength. The array length is requested only once and there are no array bounds checks. That happens because the foreach loop, first, does not allow the collection to be changed inside the loop and, second, never goes outside the collection. Because of that, the JIT can afford to remove the array bounds checks.
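One way to picture this: for arrays, the C# compiler expands foreach into an index loop over a temporary copy of the array reference. The sketch below shows a foreach method next to a hand-written approximation of that expansion. The class name `ForEachLoweringSketch` and the local `tmp` are made up for illustration, and the exact code the compiler emits may differ in detail.

```csharp
// Hypothetical class; `array` mirrors the field used by the methods in this article.
public class ForEachLoweringSketch
{
    private int[] array;

    public ForEachLoweringSketch(int[] array)
    {
        this.array = array;
    }

    public int ForEachSum()
    {
        int sum = 0;
        foreach (int item in array)
        {
            sum += item;
        }
        return sum;
    }

    // Hand-written approximation of what foreach over an array expands to.
    public int LoweredSum()
    {
        int sum = 0;
        int[] tmp = array;                      // the field is read once into a local
        for (int i = 0; i < tmp.Length; i++)    // i stays within [0, tmp.Length)
        {
            int item = tmp[i];                  // index is provably in range
            sum += item;
        }
        return sum;
    }
}
```

Since `tmp` is a local that nothing else can reassign and `i` is always compared against `tmp.Length`, the index can be proven to be in range, which is a plausible reason the bounds check disappears in the listing above.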

Now let’s look carefully at ForEachWithLengthWithoutLocalVariable. There’s only one strange part: the length added to sum does not come from the previously saved local variable arrayLength but is read from memory again on each iteration. That means there will be N+1 memory reads of the array length, where N is the length of the array.

And now we come to ForEachWithLengthWithLocalVariable. The code is exactly the same as in the previous example, except for the handling of the array length. The compiler once again generated a local variable arrayLength that’s used to check whether the array is empty and to control the loop, but it still honestly kept our own local variable length, and that’s the one used in the summation inside the loop. As a result, this method reads the array length from memory only twice. The difference is very hard to notice in the real world.

In all cases, the assembler code turned out so simple because the methods themselves are simple. If the methods had more parameters, the code would have to work with the stack, variables might get stored outside of registers, and there would be more checks, but the main logic would remain the same: introducing a local variable for the array length is only useful for making the code more readable. It also turned out that foreach often walks through the array faster than for.