Question

Reading All possible C# array initialization syntaxes, I wondered why C# always infers an array of int/Int32 where a smaller data type such as byte or short would suffice.

new[] { 30, 130, 230 }         // byte[] suffices but becomes int[]
new[] { -1, 125, -119 }        // sbyte[] suffices but becomes int[]
new[] { -31647, -1337, 23456 } // short[] suffices but becomes int[]

In the referenced question, Eric Lippert states that the 'best type' is used (see below), but how is int the best possible type? If we are going for overkill, why not use long then?

The type of the array element is inferred by computing the best type, if there is one, of all the given elements that have types. All the elements must be implicitly convertible to that type.

I would suspect that processing 8- or 16-bit data types could be faster than 32-bit ones, e.g. when using SIMD, where four byte values fit in the register space of one int/Int32. I know that SSE instructions are not (widely) used by the JIT compiler, but this 'int everywhere' approach means such optimizations will not help much if the JIT compiler ever adds them.
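To put the register-space argument in concrete terms, here is a minimal sketch, assuming the System.Numerics.Vector<T> API is available (with using System.Numerics;); the lane counts are examples for 256-bit SIMD registers:

Console.WriteLine(Vector<byte>.Count); // e.g. 32 byte lanes per SIMD register
Console.WriteLine(Vector<int>.Count);  // e.g. 8 int lanes in the same register width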

Could someone elaborate on this and explain why array type inference always resorts to int?

// Edit // I don't really care about the specification that prescribes that a literal without a suffix should be considered an int. To rephrase the question:

Why are data types used that are larger than needed? Why does the specification have this rule for literals? What are the advantages, given that the huge downside is giving up future (SIMD) optimizations?


Solution

Why are datatypes used that are larger than needed?

The number of line-of-business applications where you're doing a calculation in integers and can guarantee that the result will fit into a byte or short is vanishingly small. The number of line-of-business applications where the result of an integer calculation fits into an int is enormous.

Why does the specification have this rule for literals?

Because it is a perfectly sensible rule. It is consistent, clear and understandable. It makes a good compromise between many language goals such as reasonable performance, interoperability with existing unmanaged code, familiarity to users of other languages, and treating numbers as numbers rather than as bit patterns. The vast majority of C# programs use numbers as numbers.

What are the advantages, given that the huge downside is giving up future (SIMD) optimizations?

I assure you that not one C# programmer in a thousand would list "difficulty of taking advantage of SIMD optimizations" as a "huge downside" of C#'s array type inference semantics. You may in fact be the only one. It certainly would not have occurred to me. If you're the kind of person who cares so much about it then make the type manifest in the array initializer.

C# was not designed to wring every last ounce of performance out of machines that might be invented in the future, and particularly was not designed to do so when type inference is involved. It was designed to increase productivity of line-of-business developers, and line-of-business developers don't think of columnWidths = new [] { 10, 20, 30 }; as being an array of bytes.
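For example, a minimal sketch of the difference between inferred and manifest element types (variable names are illustrative):

var inferred = new[] { 10, 20, 30 };          // best common type of the literals: int[]
var columnWidths = new byte[] { 10, 20, 30 }; // manifest element type: the constants fit in byte, so byte[]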

OTHER TIPS

C# 5.0 spec 2.4.4.2

• If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.

• If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.

• If the literal is suffixed by L or l, it has the first of these types in which its value can be represented: long, ulong.

• If the literal is suffixed by UL, Ul, uL, ul, LU, Lu, lU, or lu, it is of type ulong.

All of your examples hit the first in that list... int.

All integral literals follow this rule, which is why var i = 10; is inferred as int too.
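A small sketch of how that rule plays out (variable names are just for illustration):

var a = 10;          // int   - no suffix, fits in int
var b = 3000000000;  // uint  - no suffix, too big for int, fits in uint
var c = 5000000000;  // long  - no suffix, too big for uint, fits in long
var d = 10U;         // uint  - U suffix
var e = 10L;         // long  - L suffix
var f = 10UL;        // ulong - UL suffix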

When you write an integer value without any suffix, such as 30, 130, or 230, you declare an Int32 value; so

new[] { 30, 130, 230 }; // <- array of ints

and if you want an array of bytes you have to state the type explicitly:

  new byte[] { 30, 130, 230 }; // <- treat each value as byte
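Note that this compiles only because each constant fits in byte's range (0–255); a constant outside that range is a compile-time error, for example:

var ok = new byte[] { 30, 130, 230 };     // each constant fits in byte, so it compiles
// var bad = new byte[] { 30, 130, 300 }; // error CS0031: Constant value '300' cannot be converted to a 'byte'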

The literals you use as examples all have type System.Int32; while the values could be stored without loss in narrower integral types (e.g. System.Int16), the syntax says System.Int32.

As all the specified members of each array are System.Int32, the array has type System.Int32[].

Of course it would be possible to define a language where integral literals (without other indication such as suffixes) have the type "the smallest integral type sufficient to hold the value", but that language is not C#.

In the latest – V5.0 – C# Language specification (from my VS2013 installation), in section 2.4.4.2:

Integer literals are used to write values of types int, uint, long, and ulong.

I.e., there is no way to write a byte, sbyte, short, or ushort literal without a cast.
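A quick sketch of what that means in practice (variable names are illustrative):

byte b = 42;           // allowed: the constant int literal 42 converts implicitly to byte
var stillInt = 42;     // inferred as int - there is no byte literal suffix
var asByte = (byte)42; // a cast is needed to get a byte-typed expression for inference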

I believe that operations will always be faster when run at the native bit size, so int on 32-bit machines; hence the convention.

This also implies that for 64-bit applications, Int64 would be a better choice than int for arrays.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow