Question

This might be a very simple question.

I know that byte is a primitive type in Java that belongs to the integer family and occupies 1 byte of memory. When we deal with binary data (for example, when we read/write a file), we store the data in a byte array and operate on it. My doubt is: when we have other primitives like short and int, why do we prefer byte[]? Could anyone clarify?

Thanks in advance.


Solution

Working with a byte array is practically (ignoring computers that cannot work with 8-bit chunks of data natively; I don't even know if such exist in actual use these days) guaranteed to always represent the data bytes in the same order, regardless of platform, programming language or framework. Given knowledge of the storage or transmission format, you can translate it to whatever internal format your current platform etc. uses.

For example, I wouldn't trust that an application written in C++ running on an Alpha CPU will write out an unsigned long in the same way that a .NET application running on Intel writes out a UInt32 (let alone how perhaps Java running on an IBM z10 might handle the lower 32 bits of a 64-bit long or PIC assembly might handle tossing a 32-bit value at an I/O port). If you work with pure bytes, this becomes a non-issue: you will have to translate the byte sequence wherever you read or write it, but you will know exactly how to do that. It is well defined.

If you send data over a socket, persist it to a file, or otherwise transmit it in space or time, by using a byte array you guarantee that the recipient will see exactly what was sent or persisted. It is then up to the recipient (note that the "recipient" may be your own application's file "load" code, whereas the "sender" may be the code to "save" to a file) to do something useful with the byte sequence that the sender generated from whatever happens to be its native format.
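As a minimal sketch of that save/load round trip (the class name and the temp-file path are my own choices, not from the answer): writing a byte[] to a file and reading it back hands the recipient exactly the bytes the sender produced, with no interpretation in between.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ByteRoundTrip {
    public static void main(String[] args) throws IOException {
        byte[] sent = {0x01, (byte) 0xFF, 0x42};
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, sent);                    // "save": the sender emits raw bytes
        byte[] received = Files.readAllBytes(tmp); // "load": the recipient sees the same bytes
        System.out.println(Arrays.equals(sent, received)); // true
        Files.delete(tmp);
    }
}
```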

If you are using non-byte types, you need to guarantee the byte order by other means, because depending on platform etc. the bytes may be interpreted in a different order. For example, you would need to specify (either yourself or by reference to the framework's specification) whether the persisted form of a multi-byte integer uses big endian or little endian.
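To make the endianness point concrete, here is a small sketch (the `EndianDemo` class and its `encode` helper are illustrative, not from the answer) that serializes the same int under both byte orders and gets two different byte sequences:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    // Serialize an int into 4 bytes using the given byte order.
    public static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        byte[] big    = encode(0x12345678, ByteOrder.BIG_ENDIAN);
        byte[] little = encode(0x12345678, ByteOrder.LITTLE_ENDIAN);
        // Same value, opposite byte orders on the wire:
        // big:    12 34 56 78
        // little: 78 56 34 12
        System.out.printf("big:    %02x %02x %02x %02x%n", big[0], big[1], big[2], big[3]);
        System.out.printf("little: %02x %02x %02x %02x%n", little[0], little[1], little[2], little[3]);
    }
}
```

A persistence format has to pick one of these orders and document it; once it does, any platform can reconstruct the int from the bytes.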

OTHER TIPS

Streams like files and sockets are modelled as bytes, i.e. byte[]. There are some file formats which are actually sequences of 16-bit or 32-bit values, but natively these are still just bytes.

If you had a 100 MB file and read it into an array of int, you would need 400 MB of memory (if you read one byte into each element; you could pack 4 bytes into one int, but it would be very awkward to work with individual bytes that way). So outright memory efficiency is one reason, on top of the fact that the byte is the smallest addressable unit of memory on almost all computer systems today.
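To show why packing 4 bytes into one int is awkward, here is an illustrative sketch (the `PackDemo` class and its helpers are mine, not from the answer): every access to an individual byte requires shifting and masking, whereas a byte[] gives you each byte by index.

```java
public class PackDemo {
    // Pack four bytes into one int, big-endian: b0 becomes the most significant byte.
    public static int pack(byte b0, byte b1, byte b2, byte b3) {
        return ((b0 & 0xFF) << 24) | ((b1 & 0xFF) << 16) | ((b2 & 0xFF) << 8) | (b3 & 0xFF);
    }

    // Getting one byte back out needs a shift and a cast every single time.
    public static byte unpack(int packed, int index) {
        return (byte) (packed >>> (24 - 8 * index));
    }

    public static void main(String[] args) {
        int packed = pack((byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE);
        System.out.printf("%08x%n", packed);                  // cafebabe
        System.out.printf("%02x%n", unpack(packed, 2) & 0xFF); // ba
    }
}
```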

A byte is the unit of measure for the size of a binary transfer. If you do not use bytes, then, for example, you cannot reliably send a 1-byte message, read a 3-byte file, etc.

Another factor is encodings like UTF-8, whose data sequences are not aligned on a fixed-size boundary: a single character may occupy anywhere from one to four bytes.


Many types of applications use information representable in eight or fewer bits and processor designers optimize for this common usage. The popularity of major commercial computing architectures has aided in the ubiquitous acceptance of the 8-bit size.

Quoted from http://en.wikipedia.org/wiki/Byte

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow