They are all octet streams, i.e. binary, but the processing is different.
For the explanation below, see both the Reference Processing Model section and the Transforms element section of the XML-DSIG specification.
1: http://fakefile.xml is not a same-document reference, and the specification states:
Unless the URI-Reference is such a 'same-document' reference, the result of dereferencing the URI-Reference MUST be an octet stream
Since there are no transforms, this octet stream is the input to the digest calculation.
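As a minimal sketch of case 1, the dereferenced bytes go straight into the digest with no parsing in between. The function name and the choice of SHA-256 are assumptions made for illustration; the actual digest algorithm is whatever the signature specifies.

```python
import hashlib

def digest_without_transforms(octets: bytes) -> str:
    # No parsing and no canonicalization: the dereferenced bytes are hashed as-is.
    # SHA-256 is an assumed digest algorithm, used only for this example.
    return hashlib.sha256(octets).hexdigest()

# Two serializations of the same XML differ byte-wise, so their digests differ.
a = digest_without_transforms(b"<doc a='1' b='2'/>")
b = digest_without_transforms(b"<doc b='2' a='1'/>")
print(a == b)  # False
```

This is why case 1 is sensitive to any byte-level change in the referenced file, even one that leaves the XML content equivalent.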
2: As stated in 1, http://fakefile.xml is not a same-document reference, so the input to the transforms is an octet stream.
Since the canonicalization transform operates on XML nodes, its input has to be converted to an XML node-set, as stated in the Reference Processing Model section:
If the data object is an octet stream and the next transform requires a node-set, the signature application MUST attempt to parse the octets yielding the required node-set via [XML] well-formed processing.
The output of the canonicalization transform is an octet stream by definition.
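Case 2 can be sketched roughly as follows: the octets are parsed, canonicalized, and the canonical octets are digested. Note the assumptions: Python's `xml.etree.ElementTree.canonicalize` implements C14N 2.0, which is used here only as a stand-in for whichever canonicalization algorithm the transform actually names, and SHA-256 and the function name are illustrative choices.

```python
import hashlib
import io
from xml.etree.ElementTree import canonicalize

def digest_after_c14n(octets: bytes) -> str:
    # canonicalize() parses the octets (octet stream -> nodes) and returns
    # the canonical form as text. This is C14N 2.0, used as a stand-in.
    canonical_text = canonicalize(from_file=io.BytesIO(octets))
    # The output of canonicalization is an octet stream (UTF-8 encoded).
    canonical_octets = canonical_text.encode("utf-8")
    # SHA-256 is an assumed digest algorithm for this example.
    return hashlib.sha256(canonical_octets).hexdigest()

# Differences in attribute order and quoting disappear after canonicalization.
x = digest_after_c14n(b"<doc a='1' b='2'/>")
y = digest_after_c14n(b"<doc b='2' a='1'/>")
print(x == y)
```

In contrast to case 1, equivalent serializations of the same document should yield the same digest here.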
3: As stated in 1, http://fakefile.xml is not a same-document reference, so the input to the transforms is an octet stream.
The XPath transform operates on XML nodes, which means the octet stream has to be converted to a node-set (this is stated again in the XPath Filtering section). The output of the XPath transform is also a node-set.
The next transform is canonicalization, which requires an XML node-set as input. Since the inputs and outputs are chained (Transforms element section) and the previous output is already a node-set, no conversion is needed.
Finally, the output of the canonicalization transform is an octet stream by definition.
In your examples the output of the transforms is always an octet stream. If you have a single XPath transform, however, the result of the transforms is an XML node-set, which then has to be canonicalized, as required by the ArchiveTimeStamp property definition. In that case you use the canonicalization algorithm specified on the ArchiveTimeStamp property itself or, if none is specified, the XML-DSIG default.
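The type bookkeeping across all of these cases can be modelled with a small sketch. It performs no real XML processing; it only tracks whether the data is an octet stream or a node-set at each step for a reference that is not a same-document reference. The transform names, the table, and the `trace` function are hypothetical and exist only for this illustration.

```python
OCTETS, NODESET = "octet stream", "node-set"

# Hypothetical table: transform name -> (required input type, produced output type)
TRANSFORM_TYPES = {
    "c14n": (NODESET, OCTETS),
    "xpath": (NODESET, NODESET),
}

def trace(transforms):
    """List the processing steps for a reference that is not same-document."""
    current = OCTETS
    steps = [f"dereference -> {current}"]
    for name in transforms:
        required, produced = TRANSFORM_TYPES[name]
        if current == OCTETS and required == NODESET:
            # Octets must be parsed before a node-based transform can run.
            steps.append("parse octets -> node-set")
        steps.append(f"{name} -> {produced}")
        current = produced
    if current == NODESET:
        # A node-set result must be canonicalized before it can be digested.
        steps.append("implicit canonicalization -> octet stream")
    return steps

for case in ([], ["c14n"], ["xpath", "c14n"], ["xpath"]):
    print(case, trace(case))
```

Only the last case, a single XPath transform, ends with the implicit canonicalization step; the other three already hand an octet stream to the digest.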
Hope this helps.