Is there a performance rationale for mixing a binary file's endianness?
27-10-2019
Question
I'm writing a parser for the "shapefile", the most common geographic data storage format, which is actually a collection of files. This is my first project where I've had to think about endianness.
It turns out that the geometry storage is mixed endian; some parts of the file are big endian, but most of it is little endian. The shapefile standard is described here.
Is there a discernible performance rationale for this, or is it simply a product of historical context? If it is historical, do you happen to know what that context is?
The relevant passage from the specification reads:
"The integers and double-precision integers that make up the data description fields in the file header (identified below) and record contents in the main file are in little endian (PC or Intel®) byte order. The integers and double-precision floating point numbers that make up the rest of the file and file management are in big endian (Sun® or Motorola®) byte order."
Solution
There doesn't seem to be a definitive answer. The explanations I've seen are a mixture of "confusion while trying to create a format that works on all platforms" and "a lot of poorly designed formats date from that era". More info here: https://gis.stackexchange.com/questions/18969/oddities-in-the-shapefile-technical-specification
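Whatever the reason, a parser can handle the mix by declaring the byte order for each group of fields. Below is a minimal sketch in Python for the 100-byte main file header, assuming the layout as I read it in the ESRI technical description: seven big-endian 32-bit integers (file code 9994, five unused, file length in 16-bit words), followed by two little-endian 32-bit integers (version, shape type) and eight little-endian doubles (bounding box). The function name and dictionary keys are my own labels, not anything defined by the spec.

```python
import struct

# Assumed header layout (100 bytes in total):
#   bytes 0-27  : seven big-endian int32 (file code, 5 unused, file length)
#   bytes 28-99 : two little-endian int32 (version, shape type)
#                 + eight little-endian doubles (bounding box)
BIG_PART = struct.Struct(">7i")       # 28 bytes
LITTLE_PART = struct.Struct("<2i8d")  # 72 bytes


def parse_shp_header(header: bytes) -> dict:
    """Decode the main file header (hypothetical helper, not a library API)."""
    if len(header) < 100:
        raise ValueError("expected at least 100 header bytes")

    # Big-endian part: file code, five unused ints, length in 16-bit words.
    file_code, *_unused, length_words = BIG_PART.unpack_from(header, 0)
    if file_code != 9994:
        raise ValueError(f"unexpected file code: {file_code}")

    # Little-endian part: version, shape type, then the bounding box.
    version, shape_type, *bbox = LITTLE_PART.unpack_from(header, 28)

    return {
        "file_code": file_code,
        "file_length_bytes": length_words * 2,  # words are 16 bits each
        "version": version,
        "shape_type": shape_type,
        # Order: Xmin, Ymin, Xmax, Ymax, Zmin, Zmax, Mmin, Mmax
        "bbox": tuple(bbox),
    }


if __name__ == "__main__":
    # Synthetic header for illustration only, not read from a real file.
    sample = BIG_PART.pack(9994, 0, 0, 0, 0, 0, 50) + LITTLE_PART.pack(
        1000, 1, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0
    )
    print(parse_shp_header(sample))
```

With `struct`, the byte order is just a prefix character in the format string (`>` or `<`), so the mixed layout adds a little bookkeeping rather than any real complexity.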