Question

I'm writing a parser for the most common geographic data storage type, a collection of files called a "shapefile". This is my first project where I've had to think about endianness.

It turns out that the geometry storage is mixed endian; some parts of the file are big endian, but most of it is little endian. The shapefile standard is described here.

Is there a discernible performance rationale, or was it simply born out of historical context? If so, do you happen to know what that historical context is?

The integers and double-precision integers that make up the data description fields in the file header (identified below) and record contents in the main file are in little endian (PC or Intel®) byte order. The integers and double-precision floating point numbers that make up the rest of the file and file management are in big endian (Sun® or Motorola®) byte order.

Was it helpful?

Solution

While there doesn't seem to be a clear answer for it, what I've seen is a mixture of "confusion while trying to create a format that works on all platforms" and "a lot of poorly designed formats were designed back then". More info here: https://gis.stackexchange.com/questions/18969/oddities-in-the-shapefile-technical-specification

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top