Decide between struct and class to represent a hexahedron in a scenario where there will be millions of them in memory [closed]

StackOverflow https://stackoverflow.com/questions/22148107

Question

I have a fairly non-trivial design problem on .NET 4.5. I have a grid that is supposed to hold millions of hexahedrons. Each hexahedron has 8 points and 6 quadrilateral faces. Each quadrilateral face may be planar or curved. If it is planar, it is represented by a (class|struct) called Plane, which has 4 doubles for the plane equation plus the 4 vertices of the quadrilateral. If the face is curved, it is represented by a single point and a 3x3 matrix.

The main concerns here are performance and garbage collection, assuming a memory limit of 2 GB for any array of blocks. The question is: we have Block, Point, Face, Plane, Curve, Matrix3x3. Which of them should be classes and which should be structs?

No correct solution

OTHER TIPS

(Ignoring P/Invoke aspects, which is a different matter)

As a very general rule of thumb, you should only make types with small amounts of data (say, 32 bytes or less) into structs.

Note that structs should ideally be immutable.

In terms of speed: it depends on what you're doing, so you would have to run some timings to really tell. However, passing an item to a method is likely to be quicker for a reference type than for a struct type when the struct is larger than a reference (32 bits in 32-bit code and 64 bits in 64-bit code).

One very important thing to bear in mind when creating arrays or List<T>: for value types, the size of the value in bytes times the number of elements is the total contiguous size of the underlying array.

For reference types, the array itself holds only references, so its size is the size of a reference (32 bits or 64 bits) times the number of elements. The objects themselves are allocated separately on the heap.

Since the maximum size of an array is 2^31 bytes, this can be important if the size of the value type exceeds the size of a reference.
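As a rough illustration of that limit, here is a minimal sketch that computes how many elements fit in one array for a given element size. The class and method names are hypothetical, the 2^31-byte limit is the figure assumed above, and the small overhead of the array object itself is ignored.

```csharp
using System;

static class ArrayLimits
{
    // Assumed limit from the text above: 2^31 bytes per array.
    public const long MaxArrayBytes = 1L << 31;

    // Largest element count that fits under the limit.
    // For an array of a reference type, pass the reference size (4 or 8 bytes).
    public static long MaxElements(int bytesPerElement)
    {
        if (bytesPerElement <= 0)
            throw new ArgumentOutOfRangeException("bytesPerElement");
        return MaxArrayBytes / bytesPerElement;
    }
}
```

Under these assumptions a 32-byte struct allows about 67 million elements per array, while an array of 8-byte references allows about 268 million.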

Updated

Let's assume we have a Plane type with 4 doubles on a 64-bit system, and that we have 1 million Planes.

  • If SPlane is a struct, then it would occupy 4*8 = 32 bytes
  • If CPlane is a class, then it will occupy 4*8 bytes (fields) + 16 bytes (header) = 48 bytes. And don't forget about the 8 bytes for each reference pointing to each instance.

Let's consider the pros and cons of using a struct, instead of a class.

Pros

  • GC performance - there are fewer references to keep track of. The GC will treat an array of SPlane as one object. An array of CPlane would have to be treated as 1 million + 1 objects.
  • Memory space - an array of SPlane will occupy 32 million bytes in memory. An array of CPlane will occupy 8 million bytes (array of contiguous references) + 48 million bytes (individual instances). That's 32 million bytes vs 56 million bytes.
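The arithmetic above can be reproduced with a small sketch. The helper names are hypothetical, and the 16-byte header and 8-byte reference are the 64-bit figures used in this answer rather than measured values.

```csharp
static class PlaneFootprint
{
    const int FieldBytes = 4 * sizeof(double); // 4 doubles = 32 bytes
    const int HeaderBytes = 16;                // assumed per-object header on 64-bit
    const int ReferenceBytes = 8;              // assumed reference size on 64-bit

    // An array of structs stores the fields inline.
    public static long StructArrayBytes(long count)
    {
        return count * FieldBytes;
    }

    // An array of classes stores one reference per element,
    // plus a separate heap object (header + fields) per instance.
    public static long ClassArrayBytes(long count)
    {
        return count * (ReferenceBytes + HeaderBytes + FieldBytes);
    }
}
```

For 1,000,000 planes this gives 32,000,000 bytes for the struct array and 56,000,000 bytes for the class array, matching the figures in the list.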

Cons

  • Performance degradation due to copying
    • "resizing"/"expanding" an array of struct planes would copy 32 million bytes, whereas if we were using classes, it would copy the references only (8 million bytes)
    • likewise, things like passing an SPlane as an argument to a method, returning an SPlane, or assigning one variable to another will copy 32 bytes, whereas passing a CPlane will copy just the reference.
  • Usual caveats of using value types.
    • no inheritance (doesn't seem to matter to you anyway)
    • accidental boxing (implicit conversion to object or to an interface, calling GetType or a non-overridden ToString). This one can be mitigated by being careful.
    • no canonical form (no explicit parameterless constructor - you can't restrain the default value of a value type field). E.g., unassigned fields and arrays would, by default, be filled with persons of 0 height - considering struct Person {int height;}.
    • no circular dependencies. A struct Person cannot contain fields of type Person, as that would lead to an infinite memory layout.
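Two of these caveats, the all-zero default value and boxing, can be seen in a short sketch. The Person struct follows the example in the list; the helper class and its method names are made up for illustration.

```csharp
struct Person
{
    public int Height;
}

static class ValueTypeCaveats
{
    // Elements of a new struct array are default(Person): every field is zero.
    public static int DefaultHeight()
    {
        Person[] people = new Person[3];
        return people[0].Height;
    }

    // Each conversion to object allocates a separate boxed copy on the heap.
    public static bool BoxesAreDistinct()
    {
        Person p = new Person();
        p.Height = 180;
        object first = p;
        object second = p;
        return !object.ReferenceEquals(first, second);
    }

    // The box is a snapshot: later changes to the original do not reach it.
    public static int BoxedHeightAfterChange()
    {
        Person p = new Person();
        p.Height = 180;
        object boxed = p;
        p.Height = 190;
        return ((Person)boxed).Height;
    }
}
```

Here DefaultHeight() returns 0, BoxesAreDistinct() returns true, and BoxedHeightAfterChange() returns 180.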

Since we don't know your exact use cases, you'll have to make the decision.

Like Matthew Watson suggested, you should measure the performance of both approaches and compare.
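A very rough starting point for such a measurement might look like the sketch below. It assumes SPlane and CPlane types with four double fields as described above; PlaneTiming and its methods are hypothetical names. A single Stopwatch pass like this is only indicative, so a real comparison should use your actual access patterns, a release build, and repeated runs.

```csharp
using System;
using System.Diagnostics;

public struct SPlane { public double A, B, C, D; }
public class CPlane { public double A, B, C, D; }

static class PlaneTiming
{
    public static double SumStructs(SPlane[] planes)
    {
        double sum = 0;
        for (int i = 0; i < planes.Length; i++) sum += planes[i].A;
        return sum;
    }

    public static double SumClasses(CPlane[] planes)
    {
        double sum = 0;
        for (int i = 0; i < planes.Length; i++) sum += planes[i].A;
        return sum;
    }

    // Fills both arrays with the same data, then times one pass over each.
    public static void Compare(int count)
    {
        var structs = new SPlane[count];
        var classes = new CPlane[count];
        for (int i = 0; i < count; i++)
        {
            structs[i].A = i;
            classes[i] = new CPlane();
            classes[i].A = i;
        }

        Stopwatch watch = Stopwatch.StartNew();
        double structSum = SumStructs(structs);
        watch.Stop();
        long structTicks = watch.ElapsedTicks;

        watch.Restart();
        double classSum = SumClasses(classes);
        watch.Stop();

        Console.WriteLine("struct: {0} ticks (sum {1})", structTicks, structSum);
        Console.WriteLine("class:  {0} ticks (sum {1})", watch.ElapsedTicks, classSum);
    }
}
```

Calling PlaneTiming.Compare(1000000) prints one timing per representation; the sums are printed so the loops cannot be optimized away.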

Suppose one wants to hold one million 3d points, and allow the following operations:

  • double GetX(int index) // And likewise GetY and GetZ
  • void SetX(int index, double value) // And likewise SetY and SetZ
  • void SetXYZ(int index, double x, double y, double z)
  • void CopyCoord(int src, int dest)

If one uses a mutable structure type:

struct Point3dStruct { public double X,Y,Z; }
Point3dStruct[] array;

the operations would become:

void init()
{
  array = new Point3dStruct[1000000];
}

double GetX(int index)
{ return array[index].X; }

void SetX(int index, double value)
{ array[index].X = value; }

void SetXYZ(int index, double x, double y, double z)
{ array[index].X = x; array[index].Y = y; array[index].Z = z; }

void CopyCoord(int src, int dest)
{ array[dest] = array[src]; }

All operations would be reasonably efficient; 24,000,000 bytes would be required to hold 1,000,000 points, regardless of whether some or all of them were the same or different.

Using a so-called "immutable" struct would require changing the SetX and SetXYZ methods:

// Assumes Point3dStruct now has read-only members and an (x, y, z) constructor.
void SetX(int index, double value)
{
  Point3dStruct temp = array[index];
  array[index] = new Point3dStruct(value, temp.Y, temp.Z);
}

void SetXYZ(int index, double x, double y, double z)
{
  array[index] = new Point3dStruct(x, y, z);
}

Performance for SetX would be much inferior to that of a simple exposed-field struct; no method would perform better than the exposed-field-struct equivalent. Memory requirements would not be affected by whether the struct was mutable or not.
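The snippets above use a three-argument constructor that the exposed-field struct does not declare. A minimal sketch of what an immutable variant could look like follows; ImmutablePoint3d and ImmutablePointStore are hypothetical names, not types from the original answer.

```csharp
public struct ImmutablePoint3d
{
    private readonly double x, y, z;

    public ImmutablePoint3d(double x, double y, double z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public double X { get { return x; } }
    public double Y { get { return y; } }
    public double Z { get { return z; } }
}

static class ImmutablePointStore
{
    static readonly ImmutablePoint3d[] array = new ImmutablePoint3d[1000000];

    public static double GetX(int index) { return array[index].X; }
    public static double GetY(int index) { return array[index].Y; }

    // Changing one coordinate means reading the element and writing a whole new value.
    public static void SetX(int index, double value)
    {
        ImmutablePoint3d temp = array[index];
        array[index] = new ImmutablePoint3d(value, temp.Y, temp.Z);
    }
}
```

The array still stores 24 bytes per element inline; only the way individual coordinates are updated changes.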

A mutable class would require code much like a mutable struct except for the init and CopyCoord methods.

void init()
{
  array = new Point3dClass[1000000];
  for (int i=0; i<1000000; i++)
    array[i] = new Point3dClass();
}

void CopyCoord(int src, int dest)
{
   array[dest].X = array[src].X;
   array[dest].Y = array[src].Y;
   array[dest].Z = array[src].Z;
}

Note that accidentally writing array[dest] = array[src] would not copy the values; it would make both elements refer to the same object, so a later change made through one would silently show up in the other. Each element would also cost an extra 12 or 24 bytes on 32-bit or 64-bit machines (a 4- or 8-byte reference plus an 8- or 16-byte object header, i.e. 12,000,000 or 24,000,000 bytes in total), regardless of whether all points held the same or different values.
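The difference between copying the reference and copying the fields can be shown with a small sketch. Point3dClass mirrors the mutable class discussed here; AliasingDemo and its method names are hypothetical.

```csharp
public class Point3dClass
{
    public double X, Y, Z;
}

static class AliasingDemo
{
    // Copies the reference: both slots now point at the same object.
    public static double XAfterReferenceCopy()
    {
        var array = new Point3dClass[2];
        array[0] = new Point3dClass();
        array[1] = new Point3dClass();

        array[1] = array[0];   // the accidental version of CopyCoord
        array[0].X = 5.0;      // meant to change element 0 only
        return array[1].X;     // element 1 changed as well
    }

    // Copies the fields: the two objects stay independent.
    public static double XAfterFieldCopy()
    {
        var array = new Point3dClass[2];
        array[0] = new Point3dClass();
        array[1] = new Point3dClass();

        array[1].X = array[0].X;
        array[1].Y = array[0].Y;
        array[1].Z = array[0].Z;
        array[0].X = 5.0;
        return array[1].X;     // element 1 keeps its own value
    }
}
```

XAfterReferenceCopy() returns 5, while XAfterFieldCopy() returns 0.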

Use of an immutable class would require code similar to an immutable struct, except for the init method:

void init()
{
  array = new Point3dClass[1000000];
  var zero = new Point3dClass(0.0, 0.0, 0.0);
  for (int i=0; i<1000000; i++)
    array[i] = zero;
}

Initial memory usage would only be about 4,000,000 or 8,000,000 bytes (on 32- or 64-bit machines, respectively), but every separately-created instance of Point3dClass would add roughly another 32 or 40 bytes (24 bytes of fields plus an 8- or 16-byte object header). If the array holds references to 1,000,000 different instances of Point3dClass, those would total up to another 32,000,000 or 40,000,000 bytes.
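The sharing behaviour described here can be sketched as follows. ImmutablePoint and SharedPointStore are hypothetical names; DistinctInstances relies on the class not overriding Equals, so the HashSet compares by reference.

```csharp
using System.Collections.Generic;

public sealed class ImmutablePoint
{
    private readonly double x, y, z;

    public ImmutablePoint(double x, double y, double z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public double X { get { return x; } }
    public double Y { get { return y; } }
    public double Z { get { return z; } }
}

public sealed class SharedPointStore
{
    private readonly ImmutablePoint[] array;

    public SharedPointStore(int count)
    {
        array = new ImmutablePoint[count];
        var zero = new ImmutablePoint(0.0, 0.0, 0.0);
        for (int i = 0; i < count; i++)
            array[i] = zero;               // every slot shares one instance
    }

    public double GetX(int index) { return array[index].X; }

    // Allocates a new instance; other slots sharing the old one are unaffected.
    public void SetX(int index, double value)
    {
        ImmutablePoint old = array[index];
        array[index] = new ImmutablePoint(value, old.Y, old.Z);
    }

    // Copying the reference is safe because instances never change.
    public void CopyCoord(int src, int dest) { array[dest] = array[src]; }

    // Number of separate heap objects currently referenced by the array.
    public int DistinctInstances()
    {
        return new HashSet<ImmutablePoint>(array).Count;
    }
}
```

A freshly created store references a single instance; each SetX adds one allocation, and CopyCoord adds none.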

If code will be using methods analogous to CopyCoord more often than it will be using methods analogous to SetX, then an immutable class can be a big win. If it will be using SetX a lot, an exposed-field mutable struct will offer the best performance. Mutable class types may play nicer than mutable structs when stored in collections other than arrays, but they have a substantial performance overhead and must be used with extreme care. The only advantage of immutable structs is that code which is written for an immutable struct can often be changed easily to use an immutable class instead.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow