Implementing a Number System in Java: Mutable vs. Immutable

https://stackoverflow.com/questions/19573242

01-07-2022
|

Question

I am implementing classes for rational numbers, but the problems and issues are essentially the same for complex numbers as well as other classes intended to be used in applications with a significant number of calculations performed on a given mathematical object.

In the libraries distributed with the JRE and in many third-party libraries, number classes are immutable. This has the advantage that "equals" and "hashcode" can be reliably implemented together as intended. This will enable instances to be used as both keys and values in various collections. In fact, immutability of an instance throughout its lifetime as a key value in a collection must be maintained for reliable operations on the collection. This is much more robustly maintained if the class prevents operations which may alter the internal state on which the hashcode method relies once an instance is created than if it is left up to the developer and subsequent maintainers of the code to abide by the convention of removing instances from collections before modifying their state followed by adding the instances back to whichever collections they must belong.

Yet, if the class design enforces -- within the limits of the language -- immutability, mathematical expressions become burdened with excessive object allocation and subsequent garbage collection when performing even simple mathematical operations. Consider the following as an explicit example of what occurs repeatedly in complex computations:

Rational result = new Rational( 13L, 989L ).divide( new Rational( -250L, 768L ) );

The expression includes three allocations -- two of which are quickly discarded. To avoid some of the overhead, classes typically preallocate commonly used "constants" and may even maintain a hash table of frequently used "numbers." Of course, such a hash table would likely be less performant than simply allocating all of the necessary immutable objects and relying on the Java compiler and JVM to manage the heap as efficiently as possible.

The alternative is to create classes which support mutable instances. By implementing the methods of the classes in a fluent style, it is possible to evaluate concise expressions functionally similar to the above without allocating a third object to be returned from the "divide" method as the "result." Again, this is not particularly significant for this one expression. However, solving complex linear algebra problems by operating on matrices is a more realistic case for mathematical objects which are better processed as mutable objects rather than having to operate on immutable instances. And for matrices of rational numbers, a mutable rational number class would seem to be much more easily justified.

With all that, I have two related questions:

Is there anything about the Sun/Oracle Java compiler, JIT, or JVM which would conclusively recommend immutable rational or complex number classes over mutable classes?
If not, how should "hashcode" be handled when implementing mutable classes? I am inclined to "fail-fast" by throwing an unsupported operation exception rather than providing either an implementation prone to misuse and unnecessary debugging sessions or one which is robust even when the state of immutable objects change, but which essentially turns hash tables into linked lists.

Test Code:

For those wondering whether immutable numbers matter when performing calculations roughly similar to those I need to implement:

import java.util.Arrays;

public class MutableOrImmutable
{
    private int[] pseudomatrix = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
                                   1, 2, 0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 3, 4, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0, 5, 5, 0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 4, 3, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0, 2, 1 };

    private int[] scalars = { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

    private static final int ITERATIONS = 500;

    private void testMutablePrimitives()
    {
        int[] matrix = Arrays.copyOf( pseudomatrix, pseudomatrix.length );

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] *= scalar;
                }
            }

            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] /= scalar;
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for mutable primitives: " + elapsedTime );

        assert Arrays.equals( matrix, pseudomatrix ) : "The matrices are not equal.";
    }

    private void testImmutableIntegers()
    {
        // Integers are autoboxed and autounboxed within this method.

        Integer[] matrix = new Integer[ pseudomatrix.length ];

        for ( int index = 0 ; index < pseudomatrix.length ; ++index )
        {
            matrix[ index ] = pseudomatrix[ index ];
        }

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ] * scalar;
                }
            }

            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ] / scalar;
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for immutable integers: " + elapsedTime );

        for ( int index = 0 ; index < matrix.length ; ++index )
        {
            if ( matrix[ index ] != pseudomatrix[ index ] )
            {
                // When properly implemented, this message should never be printed.

                System.out.println( "The matrices are not equal." );

                break;
            }
        }
    }

    private static class PseudoRational
    {
        private int value;

        public PseudoRational( int value )
        {
            this.value = value;
        }

        public PseudoRational multiply( PseudoRational that )
        {
            return new PseudoRational( this.value * that.value );
        }

        public PseudoRational divide( PseudoRational that )
        {
            return new PseudoRational( this.value / that.value );
        }
    }

    private void testImmutablePseudoRationals()
    {
        PseudoRational[] matrix = new PseudoRational[ pseudomatrix.length ];

        for ( int index = 0 ; index < pseudomatrix.length ; ++index )
        {
            matrix[ index ] = new PseudoRational( pseudomatrix[ index ] );
        }

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ].multiply( new PseudoRational( scalar ) );
                }
            }

            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ].divide( new PseudoRational( scalar ) );
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for immutable pseudo-rational numbers: " + elapsedTime );

        for ( int index = 0 ; index < matrix.length ; ++index )
        {
            if ( matrix[ index ].value != pseudomatrix[ index ] )
            {
                // When properly implemented, this message should never be printed.

                System.out.println( "The matrices are not equal." );

                break;
            }
        }
    }

    private static class PseudoRationalVariable
    {
        private int value;

        public PseudoRationalVariable( int value )
        {
            this.value = value;
        }

        public void multiply( PseudoRationalVariable that )
        {
            this.value *= that.value;
        }

        public void divide( PseudoRationalVariable that )
        {
            this.value /= that.value;
        }
    }

    private void testMutablePseudoRationalVariables()
    {
        PseudoRationalVariable[] matrix = new PseudoRationalVariable[ pseudomatrix.length ];

        for ( int index = 0 ; index < pseudomatrix.length ; ++index )
        {
            matrix[ index ] = new PseudoRationalVariable( pseudomatrix[ index ] );
        }

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( PseudoRationalVariable variable : matrix )
                {
                    variable.multiply( new PseudoRationalVariable( scalar ) );
                }
            }

            for ( int scalar : scalars )
            {
                for ( PseudoRationalVariable variable : matrix )
                {
                    variable.divide( new PseudoRationalVariable( scalar ) );
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for mutable pseudo-rational variables: " + elapsedTime );

        for ( int index = 0 ; index < matrix.length ; ++index )
        {
            if ( matrix[ index ].value != pseudomatrix[ index ] )
            {
                // When properly implemented, this message should never be printed.

                System.out.println( "The matrices are not equal." );

                break;
            }
        }
    }

    public static void main( String [ ] args )
    {
        MutableOrImmutable object = new MutableOrImmutable();

        object.testMutablePrimitives();
        object.testImmutableIntegers();
        object.testImmutablePseudoRationals();
        object.testMutablePseudoRationalVariables();
    }
}

Footnote:

The core problem with mutable vs. immutable classes is the -- highly questionable --"hashcode" method on Object:

The general contract of hashCode is:

Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

But once an object is added to a collection dependent on the value of its hash code derived from its internal state used to determine "equality," it is no longer properly hashed into the collection when it's state changes. Yes, the burden is on the programmer to ensure that mutable objects are not improperly stored in collections, but the burden is even greater on the maintenance programmer unless improper use of a mutable class is not prevented in the first place. This is why I believe the right "answer" for "hashcode" on mutable objects is to always throw an UnsupportedOperationException while still implementing "equals" to determine object equality -- think of matrices which you want to compare for equality, but would never think to add to a Set. However, there may be an argument that throwing an exception is a violation of the above "contract" with dire consequences of its own. In that case, hashing all instances of a mutable class to the same value may be the "correct" way to maintain the contract despite the very poor nature of the implementation. Is returning a constant value -- perhaps generated from hashing the class name -- recommended over throwing an exception?

Solution 3

Currently, I am implementing the rational numbers with immutable objects. This allows heavy reuse of ZERO and ONE objects which occur frequently in the computations I need to perform. However, the Matrix class which is implemented with rational number elements is mutable -- going so far as to use null internally for "virtual" zeros. With a more pressing need to handle "small" rational numbers and arbitrary-precision "big" rational numbers seamlessly, the immutable implementation is acceptable for now until I have time to profile the library of problems I have available for that purpose so as to determine whether mutable objects or a larger set of "common" immutable objects will win the day.

Of course, if I end up needing to implement "equals" to test Matrix equality, I will be back to the same issue with Matrix "hashcode" when the possible need for the method is highly unlikely. Which takes me back again to the rather useless complaint that "hashcode" -- and probably "equals" as well -- should never have been part of the java.lang.Object contract in the first place...

OTHER TIPS

You have written: "mathematical expressions become burdened with excessive object allocation and subsequent garbage collection when performing even simple mathematical operations." and "The expression includes three allocations -- two of which are quickly discarded".

Modern Garbage collectors are actually optimised for this pattern of allocation, so your (implicit) assumption that allocation and subsequent garbage collection is expensive, is wrong.

For example, see this white-paper: http://www.oracle.com/technetwork/java/whitepaper-135217.html#garbage Under "Generational Copying Collection", it states:

"... First, because new objects are allocated contiguously in stack-like fashion in the object nursery, allocation becomes extremely fast, since it merely involves updating a single pointer and performing a single check for nursery overflow. Secondly, by the time the nursery overflows, most of the objects in the nursery are already dead, allowing the garbage collector to simply move the few surviving objects elsewhere, and avoid doing any reclamation work for dead objects in the nursery."

Thus I would suggest that the answer to your real question is that you should use Immutable objects, because the perceived costs are not really costs at all, but the perceived benefits (eg simplicity, code readability) are real benefits.

One pattern which can be useful is to define an abstract type or interface for a "readable" thing, and then have both mutable and immutable forms of it. This pattern can be particularly nice if the base or interface types includes AsMutable, AsNewMutable, and AsImmutable methods which can be overridden in suitable fashion in a derived object. Such an approach allows one to achieve the benefits of mutability when they are desired, and yet also receive the benefits of using an immutable type. Code which wants to persist a value but not mutate it would have to use "defensive copying" if it worked with a mutable type, but could instead use AsImmutable if it receives a "readable" thing. If the thing happens to be mutable, it would make a copy, but if it's immutable, no copy would be necessary.

Incidentally, if one is designing an immutable type with relatively few fields other than a reference to a large object that holds the acutal data, and if things of the type will frequently be compared for equality, it may be helpful to have each type hold a unique sequence number as well as a reference to the oldest instance, if any, to which it is known to be equal (or null if no older instance is known to exist). When comparing two instances for equality, determine the oldest instance which is known to matches each (recursively check the oldest known instance until it is null). If both instances are known to match the same instance, they are equal. If not, but they turn out to be equal, then whichever "older instance" was younger should regard the other as an older instance to which it is equal. The approach will yield accelerated comparisons much as interning would, but without the use of a separate interning dictionary, and without having to hash values.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow