Convert mystery bytea representation of Java int[][] back to array (Hibernate)

Question

Short version: Looks like Hibernate was saving a serialized Java object of type int[][]. Gross. Deserialize it in Java using a ByteArrayInputStream wrapped in an ObjectInputStream.

To get the raw bytes, a Python snippet like:

points_hex = open("/path/to/points.txt").readlines()
points = [ p[2:-1].strip().decode("hex") for p in points_hex ]

works. I suspected there was a common prefix, so I checked. Courtesty of this nice easy longest-common-prefix algo:

from itertools import takewhile,izip

def allsame(x):
    return len(set(x)) == 1

r = [i[0] for i in takewhile(allsame ,izip(*points))]
''.join(r).encode("hex")

it's confirmed to be:

\xaced0005757200035b5b4917f7e44f198f893c020000787000000004757200025b494dba602676eab2a50200007870000000020000

The presence of a prefix like this strongly suggests that we're dealing with a serialized Java object, not a bytea representation of a series of points in an array. That's easily confiremed with file, after writing a point's raw binary to a file:

open("/tmp/point","w").write(points[0])

then in the shell:

$ file /tmp/point 
/tmp/point: Java serialization data, version 5

You should decode these points by de-serializing them into an int[][] using Java. It'll be possible for a simple object like int[][] in SQL, but there's no point doing it manually when you can just ask Java to handle it for you.

Update:

I was feeling nice, so here's Java code to decode it:

import java.io.*;
import java.util.Arrays;

public class Deserialize {

    // Credit: https://stackoverflow.com/a/140861/398670
    public static byte[] hexStringToByteArray(String s) {
        int len = s.length();
        byte[] data = new byte[len / 2];
        for (int i = 0; i < len; i += 2) {
            data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                                 + Character.digit(s.charAt(i+1), 16));
        }
        return data;
    }


    public static void main( String[] args ) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage: java Deserialize aced....hexstring...");
            System.exit(1);
        }

        String hex = args[0];
        if (hex.startsWith("\\x")) {
            hex = hex.substring(2);
        }

        ByteArrayInputStream bis = new ByteArrayInputStream(hexStringToByteArray(hex));
        ObjectInput in = new ObjectInputStream(bis);
        Object obj_read = in.readObject();

        if (obj_read instanceof int[][]) {
            int[][] obj_arr = (int[][]) obj_read;
            System.err.println("Array contents are: " + Arrays.deepToString(obj_arr) );
        }
    }

}

Usage:

$ javac Deserialize.java 
$ java Deserialize '\xaced0005757200035b5b4917f7e44f198f893c020000787000000004757200025b494dba602676eab2a50200007870000000020000021a000002f87571007e000200000002000002a4000002f87571007e000200000002000002a40000037a7571007e0002000000020000021a0000037a'
Array contents are: [[538, 760], [676, 760], [676, 890], [538, 890]]

Of course, in reality you'll be using PgJDBC, which hands you a byte[] directly, so you don't have to do the hex decoding I do above.

Convert mystery bytea representation of Java int[][] back to array (Hibernate)

Context