Question

I asked a very similar question in a previous thread, "Hadoop: How can I have an array of doubles as a value in a key-value pair?".

My problem is that I want to pass a double array as the value from the map phase to the reduce phase. The answer I got was to serialize the array, convert it to Text, pass it to the reducer, and deserialize it there. This is a fine solution, but it amounts to serializing and deserializing the data twice.

ArrayWritable only accepts element types that implement Writable, such as FloatWritable. So another solution is to convert my array of doubles to an array of DoubleWritables. But that conversion takes time too, and creating one Writable object per element is expensive. Isn't there a simple solution, something like ArrayWritable array = new ArrayWritable(Double.class)?


Solution

Just write your own class that implements the Writable interface.

For example,

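import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class DoubleArrayWritable implements Writable {
    private double[] data;

    // Hadoop creates Writables reflectively, so a no-argument constructor is required.
    public DoubleArrayWritable() {
    }

    public DoubleArrayWritable(double[] data) {
        this.data = data;
    }

    public double[] getData() {
        return data;
    }

    public void setData(double[] data) {
        this.data = data;
    }

    // Wire format: the element count as an int, followed by each double.
    @Override
    public void write(DataOutput out) throws IOException {
        // A null array is written as length 0.
        int length = 0;
        if (data != null) {
            length = data.length;
        }

        out.writeInt(length);

        for (int i = 0; i < length; i++) {
            out.writeDouble(data[i]);
        }
    }

    // Reads the fields back in the same order that write() emitted them.
    @Override
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();

        data = new double[length];

        for (int i = 0; i < length; i++) {
            data[i] = in.readDouble();
        }
    }
}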
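
The wire format that write() and readFields() produce can be checked without a cluster. The following is a minimal sketch that uses only java.io streams. The class name DoubleArrayCodec and its encode/decode helpers are hypothetical and not part of Hadoop; they simply repeat the same calls as the two methods above (a 4-byte length followed by 8 bytes per element).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper (not part of Hadoop) that mirrors write()/readFields().
final class DoubleArrayCodec {

    // Same layout as write(): the element count as an int, then each double.
    static byte[] encode(double[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            int length = (data == null) ? 0 : data.length;
            out.writeInt(length);
            for (int i = 0; i < length; i++) {
                out.writeDouble(data[i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Same order as readFields(): read the count, then that many doubles.
    static double[] decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int length = in.readInt();
            double[] data = new double[length];
            for (int i = 0; i < length; i++) {
                data[i] = in.readDouble();
            }
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.5, -2.0, 3.25};
        byte[] bytes = encode(original);
        System.out.println(bytes.length);                            // 4 + 3 * 8 = 28
        System.out.println(Arrays.equals(original, decode(bytes)));  // true
    }
}
```

One consequence of this layout: a null array is written as length 0, so it comes back as an empty array rather than null.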

OTHER TIPS

You can specify double[] as the value type for a Map:

Map<String, double[]> map = new HashMap<String, double[]>(); // compiles

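Every Java array type implements Serializable, and an array of primitives such as double[] can always be written with standard Java serialization. Note that this applies to java.util.Map and java.io.Serializable; by default, Hadoop's map output values have to implement Writable instead.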
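
To illustrate the point about Java serialization, here is a small sketch that round-trips a double[] through ObjectOutputStream and ObjectInputStream. It uses only java.io, and the class name JavaSerializationDemo is made up for this example. It says nothing about how a Hadoop job is configured; it only shows that the array itself is serializable, and that the serialized form carries stream and class metadata on top of the raw 8 bytes per element.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class: standard Java serialization of a primitive array.
final class JavaSerializationDemo {

    static byte[] serialize(double[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(data);  // double[] is Serializable as-is
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static double[] deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (double[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.5, -2.0, 3.25};
        byte[] bytes = serialize(original);
        System.out.println(Arrays.equals(original, deserialize(bytes)));  // true
        // The payload is larger than the raw data because of the stream header
        // and the array's class descriptor.
        System.out.println(bytes.length > 8 * original.length);           // true
    }
}
```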

Licensed under: CC-BY-SA with attribution