Question

In one of my MapReduce programs, I create a new Text() on every call to context.write:

context.write(key, new Text(outputRecord.toDelimitedString("|")));

Given the statement above, I want to know how Text objects are stored and how their memory is managed. I also want to know what happens to an object once nothing refers to it any more.

Please let me know about this.


Solution

No, Text in Hadoop is not immutable. It can't be, because Hadoop's serialization contract effectively rules out immutability: Writable.readFields(DataInput) fills in an existing instance rather than creating a new one.
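To make that concrete, here is a minimal sketch that uses only the JDK. SimpleText is a hypothetical stand-in, not Hadoop's Text; it only mimics the shape of the Writable methods write(DataOutput) and readFields(DataInput). It also assumes a plain 4-byte length prefix for simplicity, whereas the real class uses a zero-compressed integer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a Writable-style type; NOT Hadoop's Text.
class SimpleText {
    private byte[] bytes = new byte[0];

    // Replace the current content with the UTF-8 bytes of s.
    void set(String s) {
        bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    // Serialize: a 4-byte length prefix followed by the UTF-8 bytes.
    void write(DataOutput out) throws IOException {
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Deserialize INTO this existing instance, overwriting its state.
    // A method of this shape cannot exist on an immutable class.
    void readFields(DataInput in) throws IOException {
        byte[] buf = new byte[in.readInt()];
        in.readFully(buf);
        bytes = buf;
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

class SimpleTextDemo {
    // Writes two records with one instance, then reads both back
    // into a single reused instance.
    static String[] roundTrip(String first, String second) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(sink);
            SimpleText writer = new SimpleText();
            writer.set(first);
            writer.write(out);   // bytes are copied into the buffer here
            writer.set(second);  // safe: the first record is already serialized
            writer.write(out);

            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(sink.toByteArray()));
            SimpleText reused = new SimpleText();
            reused.readFields(in);
            String a = reused.toString();
            reused.readFields(in); // same object, new content
            String b = reused.toString();
            return new String[] {a, b};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String s : roundTrip("a|b", "c|d")) {
            System.out.println(s);
        }
    }
}
```

The same property is why a single Text instance is commonly reused across context.write calls instead of allocating a new one per record: once the bytes have been copied out, the object can be overwritten.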

In this particular case, context.write serializes the content of the Text into a byte buffer inside the call itself, so the Text object becomes garbage soon after the method returns.

Keep in mind that while context.write is executing, the Text object is still referenced from the stack because it was passed in as an argument, so it is not eligible for garbage collection until the call returns.

Other suggestions

Most of your questions are answered by the Hadoop Text source code and its class documentation:

This class stores text using standard UTF8 encoding. It provides methods to serialize, deserialize, and compare texts at byte level. The type of length is integer and is serialized using zero-compressed format.

In addition, it provides methods for string traversal without converting the byte array to a string.

Also includes utilities for serializing/deserializing a string, coding/decoding a string, checking if a byte array contains valid UTF8 code, calculating the length of an encoded string.
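As a rough illustration of what "at byte level" means in the description above, the sketch below uses only the JDK, not Hadoop. Utf8BytesDemo and its method names are hypothetical; it shows that the encoded length is counted in UTF-8 bytes rather than Java chars, and that two strings can be ordered by comparing their encoded bytes directly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates the idea only, not Hadoop's implementation.
class Utf8BytesDemo {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int encodedLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Lexicographic comparison of the UTF-8 bytes, treating each byte as unsigned.
    static int compareBytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xff) - (y[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length; // the shorter one sorts first on a shared prefix
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // 5 chars; the accented letter needs 2 bytes
        System.out.println(s.length() + " chars, " + encodedLength(s) + " bytes");
    }
}
```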

The class isn't immutable, as you can see from the source code.

Regarding your question:

Also want to know about existence of a object value after not referred by any object

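That part is ordinary JVM behavior rather than anything Hadoop-specific: once no live reference can reach an object, it becomes eligible for garbage collection, and the collector reclaims it at some later, unspecified time. For the details, read about the JVM memory model and garbage collection.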
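A small JDK-only sketch of that reachability rule, using a WeakReference to observe whether an object is still alive. ReachabilityDemo and its method names are hypothetical. Note that System.gc() is only a request, so the second method usually returns true on common JVMs but is not guaranteed to.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; the names are illustrative only.
class ReachabilityDemo {
    // A strong reference held in a static field keeps its target reachable.
    static Object held;

    // While a strong reference exists, the object is never collected,
    // so the weak reference still resolves to it.
    static boolean survivesWhileReferenced() {
        held = new Object();
        WeakReference<Object> weak = new WeakReference<>(held);
        System.gc();
        return weak.get() != null;
    }

    // With no strong reference left, the object is merely ELIGIBLE for
    // collection; the JVM decides when, or whether, to reclaim it.
    static boolean collectedAfterUnreferenced() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        for (int i = 0; i < 10 && weak.get() != null; i++) {
            System.gc(); // a hint, not a command
        }
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + survivesWhileReferenced());
        System.out.println("collected once unreferenced: " + collectedAfterUnreferenced());
    }
}
```

A Text created inline in context.write falls into the second case as soon as the call returns, since nothing else holds a reference to it.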

Licensed under: CC-BY-SA with attribution