Using MongoDB data inside Hadoop with the help of Morphia
03-07-2021
Question
I've been playing with the MongoInputFormat, which feeds every document in a MongoDB collection through a MapReduce job written for Hadoop.
As the provided examples (this, this and this) show, the document handed to the mapper is a BSONObject (a Java interface).
I also like Morphia very much; it maps the raw data from MongoDB into POJOs that are much easier to use.
Since I can only get a BSONObject as input, I thought about using the method described at the bottom of this page of the Morphia wiki:
BlogEntry blogEntry = morphia.fromDBObject(BlogEntry.class, blogEntryDbObj);
My problem is that this method requires a DBObject rather than a BSONObject. A DBObject is declared as:
public interface DBObject extends BSONObject
So, as you can see, DBObject is the subtype, which means I cannot simply cast from BSONObject to DBObject and call the provided method.
What is the best way to handle this?
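To make the casting problem concrete, here is a minimal sketch using stand-in interfaces. `BsonLike`, `DbLike` and `PlainBson` are hypothetical names invented for this illustration; they only mirror the parent/child relationship between BSONObject and DBObject and are not the real driver types.

```java
import java.util.HashMap;
import java.util.Map;

class CastDemo {
    // Stand-in for the parent interface (plays the role of BSONObject).
    interface BsonLike {
        Object get(String key);
    }

    // Stand-in for the child interface (plays the role of DBObject).
    interface DbLike extends BsonLike {
        boolean isPartialObject();
    }

    // Implements only the parent interface, like the object the mapper receives.
    static class PlainBson implements BsonLike {
        private final Map<String, Object> fields = new HashMap<>();

        PlainBson put(String key, Object value) {
            fields.put(key, value);
            return this;
        }

        public Object get(String key) {
            return fields.get(key);
        }
    }

    // Returns true if the downcast succeeds, false if it throws ClassCastException.
    static boolean canDowncast(BsonLike value) {
        try {
            DbLike ignored = (DbLike) value;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        BsonLike doc = new PlainBson().put("title", "Hello");
        System.out.println(doc instanceof DbLike); // false
        System.out.println(canDowncast(doc));      // false
    }
}
```

The cast compiles, because the compiler cannot rule out that some `BsonLike` also implements `DbLike`, but it fails at runtime whenever the concrete class implements only the parent interface.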
Solution
You'll notice that the BSONObject and DBObject interfaces are very similar. Just because a conversion doesn't exist doesn't mean it isn't easy to create a trivial one:
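import com.mongodb.DBObject;
import org.bson.BSONObject;
import org.bson.BasicBSONObject;

// Copies a BSONObject's fields into an object that also implements DBObject.
class BSONDBObject extends BasicBSONObject implements DBObject {
    boolean ispartial = false;

    public BSONDBObject(BSONObject source) {
        this.putAll(source); // copies the top-level fields of the source
    }

    public boolean isPartialObject() {
        return ispartial;
    }

    public void markAsPartialObject() {
        this.ispartial = true;
    }
}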
Now you just need to wrap the incoming object and hand it to Morphia:
BSONObject bson; // Filled by the MongoInputFormat
BSONDBObject dbo = new BSONDBObject(bson);
EntityCache entityCache = new DefaultEntityCache();
BlogEntry blogEntry = morphia().fromDBObject(BlogEntry.class, dbo, entityCache);
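One property of the wrapper above that may be worth checking is that a `putAll`-style copy is shallow. The sketch below illustrates this with stand-in types; `BsonLike`, `DbLike`, `BasicBson` and `BsonDbWrapper` are hypothetical names for this illustration, not the real org.bson or com.mongodb classes. Nested documents are copied by reference and keep their original type. Whether that matters depends on how the mapper treats embedded documents, which is an assumption to verify against your entities.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WrapDemo {
    // Stand-in for the parent interface (plays the role of BSONObject).
    interface BsonLike {
        Object get(String key);
        Map<String, Object> toMap();
    }

    // Stand-in for the child interface (plays the role of DBObject).
    interface DbLike extends BsonLike {
        boolean isPartialObject();
        void markAsPartialObject();
    }

    // Stand-in for a basic map-backed implementation of the parent interface.
    static class BasicBson implements BsonLike {
        protected final Map<String, Object> fields = new LinkedHashMap<>();

        public Object put(String key, Object value) { return fields.put(key, value); }
        public void putAll(BsonLike other) { fields.putAll(other.toMap()); }
        public Object get(String key) { return fields.get(key); }
        public Map<String, Object> toMap() { return new LinkedHashMap<>(fields); }
    }

    // Same shape as the wrapper in the answer: copy the fields, add the extra methods.
    static class BsonDbWrapper extends BasicBson implements DbLike {
        private boolean partial = false;

        BsonDbWrapper(BsonLike source) { putAll(source); }

        public boolean isPartialObject() { return partial; }
        public void markAsPartialObject() { partial = true; }
    }

    public static void main(String[] args) {
        BasicBson author = new BasicBson();
        author.put("name", "Ann");
        BasicBson entry = new BasicBson();
        entry.put("title", "Hello");
        entry.put("author", author);

        DbLike wrapped = new BsonDbWrapper(entry);
        System.out.println(wrapped.get("title"));                    // Hello
        System.out.println(wrapped.get("author") instanceof DbLike); // false
        System.out.println(wrapped.get("author") == author);         // true
    }
}
```

If embedded documents also need to satisfy the child interface, the constructor could walk the values and wrap nested documents recursively instead of copying them as-is.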
Other tips
I found a solution that works for me:
- First serialize the BSONObject to a JSON string.
- Parse that string into a DBObject.
- Map the DBObject to a useful instance using Morphia.
Effectively, I now have something like this:
BSONObject bson; // Filled by the MongoInputFormat
EntityCache entityCache = new DefaultEntityCache();
String json = JSON.serialize(bson);
DBObject dbObject = (DBObject) JSON.parse(json);
BlogEntry blogEntry = morphia().fromDBObject(BlogEntry.class, dbObject, entityCache);