I suspect the biggest problem is down to the way you are inserting data into the database with Hibernate.
When you call either EntityManager.persist() or EntityManager.merge(), the entity you are working with (or, in the case of merge(), a managed copy of it) is added to the PersistenceContext of your EntityManager instance (it is worth getting your head around entity lifecycles as described here).
You can think of the PersistenceContext as a kind of cache that Hibernate uses to avoid unnecessary trips to the database for objects it has already loaded within the current unit of work. In addition, Hibernate uses the PersistenceContext to perform dirty checking, so that it knows which objects need to be flushed when the transaction commits.
This is fine with a small number of objects. The problem comes when you're working with a very large number of objects, as Hibernate keeps a reference to each and every object in the PersistenceContext for the reasons explained above.
Therefore, when you're doing large batch inserts, it is important to carefully manage the size of the PersistenceContext, either by explicitly flushing and clearing it at regular intervals, or by using a stateless session (Hibernate's StatelessSession, which has no PersistenceContext) for the bulk insertions.
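As a rough illustration of the flush-and-clear approach, here is a minimal sketch. So that it runs without JPA on the classpath, it uses a hypothetical PersistenceOps interface standing in for the three EntityManager methods involved (persist, flush, clear), plus a hypothetical CountingOps fake that just counts how many objects are held at once. The class names, method name and batch size are my own assumptions, not anything from your code.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchInsertSketch {

    /** Hypothetical stand-in for the EntityManager methods used below. */
    interface PersistenceOps {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /**
     * Persists every entity, flushing and clearing after each full batch
     * so that at most batchSize objects are held at any one time.
     * Returns the number of flush/clear cycles performed.
     */
    static int insertInBatches(PersistenceOps em, List<?> entities, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int cycles = 0;
        int pending = 0;
        for (Object entity : entities) {
            em.persist(entity);
            pending++;
            if (pending == batchSize) {
                em.flush();  // push the pending inserts to the database
                em.clear();  // drop the references held by the context
                cycles++;
                pending = 0;
            }
        }
        if (pending > 0) {   // final partial batch
            em.flush();
            em.clear();
            cycles++;
        }
        return cycles;
    }

    /** In-memory fake that tracks how many objects the "context" holds. */
    static class CountingOps implements PersistenceOps {
        int managed = 0;    // objects currently referenced
        int unflushed = 0;  // objects persisted but not yet flushed
        int peak = 0;       // highest value of managed seen so far
        int flushed = 0;    // total objects written out

        @Override public void persist(Object entity) {
            managed++;
            unflushed++;
            peak = Math.max(peak, managed);
        }
        @Override public void flush() {
            flushed += unflushed;
            unflushed = 0;
        }
        @Override public void clear() {
            managed = 0;
        }
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(i);
        }
        CountingOps ops = new CountingOps();
        int cycles = insertInBatches(ops, rows, 4);
        System.out.println("cycles=" + cycles + " peak=" + ops.peak + " flushed=" + ops.flushed);
    }
}
```

With a real EntityManager you would call em.persist(), em.flush() and em.clear() directly inside your transaction instead of going through the stand-in interface. It usually makes sense to pick a batch size that matches your hibernate.jdbc.batch_size setting, if you have one configured.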
Hibernate has a good explanation of how to work with "a lot" of entities in one go here. I suspect that following that advice will solve most of your memory problems.