Question

I have a large XML file (2 GB to 4 GB) that I receive from a third-party vendor. I need to perform two kinds of operations on it, for different use cases:

Use case 1:

Validate some of the values in the XML.

Use case 2:

Unmarshal the XML into Java objects for some processing.

My first question is: what is the best way of doing this? Can I validate specific elements without parsing the whole file for use case 1?

My second question is: how can I unmarshal such a big file without getting an OutOfMemoryError, or at least with the best use of memory, for use case 2?

Solution

If the document is that large, you will want to use an event-based parser such as SAX and build (unmarshal) the objects by hand. This lets you process the document as a stream rather than reading the whole document into memory at once.

Here's a brief article discussing event-based parsing versus tree-based parsing:

http://www.saxproject.org/event.html
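
As a rough illustration of the approach, here is a minimal sketch using the JDK's built-in SAX parser. The element names (`item`, `id`, `price`), the `Item` class and the validation rule are all hypothetical, since the vendor's schema isn't shown in the question; substitute your own. The parser still reads the file sequentially from start to end, but only one record is held in memory at a time, so the same handler can cover both the validation and the object-building use cases.

```java
import java.io.InputStream;
import java.math.BigDecimal;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class StreamingItems {

    /** Hypothetical record type; the real vendor schema is an assumption. */
    public static final class Item {
        public String id;
        public String price;
        public boolean valid;
    }

    /** Hypothetical rule: price must be a non-negative decimal number. */
    static boolean isValid(Item item) {
        if (item.id == null || item.price == null) {
            return false;
        }
        try {
            return new BigDecimal(item.price).signum() >= 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Streams the document and hands each completed Item to the sink. */
    public static void parse(InputStream in, Consumer<Item> sink) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            private Item current;                                    // record being built
            private final StringBuilder text = new StringBuilder(); // text of current element

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                text.setLength(0);
                if ("item".equals(qName)) {
                    current = new Item();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one element's text in several chunks.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current != null) {
                    switch (qName) {
                        case "id":
                            current.id = text.toString().trim();
                            break;
                        case "price":
                            current.price = text.toString().trim();
                            break;
                        case "item":
                            current.valid = isValid(current);
                            sink.accept(current); // process it, then let it be garbage collected
                            current = null;
                            break;
                        default:
                            break;                // ignore elements we don't care about
                    }
                }
                text.setLength(0);
            }
        });
    }
}
```

You would call it with something like `StreamingItems.parse(new FileInputStream(path), item -> process(item))`, where `process` is whatever you need to do with each record. As long as the sink does not collect every item into a list, memory use stays roughly constant regardless of file size.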

Licensed under: CC-BY-SA with attribution