Question

I am writing an application that needs to unmarshal a huge XML file using Castor. Because the file is so large, I need to use a streaming XML parser such as StAX. According to Castor's documentation, its default parser is Xerces. I checked the Xerces home page but could not find any information on whether Xerces is a streaming parser.

Does anyone know whether Xerces is a streaming parser? Thank you.

Solution

From http://en.wikipedia.org/wiki/Xerces:

Xerces is Apache's collection of software libraries for parsing, validating, serializing and manipulating XML. The library implements a number of standard APIs for XML parsing, including DOM, SAX and SAX2

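So it supports both a streaming API (SAX, which delivers parse events as the input is read) and a non-streaming API (DOM, which builds the whole document tree in memory). See http://xerces.apache.org/#xerces2-j for all supported APIs.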
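To illustrate what "streaming" means for SAX, here is a minimal sketch that counts elements without building a tree. It uses the JAXP `SAXParserFactory` from the JDK rather than calling Xerces classes directly; the class name `SaxCountExample`, the method `countElements`, and the sample XML are made up for this example, and it does not show how Castor itself drives the parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountExample {

    // Hypothetical helper: counts elements named `name` as parse events
    // arrive, so no document tree is ever held in memory.
    static int countElements(InputStream in, String name) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // The factory is not namespace-aware by default,
                // so qName holds the element name as written.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Sample document, invented for illustration.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        InputStream in =
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in, "order"));
    }
}
```

With a real file you would pass a `FileInputStream` instead; memory use then depends on what the handler keeps, not on the size of the document.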

Other answers

The FAQ has some advice on how to handle this situation. Quoting the docs:

How do I read data from a stream as it arrives?

There are 3 problems you have to deal with:

  • The Apache parsers read the entire data stream into a buffer before they start parsing; you need to change this behaviour, so that they analyse "on the fly"
  • The Apache parsers terminate when they reach end-of-file; with a data stream, unless the sender drops the socket, you have no end-of-file, so you need to terminate in some other way
  • The Apache parsers close the input stream on termination, and this closes the socket; you normally don't want this, because you'll want to send an ack to the data stream source, and you may want to have further exchanges on the socket anyway.
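The third point above (the parser closing the input stream) is commonly handled with a wrapper stream that ignores `close()`. The following is only a sketch of that idea, not code taken from the FAQ: the names `NonClosingStreamExample` and `NonClosingInputStream` are hypothetical, and a `ByteArrayInputStream` stands in for a socket stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class NonClosingStreamExample {

    // Hypothetical wrapper: delegates reads to the wrapped stream but
    // swallows close(), so the underlying stream (e.g. a socket) stays open.
    static class NonClosingInputStream extends FilterInputStream {
        private boolean closeRequested = false;

        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Record the request instead of closing the wrapped stream.
            closeRequested = true;
        }

        boolean wasCloseRequested() {
            return closeRequested;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for socket.getInputStream(); invented sample data.
        InputStream raw = new ByteArrayInputStream(
            "<msg>hello</msg>".getBytes(StandardCharsets.UTF_8));
        NonClosingInputStream guarded = new NonClosingInputStream(raw);

        // Parse through the wrapper; any close() issued by the parser
        // stops at the wrapper.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(guarded, new DefaultHandler());

        System.out.println("close requested by parser: "
            + guarded.wasCloseRequested());
    }
}
```

The caller remains responsible for closing the real stream once the exchange is finished. This sketch does not address the first two points (buffering and end-of-file detection).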
License: CC-BY-SA with attribution
Not affiliated with StackOverflow