StAX vs SAX vs DOM parser? [duplicate]

https://stackoverflow.com/questions/23615771

21-07-2023

Question

I went through the wiki link and have some questions based on it.

As I understand it, we should choose one of these parsers based on the criteria below:

DOM parser: the file is small and we need to traverse it in both directions, i.e. backward and forward.

SAX parser: use this when there is no requirement to move backward. Whether the file is small or large does not matter, because SAX is always better in terms of performance.

Is this correct?

I recently heard about StAX and went through the wiki link. It says:

StAX was designed as a median between these two opposites (DOM and SAX).

This gave me the impression that we can move both backward and forward with StAX, but searching online says we can only move forward with StAX. So how does StAX offer the advantage of DOM?

The link also says:

The application moves the cursor forward - 'pulling' the information from the parser as it needs. This is different from an event based API - such as SAX - which 'pushes' data to the application - requiring the application to maintain state between events as necessary to keep track of location within the document

So StAX uses a pull approach while SAX uses a push approach, but why does it matter to the developer whether it is pull or push, unless one of them performs better or requires less effort?
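To make the quoted contrast concrete, here is a minimal sketch that extracts the same data with both APIs from the standard JDK (`javax.xml.parsers` for SAX, `javax.xml.stream` for StAX). The sample document, the class name `PushVsPull`, and the method names are assumptions made up for illustration. Note how the SAX handler has to keep an `inTitle` flag between callbacks, while the StAX loop simply asks for the next event when it wants one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PushVsPull {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<books><book><title>A</title></book><book><title>B</title></book></books>";

    // Push (SAX): the parser drives and calls our handler, so the handler
    // must remember in a field whether it is currently inside <title>.
    public static List<String> titlesWithSax() throws Exception {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inTitle = false;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    inTitle = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inTitle) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString());
                    inTitle = false;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(XML)), handler);
        return titles;
    }

    // Pull (StAX): our loop drives and asks the parser for the next event,
    // so the position in the document is implicit in the control flow.
    public static List<String> titlesWithStax() throws Exception {
        List<String> titles = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(XML));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "title".equals(reader.getLocalName())) {
                titles.add(reader.getElementText());
            }
        }
        reader.close();
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titlesWithSax());  // [A, B]
        System.out.println(titlesWithStax()); // [A, B]
    }
}
```

Both methods read forward only. The difference is who owns the loop, which mostly shows up as how much bookkeeping the application code has to carry.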

Solution

The push model works well for non-blocking channels, and the pull model works well for blocking streams.

Say you have an event loop reading from a non-blocking socket (e.g. NIO) and you have a piece of data. For this model you want push, so you can push the data you have to the parser and move on to some other work.

With a pull model, the parser tells you when it needs to read more data. This makes it harder to share that thread with other work. The common solution for NIO is to read the entire unparsed document into memory first and then pass it to the pull parser. Obviously this doesn't work so well for low-latency use cases or large documents.
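The "buffer first, then pull" workaround described above might look roughly like the following sketch. The class name `BufferThenPull` and its methods are hypothetical, and the channel here is a stand-in built from an in-memory stream. With a real non-blocking `SocketChannel`, `read()` can return 0, so the accumulation would be driven by selector readiness events rather than by a simple loop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class BufferThenPull {
    // Step 1: accumulate the whole unparsed document in memory.
    // This is where the memory and latency cost of the workaround comes from.
    public static byte[] readAll(ReadableByteChannel channel) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteBuffer buf = ByteBuffer.allocate(16); // tiny on purpose, to force several reads
        while (channel.read(buf) != -1) {
            buf.flip();
            out.write(buf.array(), 0, buf.limit());
            buf.clear();
        }
        return out.toByteArray();
    }

    // Step 2: hand the complete document to the pull parser, which can now
    // "pull" as much as it wants without ever waiting on the network.
    public static int countElements(byte[] document) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new ByteArrayInputStream(document));
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        reader.close();
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document standing in for data received over a socket.
        byte[] xml = "<a><b/><c>text</c></a>".getBytes(StandardCharsets.UTF_8);
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(xml));
        System.out.println(countElements(readAll(channel))); // 3
    }
}
```

Parsing cannot begin until the last byte has arrived, which is why this approach suits neither low-latency processing nor very large documents.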

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow