Question

Java: 1.6
Woodstox: 4.1.4

I just want to skip part of an XML file while parsing. Let's look at this simple XML:

<family>
    <mom>
        <data height="160"/>
    </mom>
    <dad>
        <data height="175"/>
    </dad>
</family>

I just want to skip the dad element. So it looks like using the skipElement method, as shown below, is a good idea:

FileInputStream fis = ...;
XMLStreamReader2 xmlsr = (XMLStreamReader2) xmlif.createXMLStreamReader(fis);

String currentElementName = null;
while(xmlsr.hasNext()){

    int eventType = xmlsr.next();

    switch(eventType){

        case (XMLEvent2.START_ELEMENT):
            currentElementName = xmlsr.getName().toString();

            if("dad".equals(currentElementName) == true){
                logger.info("isStartElement: " + xmlsr.isStartElement());
                logger.info("Element BEGIN: " + currentElementName);
                xmlsr.skipElement();
            }

                    ...
    }
}

We just find the start of the dad element and skip it. But not so fast, because an exception will be thrown. This is the output:

isStartElement: true
Element BEGIN: dad
Exception in thread "main" java.lang.IllegalStateException: Current state not START_ELEMENT

That is not what I expected. It is indeed very surprising, because skipElement is called in the START_ELEMENT state. I don't know what is going on, maybe you know more :). So please help me.

thanks in advance
Hubert


Solution

I've found the reason why I was getting the IllegalStateException. flup's answer was very useful. Thanks a lot.
Blaise's answer is worth reading too.

But to get to the heart of the matter: the problem was not the skipElement() method itself. It was caused by the methods used to read attributes. There are three dots (...) in my question, so let's look at what was there:

switch(eventType){

    case (XMLEvent2.START_ELEMENT):
        currentElementName = xmlsr.getName().toString();
        logger.info("currentElementName: " + currentElementName);

        if("dad".equals(currentElementName) == true){
            logger.info("isStartElement: " + xmlsr.isStartElement());
            logger.info("Element BEGIN: " + currentElementName);
            xmlsr.skipElement();
        }
        // no break here: execution falls through into the ATTRIBUTE case

    case (XMLEvent2.ATTRIBUTE):
        int attributeCount = xmlsr.getAttributeCount();
        ...
        break;

}

The important thing: there is no break statement for START_ELEMENT. So every time a START_ELEMENT event occurs, the code for the ATTRIBUTE event is also executed. That looks OK according to the Javadoc, because getAttributeCount(), getAttributeValue() etc. can be called in both the START_ELEMENT and ATTRIBUTE states.

But after skipElement() is called, the current event changes from START_ELEMENT to END_ELEMENT, so calling getAttributeCount() is no longer allowed. This call is the reason the IllegalStateException is thrown.
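The state change can be reproduced with a plain StAX reader, without the rest of the program. The following is only a sketch under assumptions: the class name StateAfterSkip and the helper skipElement(XMLStreamReader) are hypothetical, and the helper is a hand-written depth-counting stand-in for Woodstox's XMLStreamReader2.skipElement(), which likewise leaves the reader on the matching END_ELEMENT.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StateAfterSkip {

    // Hypothetical stand-in for XMLStreamReader2.skipElement():
    // advances from a START_ELEMENT to its matching END_ELEMENT.
    static void skipElement(XMLStreamReader r) throws XMLStreamException {
        if (r.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("Current state not START_ELEMENT");
        }
        int depth = 1;
        while (depth > 0) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Returns true if reading attributes right after the skip is rejected.
    static boolean attributeReadFailsAfterSkip(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT
                    && "dad".equals(r.getLocalName())) {
                skipElement(r);
                // The reader now sits on </dad>, not on <dad>.
                if (r.getEventType() != XMLStreamConstants.END_ELEMENT) {
                    return false;
                }
                try {
                    r.getAttributeCount(); // what the fall-through case did
                    return false;
                } catch (IllegalStateException expected) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<family><mom><data height=\"160\"/></mom>"
                + "<dad><data height=\"175\"/></dad></family>";
        System.out.println(attributeReadFailsAfterSkip(xml));
    }
}
```

The check on getEventType() is the useful part when debugging: logging the event type right before the failing call would have pointed at END_ELEMENT straight away.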

The simplest way to avoid the exception is to add a break statement right after the skipElement() call. In that case the code for reading attributes will not be executed, so the exception will not be thrown.

        if("dad".equals(currentElementName) == true){
            logger.info("isStartElement: " + xmlsr.isStartElement());
            logger.info("Element BEGIN: " + currentElementName);
            xmlsr.skipElement();
            break;                  //the cure for IllegalStateException
        }
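For completeness, here is a self-contained sketch of the whole loop with the break in place. It is not the original program: the class name SkipWithBreak, the readHeights method and the skipElement(XMLStreamReader) helper are hypothetical, and the helper only imitates Woodstox's skipElement() so that the example runs with the StAX implementation bundled with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SkipWithBreak {

    // Hypothetical stand-in for XMLStreamReader2.skipElement().
    static void skipElement(XMLStreamReader r) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Collects every "height" attribute outside the skipped dad element.
    static List<String> readHeights(String xml) throws XMLStreamException {
        List<String> heights = new ArrayList<String>();
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            switch (r.next()) {
            case XMLStreamConstants.START_ELEMENT:
                if ("dad".equals(r.getLocalName())) {
                    skipElement(r);
                    break; // reader is on </dad>; attributes must not be read
                }
                for (int i = 0; i < r.getAttributeCount(); i++) {
                    if ("height".equals(r.getAttributeLocalName(i))) {
                        heights.add(r.getAttributeValue(i));
                    }
                }
                break;
            default:
                break;
            }
        }
        return heights;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<family><mom><data height=\"160\"/></mom>"
                + "<dad><data height=\"175\"/></dad></family>";
        System.out.println(readHeights(xml));
    }
}
```

Here the attribute handling lives in the same case as the skip instead of relying on fall-through, so a missing break cannot route a skipped element into the attribute code.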

I'm sorry I gave you no chance to answer my original question because too much code was hidden.

Other tips

I tried this in Java 1.6 (jdk1.6.0_30) with woodstox-core-lgpl-4.1.4.jar and stax2-api-3.1.1.jar on the library path. My Java file is this:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

import org.codehaus.stax2.XMLStreamReader2;
import org.codehaus.stax2.evt.XMLEvent2;

public class Skip {

    public static void main(String[] args) throws FileNotFoundException,
            XMLStreamException {
        System.setProperty("javax.xml.stream.XMLInputFactory",
                "com.ctc.wstx.stax.WstxInputFactory");
        System.setProperty("javax.xml.stream.XMLOutputFactory",
                "com.ctc.wstx.stax.WstxOutputFactory");
        System.setProperty("javax.xml.stream.XMLEventFactory",
                "com.ctc.wstx.stax.WstxEventFactory");

        FileInputStream fis = new FileInputStream(new File("family.xml"));
        XMLInputFactory xmlif = XMLInputFactory.newFactory();
        XMLStreamReader2 xmlsr = (XMLStreamReader2) xmlif
                .createXMLStreamReader(fis);

        String currentElementName = null;
        while (xmlsr.hasNext()) {

            int eventType = xmlsr.next();

            switch (eventType) {

            case (XMLEvent2.START_ELEMENT):
                currentElementName = xmlsr.getName().toString();

                if ("dad".equals(currentElementName) == true) {
                    System.out.println("isStartElement: "
                            + xmlsr.isStartElement());
                    System.out.println("Element BEGIN: " + currentElementName);
                    xmlsr.skipElement();
                }
                else {
                    System.out.println(currentElementName);
                }

            }
        }
    }
}

Works like a charm. The output is:

family
mom
data
isStartElement: true
Element BEGIN: dad

Since Woodstox is a StAX (JSR-173) compliant parser, you could use a StAX StreamFilter to exclude events corresponding to certain elements. I prefer this approach so that you can keep the filtering logic separate from your application logic.

Demo

import javax.xml.stream.*;
import javax.xml.transform.stream.StreamSource;

public class Demo {

    public static void main(String[] args) throws Exception {
        XMLInputFactory xif = XMLInputFactory.newFactory();
        StreamSource xml = new StreamSource("src/forum14326598/input.xml");
        XMLStreamReader xsr = xif.createXMLStreamReader(xml);
        xsr = xif.createFilteredReader(xsr, new StreamFilter() {

            private boolean accept = true;

            @Override
            public boolean accept(XMLStreamReader reader) {
                if((reader.isStartElement() || reader.isEndElement()) && "dad".equals(reader.getLocalName())) {
                    accept = !accept;
                    return false;
                } else {
                    return accept;
                }
            }

        });

        while(xsr.hasNext()) {
            if(xsr.isStartElement()) {
                System.out.println("start: " + xsr.getLocalName());
            } else if(xsr.isCharacters()) {
                if(xsr.getText().trim().length() > 0) {
                    System.out.println("chars: " + xsr.getText());
                }
            } else if(xsr.isEndElement()) {
                System.out.println("end: " + xsr.getLocalName());
            }
            xsr.next();
        }
    }

}

Output

start: family
start: mom
start: data
end: data
end: mom
end: family
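One thing to keep in mind with a boolean toggle is that it assumes the filtered element never contains another element with the same name. If that can happen, a depth counter is a possible variation. The sketch below is an assumption-laden illustration rather than part of the demo above: the class names DepthFilterDemo and SkipSubtreeFilter and the method startElementNames are hypothetical, and like the toggle version it relies on accept() being called once per event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class DepthFilterDemo {

    // Rejects every event from <name> up to and including its matching </name>.
    static class SkipSubtreeFilter implements StreamFilter {
        private final String name;
        private int depth = 0; // > 0 while inside the skipped subtree

        SkipSubtreeFilter(String name) {
            this.name = name;
        }

        public boolean accept(XMLStreamReader reader) {
            if (depth > 0) {
                if (reader.isStartElement()) {
                    depth++;
                } else if (reader.isEndElement()) {
                    depth--;
                }
                return false;
            }
            if (reader.isStartElement() && name.equals(reader.getLocalName())) {
                depth = 1;
                return false;
            }
            return true;
        }
    }

    // Lists the start elements that survive the filter.
    static List<String> startElementNames(String xml, String skipped)
            throws XMLStreamException {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        XMLStreamReader xsr = xif.createFilteredReader(
                xif.createXMLStreamReader(new StringReader(xml)),
                new SkipSubtreeFilter(skipped));
        List<String> names = new ArrayList<String>();
        while (xsr.hasNext()) {
            if (xsr.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(xsr.getLocalName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<family><mom><data/></mom>"
                + "<dad><dad><data/></dad><data/></dad><kid/></family>";
        System.out.println(startElementNames(xml, "dad"));
    }
}
```

The counter tracks every nested element, not only those named dad, so the filter reopens exactly at the end tag that matches the first skipped start tag.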

It looks like xmlsr.skipElement() is the method that must consume the XMLEvent2.START_ELEMENT event. Since you have already consumed it (with xmlsr.next()), that method throws an error.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow