Question

I have a file that contains several XML documents in sequence, like this:

<?xml version="1.0"?><Node>...<Node>...</Node>...</Node><?xml version...

This pattern repeats several times.

I am using Java. I have a FileChannel open on the file and a byte buffer to read into. Is there a built-in, simpler, or already-solved way to parse XML bytes incrementally in Java? For example, something like this:

FooParser parser = new FooParser();

while (...)
{
    buffer.flip();
    parser.parse(buffer);
    buffer.compact();
    if (parser.done())
    {
        xmlDocs.add(parser.xml());
        parser.reset();
    }
    file.read(buffer);
    ...
}

Solution

There's nothing in the API that I know of that will parse multiple XML documents from a single stream. I think you will have to scan for the <?xml ... declarations yourself and split up the input. The parser won't know that it has reached the next XML document until it reads that declaration. At that point it will fail, and the start of the next document will already have been consumed.

Actually, now that you mention it, you may be able to use a pull parser to do what you want. But I'm pretty sure the SAX and DOM parsers in the API won't do it.
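
The scan-and-split approach suggested in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the whole input fits in memory as a String, every document starts with an XML declaration, and the text "<?xml " never occurs inside a comment or CDATA section. The class and method names (ConcatenatedXmlSplitter, split, parseAll) are hypothetical and not part of any library; only the JDK's DocumentBuilder is used for the actual parsing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of the JDK or any library.
class ConcatenatedXmlSplitter {
    // Trailing space so that "<?xml-stylesheet" is not mistaken for a declaration.
    private static final String DECL = "<?xml ";

    // Cuts the input at every "<?xml " marker; each piece should hold one document.
    // Assumption: the marker never appears inside a comment or CDATA section.
    static List<String> split(String input) {
        List<String> docs = new ArrayList<>();
        int start = input.indexOf(DECL);
        while (start >= 0) {
            int next = input.indexOf(DECL, start + DECL.length());
            docs.add(next < 0 ? input.substring(start) : input.substring(start, next));
            start = next;
        }
        return docs;
    }

    // Parses each piece independently with the JDK's DOM parser.
    static List<Document> parseAll(String input) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<Document> result = new ArrayList<>();
        for (String doc : split(input)) {
            result.add(builder.parse(new InputSource(new StringReader(doc))));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String input = "<?xml version=\"1.0\"?><Node><Node>a</Node></Node>"
                     + "<?xml version=\"1.0\"?><Node>b</Node>";
        for (Document d : parseAll(input)) {
            // Print the root element's text content of each parsed document.
            System.out.println(d.getDocumentElement().getTextContent());
        }
    }
}
```

Splitting first keeps each parse independent, so a malformed document only breaks its own piece rather than everything after it.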

Other tips

I had to do something like this, and I answered my own question here with a Reader subclass that wraps everything for simpler use.

It is common to check for the <? sequence at the start of an XML file, because an XML declaration, when present, must be the very first thing in the document (a BOM is not to be expected in the middle of the file). So I would take a look at the encoding and, as already suggested, split the file at every occurrence of <? followed by "xml".
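
Since the question reads raw bytes through a FileChannel, the same boundary search can be done directly on a ByteBuffer. The following is a minimal sketch, assuming an ASCII-compatible encoding such as UTF-8 or ISO-8859-1 (it would not work for UTF-16). XmlDeclScanner and declOffsets are hypothetical names introduced here for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes an ASCII-compatible encoding such as UTF-8.
class XmlDeclScanner {
    private static final byte[] MARKER = "<?xml ".getBytes(StandardCharsets.US_ASCII);

    // Returns the absolute index of every "<?xml " marker between position and limit.
    static List<Integer> declOffsets(ByteBuffer buf) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = buf.position(); i + MARKER.length <= buf.limit(); i++) {
            if (matchesAt(buf, i)) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    // Compares the marker against the buffer contents starting at the given index.
    private static boolean matchesAt(ByteBuffer buf, int index) {
        for (int j = 0; j < MARKER.length; j++) {
            if (buf.get(index + j) != MARKER[j]) { // absolute get, does not move position
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = "<?xml version=\"1.0\"?><a/><?xml version=\"1.0\"?><b/>"
                .getBytes(StandardCharsets.UTF_8);
        // Print the byte offsets at which each document starts.
        System.out.println(declOffsets(ByteBuffer.wrap(data)));
    }
}
```

In a read loop like the one in the question, a marker may straddle two reads, so the unscanned tail of the buffer (up to five bytes for this marker) would need to be kept, for example with compact(), before the next file.read(buffer).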

License: CC-BY-SA with attribution
Not affiliated with StackOverflow