Question

I have an XML document that I am trying to parse.

<Tests>
   <Test>
     <Blocks>
         <Block>
            <BlockId>2</BlockId>
            <Name>CCCC</Name>
            <Type>Action</Type>
            <TaskId>2</TaskId>
            <Send>
               <WId>284</WId>
               <BlockId>14</BlockId>
            </Send>
         </Block>
         <Block>
            <BlockId>10</BlockId>
            <Name>START VM4</Name>
            <Type>Action</Type>
            <TaskId>10</TaskId>
            <Send />
         </Block>
         <Block>
            <BlockId>12</BlockId>
            <Name>SHUT</Name>
            <Type>Action</Type>
            <TaskId>12</TaskId>
            <Send />
         </Block>
     </Blocks>
   </Test>
</Tests>

I am using SAX to parse this. Everything works fine, but every time I loop through, I should get a block with id 2, then another block with id 10, and then one with id 12. I then add all of these blocks to the test.

Portion of my code is:

public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    nqName = qName;
    tag_name_List.setElementAt(nqName, level);
    level = level + 1;

}

public void endElement(String uri, String localName,
        String qName) throws SAXException {
    level = level - 1;
    tag_name_List.removeElementAt(level);
}

public void characters(char ch[], int start, int length) throws SAXException {

    if (level != 0) {
        ////////////////Some code
    } else if (level == 5
            && tag_name_List.elementAt(1).equals("Test") 
            && tag_name_List.elementAt(2).equals("Blocks") 
            && tag_name_List.elementAt(3).equals("Block") 
            && (nqName.equalsIgnoreCase("BlockId"))) {
        block = new Block();
        test.addBlock(block);
        block.setId(new String(ch, start, length));
        block.setWorkflowId(workflow.getId());

    } else if (level == 5 && ...) {  
        ////// Code continues

NB: This is a huge XML file and a lot of code, so I am only sharing part of it.

But the issue here is:

  • the first time I get id as 2,
  • then "\n "
  • then again id as 10
  • and then "\n "
  • then id 12
  • and then "\n ".

I am not sure why I am getting these "\n " values.

I can add an if condition to skip that entry, but if I do so I lose some information attached to that id, which later gets associated with the "\n " id.

Has anyone faced this and can give me a pointer?

Let me know if more information is needed.

After debugging the code, I found that it is basically taking the "\n " from the end of

<BlockId>14</BlockId>

since there is a "\r" and a "\n " before the next line starts.

How can I avoid this?


Solution

You assign nqName = qName in startElement(). Do you ever change that value before the next element starts?

If you don't change that value when you leave the context of the BlockId element, it will still be equal to BlockId when you are outside the element but not yet inside Name, for example. And the characters() method will read all the whitespace it finds there.
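To make the effect visible, here is a minimal sketch that mirrors the pattern in the question: the element name is stored in startElement() and never reset. The class name StaleNameDemo, the helper textAttributedToBlockId, and the shortened XML are made up for illustration and are not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StaleNameDemo {

    /** Concatenates all text delivered while the last-started element is "BlockId". */
    static String textAttributedToBlockId(String xml) throws Exception {
        StringBuilder seen = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // Set on every start tag and never cleared, as in the question.
            private String nqName;

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                nqName = qName;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Also fires for the whitespace between </BlockId> and <Name>.
                if ("BlockId".equals(nqName)) {
                    seen.append(ch, start, length);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return seen.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, shortened version of the XML from the question.
        String xml = "<Block>\n  <BlockId>2</BlockId>\n  <Name>CCCC</Name>\n</Block>";
        // Escape the newline so the stray whitespace is visible in the output.
        System.out.println("[" + textAttributedToBlockId(xml).replace("\n", "\\n") + "]");
    }
}
```

The collected text is the id followed by the line break and indentation that sit between the closing BlockId tag and the next start tag, which matches the "\n " entries described in the question.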

You should probably reset nqName in your endElement(). Try adding

nqName = null;

to your endElement() method.
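As a rough sketch of how that could look in a complete handler (not the asker's actual code): the current element name is cleared in endElement(), and the text is buffered until the end tag, since a SAX parser is allowed to deliver one element's text in several characters() calls. The names BlockIdDemo, BlockIdHandler and parseBlockIds are hypothetical, and the parent check stands in for the level and tag_name_List tests from the question.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BlockIdDemo {

    /** Collects the text of every BlockId whose direct parent is Block. */
    static class BlockIdHandler extends DefaultHandler {
        private final Deque<String> path = new ArrayDeque<>();  // open elements, innermost first
        private final StringBuilder text = new StringBuilder(); // text of the current element
        private String current;                                 // null when between elements
        final List<String> blockIds = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            path.push(qName);
            current = qName;
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Whitespace after an end tag arrives while current == null and is dropped.
            if (current != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            path.pop(); // after this, peek() is the parent of the element just closed
            if ("BlockId".equals(current) && "Block".equals(path.peek())) {
                blockIds.add(text.toString().trim());
            }
            current = null; // the reset suggested above
        }
    }

    static List<String> parseBlockIds(String xml) throws Exception {
        BlockIdHandler handler = new BlockIdHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.blockIds;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, shortened version of the XML from the question.
        String xml = "<Tests><Test><Blocks>\n"
                + "  <Block>\n    <BlockId>2</BlockId>\n"
                + "    <Send>\n      <BlockId>14</BlockId>\n    </Send>\n  </Block>\n"
                + "  <Block>\n    <BlockId>10</BlockId>\n    <Send />\n  </Block>\n"
                + "</Blocks></Test></Tests>";
        System.out.println(parseBlockIds(xml)); // prints [2, 10]
    }
}
```

The BlockId nested inside Send is skipped because its parent is not Block, and no whitespace-only ids are produced because text arriving between elements is ignored.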

Licensed under: CC-BY-SA with attribution