DBpedia Lookup URIs coming with the word "Category"

Question 1

I asked the author of the code, Mark Watson, for some help and he answered me:

You can make this simple code change:

      // if ("URI".equals(lastElementName)) tempBinding.put("URI", s);
      if ("URI".equals(lastElementName) && s.indexOf("Category") == -1
          && tempBinding.get("URI") == null) {
        tempBinding.put("URI", s);
      }

That is, comment out one line and add the if block that follows it.

That's it!

Question 2

When you do a search for, e.g., "History of Berlin", you're requesting a URL like

http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=History%20of%20Berlin

and you're getting back an XML result like this:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfResult 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://lookup.dbpedia.org/">
    <Result>
        <Label>Museum für Naturkunde</Label>
        <URI>http://dbpedia.org/resource/Museum_für_Naturkunde</URI>
        <Description></Description>
        <Classes></Classes>
        <Categories></Categories>
        <Templates></Templates>
        <Redirects></Redirects>
        <Refcount>155</Refcount>
    </Result>
    <Result>
        <Label>History of Berlin</Label>
        <URI>http://dbpedia.org/resource/History_of_Berlin</URI>
        <Description>
            Berlin is the capital city of Germany. Berlin is a young city by European standards, founded in the 12th century.
        </Description>
        <Classes></Classes>
        <Categories>
            <Category>
                <Label>History of Berlin</Label>
                <URI>http://dbpedia.org/resource/Category:History_of_Berlin</URI>
            </Category>
            <Category>
                <Label>History of Germany by location</Label>
                <URI>http://dbpedia.org/resource/Category:History_of_Germany_by_location</URI>
            </Category>
        </Categories>
        <Templates></Templates>
        <Redirects></Redirects>
        <Refcount>14</Refcount>
    </Result>
</ArrayOfResult>

You're right that there are URI elements with category URIs, e.g.,

<URI>http://dbpedia.org/resource/Category:History_of_Berlin</URI>

but what you should note is that from the root of the document, there are

ArrayOfResult/Result/Categories/Category/URI

elements, whereas the elements that you want are

ArrayOfResult/Result/URI

elements. You just need to process your XML a bit differently: don't take the content of every URI element, only of the URI elements that are direct children of Result elements. I'm not all that familiar with SAX parsing, but I think the important point is that once you've entered a Result, you should only grab a URI while its immediate parent is that Result, that is, while you are not inside another child element of Result such as Categories.
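One way to sketch that idea with SAX is to keep a stack of the currently open elements and accept a URI only when its parent is Result. The class name ResultUriExtractor and its extract method below are hypothetical and not part of the original client code. The sketch also assumes the parser is left at its default, non-namespace-aware setting, so qName holds the plain element name.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not the handler from the original code.
class ResultUriExtractor extends DefaultHandler {
    // Names of the elements that are currently open, innermost first.
    private final Deque<String> openElements = new ArrayDeque<>();
    // Text collected since the most recent start tag.
    private final StringBuilder text = new StringBuilder();
    private final List<String> uris = new ArrayList<>();

    @Override
    public void startElement(String ns, String localName, String qName, Attributes atts) {
        openElements.push(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String ns, String localName, String qName) {
        openElements.pop();
        // After the pop, peek() is the parent of the element that just closed.
        // Result/URI is kept; Result/Categories/Category/URI is skipped.
        if ("URI".equals(qName) && "Result".equals(openElements.peek())) {
            uris.add(text.toString().trim());
        }
        text.setLength(0);
    }

    // Parses a lookup response held in a string and returns the Result-level URIs.
    static List<String> extract(String xml) {
        try {
            ResultUriExtractor handler = new ResultUriExtractor();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.uris;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse lookup response", e);
        }
    }
}
```

Calling ResultUriExtractor.extract(xml) on a response like the one above should return only the URIs that sit directly under Result. Because the check looks at the element's position rather than its text, a resource whose own URI happens to contain the word "Category" would still be kept.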

