Your code iterates over the direct children of the XML root element. Your XML document (I looked at the Bloomberg one) has this structure:
<urlset ...>
<url ...>
...
</url>
<url ...>
...
</url>
...
</urlset>
The output is therefore the list of url elements.
You haven't said what output you want. Most likely, though, you need to either walk the XML elements recursively or use XPath to extract specific parts of the document.
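For the recursive option, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree so it runs without extra packages; lxml elements offer the same iter() method. The SAMPLE document and the helper names local_name and walk are made up for illustration and are not taken from your code or from the real Bloomberg file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like a news sitemap; the real file will differ.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>http://example.com/a</loc>
    <news:news>
      <news:publication_date>2014-01-01</news:publication_date>
    </news:news>
  </url>
</urlset>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def walk(root):
    """Return (local tag name, stripped text) for root and every descendant."""
    return [(local_name(el.tag), (el.text or "").strip()) for el in root.iter()]


root = ET.fromstring(SAMPLE)
for name, text in walk(root):
    print(name, text)
```

iter() visits the element itself and then all descendants in document order, so nested elements such as publication_date show up, not just the top-level url elements.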
Example: to extract the publication_date fields:
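import lxml.etree
# Parse the sitemap file and get the root <urlset> element.
tree = lxml.etree.parse(self.xml_file)
root = tree.getroot()
# Match publication_date elements in the Google News sitemap namespace, at any depth.
for pd in root.xpath("//*[local-name()='publication_date' and namespace-uri()='http://www.google.com/schemas/sitemap-news/0.9']"):
    print(pd.text)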
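If the long local-name()/namespace-uri() predicate is hard to read, an alternative is to bind a prefix to the namespace and use it in the path. The sketch below is only an illustration: the prefix "news", the SAMPLE_NEWS document and the function publication_dates are assumptions of mine, not part of your code. It uses the standard library's xml.etree.ElementTree; lxml elements accept the same findall(path, namespaces=...) call.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map; the prefix is arbitrary, only the URI has to match the document.
NS = {"news": "http://www.google.com/schemas/sitemap-news/0.9"}

# Hypothetical sample with two url entries; the real sitemap will differ.
SAMPLE_NEWS = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <news:news>
      <news:publication_date>2014-01-01</news:publication_date>
    </news:news>
  </url>
  <url>
    <news:news>
      <news:publication_date>2014-01-02</news:publication_date>
    </news:news>
  </url>
</urlset>"""


def publication_dates(root):
    """Return the text of every news:publication_date element below root."""
    return [el.text for el in root.findall(".//news:publication_date", namespaces=NS)]


for date in publication_dates(ET.fromstring(SAMPLE_NEWS)):
    print(date)
```

The prefix in the path is resolved through the NS map, so it does not have to match the prefix used inside the XML file.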