The XML dump contains exactly what the library is offering you: the page text along with some basic metadata. It doesn't contain any structured metadata about categories or external links; those appear only as markup inside the wikitext itself.
The way I see it, you have three options:
- Use the specific SQL dumps for the data you need, e.g. categorylinks.sql for categories or externallinks.sql for external links. There is no dump for references, though, because MediaWiki doesn't track those.
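- Parse the wikitext from the XML dump yourself. This has problems with templates: the dump stores them unexpanded, so any category, link, or reference that a template generates won't show up in the raw text.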
- Use your own instance of MediaWiki to render the wikitext into HTML and then parse that HTML. Because MediaWiki expands templates while rendering, this could handle templates too, provided the template pages are imported as well.
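
To give a feel for the second option, here is a minimal sketch in Python that pulls categories, external links, and `<ref>` contents out of a page's raw wikitext with regular expressions. The function name `extract_from_wikitext` and the patterns are my own illustration, not part of any library. It assumes English-language `Category:` prefixes and simple `<ref>` tags, and, as noted above, it sees nothing that a template would generate.

```python
import re

# [[Category:Name]] or [[Category:Name|sort key]]; captures only the name.
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)

# Bare or bracketed http(s) URLs; stops at whitespace or wikitext delimiters.
EXTERNAL_LINK_RE = re.compile(r'https?://[^\s\]<>|}"]+')

# <ref>...</ref> and <ref name="x">...</ref>.
# Self-closing tags such as <ref name="x" /> are deliberately not matched.
REF_RE = re.compile(
    r"<ref(?:\s[^>/]*)?>(.*?)</ref>",
    re.IGNORECASE | re.DOTALL,
)


def extract_from_wikitext(wikitext):
    """Return categories, external links and reference bodies found in wikitext.

    Hypothetical helper: it only inspects the literal text, so anything
    produced by template expansion is missed.
    """
    return {
        "categories": [name.strip() for name in CATEGORY_RE.findall(wikitext)],
        "external_links": EXTERNAL_LINK_RE.findall(wikitext),
        "references": [body.strip() for body in REF_RE.findall(wikitext)],
    }


if __name__ == "__main__":
    demo = "Some text.<ref>A source.</ref>\n[[Category:Examples]]\n"
    print(extract_from_wikitext(demo))
```

You would call this on the text of each page as you stream through the dump. For anything beyond a quick experiment, a real wikitext parser is likely to be more robust than regular expressions, but the template limitation stays the same.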