I recommend restructuring this code into at least two parts.
I would create a download function that takes a URL and downloads the bytes associated with that URL. It should open and close the connection itself, and simply return either the downloaded bytes or an error indication.
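As a rough illustration of that shape, here is a minimal sketch in Python (chosen only for brevity; the same structure applies in other languages). The name `download`, the `timeout` parameter, and the `(data, error)` return convention are my own assumptions, not anything from your code.

```python
import http.client
import urllib.request


def download(url, timeout=10.0):
    """Fetch the raw bytes behind url.

    Returns (data, None) on success or (None, error_message) on failure.
    The (data, error) pair is a hypothetical convention for this sketch.
    """
    try:
        # The with-block closes the connection even if read() fails.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read(), None
    except (OSError, ValueError, http.client.HTTPException) as exc:
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # a malformed URL raises ValueError.
        return None, str(exc)
```

The caller never sees a connection object, only bytes or an error, which is what makes the function reusable for both the XML and the images.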
I would call this download function to fetch your XML bytes, then feed those bytes directly into your parser rather than decoding them to a string first. If the data is properly constructed XML, its declaration states the encoding used (and UTF-8 is assumed when no encoding is declared), so you do not need to worry about that; the parser will cope.
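To show what handing the raw bytes to the parser looks like, here is a small sketch using Python's standard `xml.sax` module. The element name `image` and its `title` and `url` attributes are invented for the example; substitute whatever your feed actually contains.

```python
import xml.sax


class ImageCollector(xml.sax.ContentHandler):
    """Collects (title, url) pairs from hypothetical <image> elements."""

    def __init__(self):
        super().__init__()
        self.images = []

    def startElement(self, name, attrs):
        if name == "image":
            self.images.append((attrs.get("title"), attrs.get("url")))


def parse_images(xml_bytes):
    """Parse raw XML bytes; the parser reads the encoding from the declaration."""
    handler = ImageCollector()
    # Passing bytes (not str) lets the parser handle decoding itself.
    xml.sax.parseString(xml_bytes, handler)
    return handler.images
```

Note that no encoding is passed anywhere in `parse_images`. If the bytes are not well-formed XML, `xml.sax.parseString` raises `xml.sax.SAXParseException`, which the caller can treat as another error case.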
Once the XML is parsed, use the download function again to fetch the bytes associated with any images you want.
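The reuse step might look something like the sketch below. It assumes the download function is passed in as a parameter and returns a `(data, error)` pair; both the name `fetch_images` and that convention are hypothetical.

```python
def fetch_images(urls, download):
    """Download each image URL with the supplied download function.

    download is assumed to return (data, None) on success
    or (None, error_message) on failure.
    Returns two dicts: url -> bytes for successes, url -> error for failures.
    """
    images = {}
    failures = {}
    for url in urls:
        data, error = download(url)
        if error is None:
            images[url] = data
        else:
            failures[url] = error
    return images, failures
```

Keeping failures separate means one broken image link does not stop the rest from being downloaded.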
Regarding the SAX processing, have you reviewed this question: