Question

I allow customers to share XML files on a site; they can upload and download the files. I use PHP and SimpleXML to parse the files, but right now I do no validation other than:

$xml = simplexml_load_file('xml/' . $xml_file);
// Compare strictly: an empty but valid SimpleXMLElement also evaluates to false
if ($xml === false) {
    echo 'unable to load XML file';
    // run redirect script and explain there was a problem with the XML file
}

What I tried googling, but couldn't find a definitive answer to, is whether someone could upload a virus inside an XML file. It sounds far-fetched, but the general consensus seems to be that any file can carry a virus. I couldn't find any information on how a virus would be embedded in an XML file or on how I could check for malicious code.

I know exactly what format the XML file should have, i.e. which nodes it contains. Can I just create a list of acceptable nodes and reject any file that contains nodes not on that list?

Is the virus implanted within the nodes, something like

<node ="<?php delete * from table where id > 0; ?>" 

or something like that? Would that mean the virus is only activated when the file is opened in a web browser or an application?

The XML file is only used to store settings and table attributes that are parsed to display a table. It generally holds values and attributes such as the widths and colors of the table's cells. It will also be loaded by a desktop application written in C# that does much the same thing.

Is there a way to guard against injection or viruses in XML files, or is there a resource you can point me to where I can read up on this?


Solution

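There are a few generic attacks against systems processing XML, for example entity expansion (the "billion laughs" attack) and XML external entity (XXE) injection.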

Then there's the possibility that someone could target a specific XML processor. Simply look at the CVEs filed against libxml2, for example.

So generally, you should validate XML files from external sources. Most of the time, you'll want to disallow internal DTD subsets, especially entity declarations. The structure of XML files can be easily checked using DTDs, XML Schema, or RELAX NG.
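As a rough illustration of that advice in PHP, the sketch below rejects any upload that carries a DOCTYPE (and therefore any internal DTD subset or entity declaration) and then validates the structure against an XML Schema using the DOM extension. The function name isAcceptableSettingsXml, the constant SETTINGS_XSD, and the table/cell layout with width and color attributes are hypothetical, made up to resemble the settings file described in the question; the real schema would have to match your actual format.

```php
<?php
// Hypothetical schema: a <table> root holding <cell> elements with
// optional width/color attributes. Replace with your real format.
const SETTINGS_XSD = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="table">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="cell" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="width" type="xs:positiveInteger"/>
                <xs:attribute name="color" type="xs:string"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

function isAcceptableSettingsXml(string $xml): bool
{
    // Empty input is never valid (and loadXML() rejects it outright in PHP 8).
    if ($xml === '') {
        return false;
    }
    // Cheap pre-check so obvious DTD payloads are refused before parsing.
    // Deliberately conservative: it also refuses files that merely mention
    // the string, e.g. inside a comment.
    if (stripos($xml, '<!DOCTYPE') !== false) {
        return false;
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml, LIBXML_NONET)          // no network access while parsing
        && $doc->doctype === null                    // authoritative DOCTYPE check after parsing
        && $doc->schemaValidateSource(SETTINGS_XSD); // only whitelisted structure passes

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}
```

With this in place, an upload would only be stored if isAcceptableSettingsXml(file_get_contents($uploadedPath)) returns true, where $uploadedPath is again a placeholder. A schema acts as the whitelist of acceptable nodes asked about in the question: anything not declared in it fails validation. It does not protect against bugs in the parser itself, so keeping libxml2 up to date still matters.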

Licensed under: CC-BY-SA with attribution