Prevent Extensible Markup Language External Entity attacks

Learning Objectives

After completing this unit, you’ll be able to:

  • Identify Extensible Markup Language (XML) External Entity (XXE) attacks.
  • List ways to prevent XXE attacks.

Thwart Extensible Markup Language External Entity Attacks

Extensible Markup Language (XML) is a markup language, much like Hypertext Markup Language (HTML), that stores and transports data. An XML parser is a software library or package that provides an interface for an application to work with XML documents. It checks that XML documents are properly formatted and may also validate them. A parser’s goal is to transform XML into a structure that application code can read.

An XXE attack, one of the major security risks in the Open Web Application Security Project (OWASP) Top 10, occurs when a weakly configured XML parser processes XML input containing a reference to an external entity. An attacker can use this flaw to extract data, make the server issue requests to other systems, scan internal systems, and more. Because XML remains widely used for exchanging data on the web, any application that accepts XML input is a potential target.

How XXE Attacks Work

The XML 1.0 standard includes an entity concept—essentially a storage unit. External entities can access local or remote content via a declared system identifier, usually a uniform resource identifier (URI) that can be followed by the XML processor. The XML processor then replaces occurrences of the named external entity with the contents dereferenced by the system identifier. If the system identifier contains tainted data and the XML processor dereferences it, the XML processor may disclose confidential information normally not accessible by the application. 
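The substitution described above can be sketched in Java. This is a minimal illustration, not a definitive reproduction: the class `XxeEntityDemo`, its method names, the entity name `xxe`, and the temporary file standing in for confidential data are all hypothetical, and the sketch assumes the JDK's bundled parser is left at defaults that still resolve external entities.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
class XxeEntityDemo {

    // Builds a document whose DTD declares an external entity named "xxe"
    // that points at the given system identifier (a URI).
    static String buildPayload(String systemId) {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE note [\n"
             + "  <!ENTITY xxe SYSTEM \"" + systemId + "\">\n"
             + "]>\n"
             + "<note>&xxe;</note>";
    }

    // Parses the XML with a factory left at its defaults (no hardening)
    // and returns the text content of the root element.
    static String parseWithDefaults(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Writes a stand-in "confidential" file, points the external entity at it,
    // parses the payload, and returns what the parser placed in <note>.
    static String runDemo() {
        try {
            Path secret = Files.createTempFile("xxe-demo", ".txt");
            try {
                Files.write(secret, "TOP-SECRET-TOKEN".getBytes(StandardCharsets.UTF_8));
                return parseWithDefaults(buildPayload(secret.toUri().toString()));
            } finally {
                Files.deleteIfExists(secret);
            }
        } catch (IOException e) {
            throw new IllegalStateException("could not prepare demo file", e);
        }
    }

    public static void main(String[] args) {
        // With an unhardened parser, the file's contents appear in place of &xxe;.
        System.out.println("Root element text: " + runDemo());
    }
}
```

The point of the sketch is that the application never opens the file itself. The parser dereferences the system identifier on the application's behalf, which is why tainted identifiers are dangerous.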

Because XXE attacks occur via the XML processing application, an attacker may use this trusted application to pivot to other internal systems. The attack could disclose internal content via Hypertext Transfer Protocol (HTTP) or HTTP Secure (HTTPS) requests, or even launch a Cross-Site Request Forgery (CSRF) attack. In some situations, an XML processor that has client-side memory corruption vulnerabilities may be exploited by dereferencing a malicious URI, which can allow arbitrary code execution under the application account. Other attacks can access local resources that may never stop returning data, possibly impacting application availability if too many threads or processes are tied up and never released.

The application does not need to explicitly return the response to the attacker for it to be vulnerable to information disclosure. An attacker can leverage the domain name system (DNS) to exfiltrate data by encoding sensitive information in subdomain names, so that the resulting DNS requests carry it to a DNS server that the attacker controls.

Is Your Application Vulnerable to XXE Attacks?

Your systems are vulnerable to XXE attacks if the following conditions are true.

  • An application parses XML documents.
  • Untrusted data is allowed within the system identifier portion of the entity, within the document type definition (DTD), which defines the structure and legal elements and attributes of an XML document.
  • An XML processor is configured to validate and process the DTD.
  • An XML processor is configured to resolve external entities within the DTD.
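One way to check the DTD-related conditions above is to probe a parser configuration with a harmless document that contains a DOCTYPE declaration. The sketch below is only an illustration: `DoctypeProbe`, `acceptsDoctype`, and `hardenedFactory` are hypothetical names, and the feature URI assumes a Xerces-based parser such as the one bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; names are illustrative only.
class DoctypeProbe {

    // Harmless probe: the DTD declares only an internal entity,
    // so nothing outside the document is ever fetched.
    private static final String PROBE =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE probe [<!ENTITY e \"x\">]>"
      + "<probe>&e;</probe>";

    // Returns true if parsers created by this factory process a DOCTYPE,
    // false if they reject the document outright.
    static boolean acceptsDoctype(DocumentBuilderFactory factory) {
        try {
            factory.newDocumentBuilder()
                   .parse(new InputSource(new StringReader(PROBE)));
            return true;
        } catch (SAXException e) {
            return false; // the parser refused the document
        } catch (Exception e) {
            throw new IllegalStateException("probe could not run", e);
        }
    }

    // A factory with DOCTYPE declarations disallowed (Xerces feature).
    static DocumentBuilderFactory hardenedFactory() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("feature not supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default factory accepts DOCTYPE:  "
            + acceptsDoctype(DocumentBuilderFactory.newInstance()));
        System.out.println("hardened factory accepts DOCTYPE: "
            + acceptsDoctype(hardenedFactory()));
    }
}
```

A probe like this can run as a unit test, so that a configuration change that re-enables DTD processing is caught early.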

Preventing XXE Attacks

The safest way to prevent XXE attacks is to disable DTDs completely, which also rules out external entity declarations. The exact method depends on the XML parser. For Java parsers built on Apache Xerces, it should be similar to the following.

factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

Disabling DTDs also secures the parser against denial of service (DoS) attacks that rely on entity expansion. If DTDs can’t be disabled completely, then external entities and external DTD declarations must be disabled in a way specific to each parser.
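For the case where DTDs must stay enabled, the following is a minimal sketch of a more permissive but still restricted configuration. It assumes the JDK's bundled Xerces-based `DocumentBuilderFactory`. The feature URIs are the SAX and Xerces ones commonly used for this purpose, while `RestrictedDtdParser` and its method names are hypothetical. Other parsers need their own equivalents.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; names are illustrative only.
class RestrictedDtdParser {

    // Keeps internal DTDs usable but stops the parser from
    // dereferencing anything outside the document.
    static DocumentBuilderFactory newRestrictedFactory() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            // Enable the JAXP secure-processing limits.
            f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Do not resolve external general entities (&name;).
            f.setFeature(
                "http://xml.org/sax/features/external-general-entities", false);
            // Do not resolve external parameter entities (%name;).
            f.setFeature(
                "http://xml.org/sax/features/external-parameter-entities", false);
            // Do not load an external DTD named in the DOCTYPE.
            f.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd",
                false);
            // XInclude is another route to external content; keep it off.
            f.setXIncludeAware(false);
            return f;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(
                "parser does not support a required feature", e);
        }
    }

    // Parses the XML with the restricted factory and returns the
    // text content of the root element.
    static String rootText(String xml) {
        try {
            Document doc = newRestrictedFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Internal entities are still expanded under this configuration.
        String xml = "<!DOCTYPE msg [<!ENTITY greeting \"hello\">]>"
                   + "<msg>&greeting;</msg>";
        System.out.println(rootText(xml));
    }
}
```

This variant is weaker than disallowing DOCTYPE declarations outright, so it is best treated as a fallback for applications that genuinely depend on internal DTDs.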

A properly secured server is safe from a hacker attempting an XXE attack.

Now that we’ve learned how to identify and prevent XXE attacks, let’s take a look at another common OWASP vulnerability, Broken Authentication and Session Management.
