HTML Entity Encoding vs URL Encoding: Key Differences
HTML entity encoding converts characters like <, >, &, and " to HTML entities (&lt;, &gt;, &amp;, &quot;) for safe display in web pages. URL encoding converts characters to %XX format for safe transmission in URLs. They protect against different types of attacks and are used in different contexts.
What is HTML Entity Encoding?
HTML entity encoding (also called HTML escaping) converts characters that have special meaning in HTML into entity references that browsers display as the literal character rather than interpreting as markup. The most important characters to encode are <, >, &, ", and '.
Without HTML encoding, a string like <script>alert('XSS')</script> would be executed as JavaScript by the browser. With encoding, it becomes &lt;script&gt;alert('XSS')&lt;/script&gt; and is displayed as harmless text.
HTML entities can be written in two forms: named entities like &amp;, &lt;, and &quot;, or numeric entities like &#38;, &#60;, and &#34;. Numeric entities use the Unicode code point of the character, written in decimal (&#38;) or hexadecimal (&#x26;), and can represent any Unicode character.
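The encoding described above can be sketched as a small function. This is a minimal illustration, not a replacement for a vetted library: the names `HTML_ENTITIES`, `escapeHtml`, and `toNumericEntity` are hypothetical, and the choice of `&#39;` for the apostrophe is an assumption.

```javascript
// Map each HTML-significant character to an entity reference.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: replace every special character in a single pass,
// so the "&" inside an already-produced "&lt;" is never encoded twice.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Hypothetical helper: decimal numeric entity for the first character of a string.
function toNumericEntity(ch) {
  return '&#' + ch.codePointAt(0) + ';';
}

console.log(escapeHtml("<script>alert('XSS')</script>"));
// &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;
console.log(toNumericEntity('&')); // &#38;
console.log(toNumericEntity('<')); // &#60;
```

Doing the replacement in one pass matters: chaining separate replacements with `&` handled last would turn `&lt;` into `&amp;lt;`.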
<!-- Before HTML encoding -->
<p>The formula is: x < y & y > z</p>
<!-- Invalid markup: a raw < can be parsed as the start of a tag, and a raw & as the start of an entity -->
<!-- After HTML encoding -->
<p>The formula is: x &lt; y &amp; y &gt; z</p>
<!-- Browser displays: x < y & y > z -->
<!-- Common HTML entities -->
& → &amp; (ampersand)
< → &lt; (less than)
> → &gt; (greater than)
" → &quot; (double quote)
' → &#39; (single quote / apostrophe)
U+00A0 → &nbsp; (non-breaking space)
What is URL Encoding?
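URL encoding (percent-encoding) converts characters that are not safe in URLs into a percent sign followed by two hexadecimal digits representing the byte value. In practice, a non-ASCII character is first encoded as UTF-8 and each resulting byte is percent-encoded, so one character can become several %XX sequences. Percent-encoding is defined by RFC 3986 and ensures that URLs only contain valid ASCII characters that can be safely transmitted over the internet.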
URL encoding is necessary because URLs have a strict syntax where certain characters serve as delimiters (/, ?, &, #, =). When these characters appear in data (like query parameter values), they must be encoded to prevent them from being interpreted as structural elements.
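To make the byte-to-%XX mapping concrete, here is a rough sketch. The helper `percentEncodeAllBytes` is hypothetical and deliberately encodes every byte, which real encoders do not do; it is followed by the built-in `encodeURIComponent` for comparison.

```javascript
// Hypothetical helper: percent-encode every byte of a string's UTF-8 form.
// Real encoders leave unreserved characters (letters, digits, - . _ ~) alone;
// this version encodes everything so the mapping is easy to see.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  return Array.from(
    bytes,
    (b) => '%' + b.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(percentEncodeAllBytes('&'));      // %26
console.log(percentEncodeAllBytes('\u00e9')); // %C3%A9 (é is two bytes in UTF-8)

// The built-in function follows the same byte rule but skips safe characters.
console.log(encodeURIComponent('salt & pepper')); // salt%20%26%20pepper
console.log(encodeURIComponent('a/b?c=d#e'));     // a%2Fb%3Fc%3Dd%23e
```

Because `/`, `?`, `=`, and `#` are encoded, a value containing them can no longer be mistaken for URL structure.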
// URL encoding examples
space → %20
& → %26
= → %3D
? → %3F
# → %23
/ → %2F
+ → %2B
// Full URL with encoded query parameter
https://example.com/search?q=salt%20%26%20pepper
// The value of q is "salt & pepper"
Comparison Table
| Feature | HTML Entity Encoding | URL Encoding |
|---|---|---|
| Purpose | Safe display in HTML pages | Safe transmission in URLs |
| Standard | HTML/WHATWG specification | RFC 3986 |
| Format | &name; or &#number; | %XX (hex byte value) |
| Space becomes | &nbsp; (non-breaking) or left as-is | %20 (or + in form-encoded query strings) |
| & becomes | &amp; | %26 |
| < becomes | &lt; | %3C |
| Prevents | XSS attacks, broken HTML | Broken URLs, injection attacks |
| Context | HTML document body and attributes | URL paths, query strings, fragments |
| JS function | No built-in (use a library or DOM APIs) | encodeURIComponent() |
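The "Space becomes" row is the one that most often causes confusion, so a short sketch may help. It uses only the standard `encodeURIComponent` and `URLSearchParams` APIs; the sample values are made up for illustration.

```javascript
const value = 'salt & pepper';

// encodeURIComponent always writes a space as %20.
console.log(encodeURIComponent(value)); // salt%20%26%20pepper

// URLSearchParams serializes in form-encoded style, where a space becomes +.
const params = new URLSearchParams({ q: value });
console.log(params.toString()); // q=salt+%26+pepper

// A literal plus sign is therefore encoded as %2B in both styles.
console.log(encodeURIComponent('1+1'));                    // 1%2B1
console.log(new URLSearchParams({ q: '1+1' }).toString()); // q=1%2B1

// Parsing a query string turns + back into a space.
console.log(new URLSearchParams('q=salt+%26+pepper').get('q')); // salt & pepper
```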
When Do You Need Both?
There are common situations where you need both HTML encoding and URL encoding. The most frequent case is placing a URL inside an HTML attribute, such as an href or src attribute. The URL must first be properly URL-encoded, and then any special HTML characters in the result must be HTML-encoded.
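The two-step order can be sketched in code as follows. This assumes the markup is being built as a plain string; `escapeHtml` and `buildSearchLink` are hypothetical helpers, and the `sort` parameter is only there to show a delimiter `&` in the URL.

```javascript
// Hypothetical helper: HTML-encode text for element content or a quoted attribute.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical helper: build an <a> element as a string for a search link.
function buildSearchLink(query, sort) {
  // Step 1: URL-encode each dynamic value, then assemble the URL.
  const url = 'https://example.com/search?q=' + encodeURIComponent(query) +
    '&sort=' + encodeURIComponent(sort);
  // Step 2: HTML-encode the finished URL for the attribute, and the label for the body.
  return '<a href="' + escapeHtml(url) + '">Search for ' + escapeHtml(query) + '</a>';
}

console.log(buildSearchLink('Tom & Jerry', 'name'));
// <a href="https://example.com/search?q=Tom%20%26%20Jerry&amp;sort=name">Search for Tom &amp; Jerry</a>
```

The `&` inside the query value ends up as %26 (URL encoding), while the `&` that separates parameters ends up as `&amp;` (HTML encoding). Reversing the order would percent-encode the entity text itself.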
<!-- Step 1: URL-encode the query parameter value -->
<!-- Value: "Tom & Jerry" → URL encoded: "Tom%20%26%20Jerry" -->
<!-- Step 2: Build the URL -->
<!-- URL: https://example.com/search?q=Tom%20%26%20Jerry -->
<!-- Step 3: HTML-encode the URL for use in an href attribute -->
<a href="https://example.com/search?q=Tom%20%26%20Jerry">
Search for Tom &amp; Jerry
</a>
<!-- In this case, the % sequences don't need HTML encoding -->
<!-- because % is not a special HTML character -->
<!-- But if the URL contains & as a delimiter, it MUST be HTML-encoded: -->
<a href="https://example.com/search?q=cats&amp;sort=name">
Search cats sorted by name
</a>
<!-- Without &amp;, the browser might interpret &sort as an HTML entity -->
In modern frameworks like React, Vue, and Angular, HTML encoding is handled automatically when you use template expressions or JSX. However, URL encoding still needs to be done explicitly when building URLs from dynamic data.
// React - HTML encoding is automatic, URL encoding is not
function SearchLink({ query }) {
  // URL encoding must be explicit
  const url = '/search?q=' + encodeURIComponent(query);
  // HTML encoding is automatic in JSX:
  // React HTML-encodes both the href and the text content
  return <a href={url}>Search for {query}</a>;
}
XSS Prevention: Which Encoding to Use Where
Cross-Site Scripting (XSS) prevention requires applying the correct encoding based on the context where user input is placed. Using the wrong type of encoding in a given context provides no protection.
- HTML body context: Use HTML entity encoding. This prevents injected tags like <script> from being interpreted as markup.
- HTML attribute context: Use HTML entity encoding and always quote attribute values. Unquoted attributes can be broken out of even with HTML encoding.
- URL context (href, src): Use URL encoding for dynamic parts, then HTML-encode the entire URL if placing it in HTML. Validate that the URL starts with a safe scheme (https:) to prevent javascript: URLs.
- JavaScript context: Use JavaScript string escaping (JSON.stringify or a dedicated library). Neither HTML encoding nor URL encoding is sufficient in a JavaScript context.
- CSS context: Use CSS escaping. HTML and URL encoding do not protect against CSS injection.
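The scheme check mentioned for the URL context could look roughly like this. It is a sketch under assumptions: `SAFE_SCHEMES` and `isSafeUrl` are hypothetical names, the allow-list contains only https: as in the list above, and relative URLs are rejected for simplicity.

```javascript
// Hypothetical allow-list; extend it to the schemes your application needs.
const SAFE_SCHEMES = new Set(['https:']);

// Hypothetical helper: accept only absolute URLs whose scheme is on the allow-list.
function isSafeUrl(candidate) {
  try {
    // The URL parser lowercases the scheme, so mixed-case tricks are caught.
    return SAFE_SCHEMES.has(new URL(candidate).protocol);
  } catch {
    return false; // not parseable as an absolute URL
  }
}

console.log(isSafeUrl('https://example.com/profile')); // true
console.log(isSafeUrl('javascript:alert(1)'));         // false
console.log(isSafeUrl('JaVaScRiPt:alert(1)'));         // false
console.log(isSafeUrl('/relative/path'));              // false
```

Parsing the URL and comparing the scheme against an allow-list tends to be more robust than searching the string for "javascript:", since the parser handles case and other normalization.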
The key principle is: encode for the output context, not the input source. A single piece of user input might need different encoding depending on where it appears in the response. Never rely on input validation alone for XSS prevention; always apply output encoding.
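As an illustration of that principle, the sketch below encodes one input three different ways. The `encodeFor` object and `escapeHtml` helper are hypothetical, and the extra escaping of `<` in the script encoder is an assumption about embedding data in an inline script block, not something a bare JSON.stringify does.

```javascript
// Hypothetical helper: HTML-encode text for element content or a quoted attribute.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical per-context encoders; each one targets a single output context.
const encodeFor = {
  htmlBody: (s) => escapeHtml(s),
  urlComponent: (s) => encodeURIComponent(s),
  // JSON.stringify escapes quotes and backslashes; "<" is escaped as well so that
  // "</script>" inside the data cannot end an inline script block.
  scriptString: (s) => JSON.stringify(s).replace(/</g, '\\u003c'),
};

const input = '"><script>alert(1)</script>';

console.log(encodeFor.htmlBody(input));
// &quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;
console.log(encodeFor.urlComponent(input));
// %22%3E%3Cscript%3Ealert(1)%3C%2Fscript%3E
console.log(encodeFor.scriptString(input));
// "\">\u003cscript>alert(1)\u003c/script>"
```

The three outputs are not interchangeable: the URL-encoded form still needs HTML encoding if it goes into an attribute, and the HTML-encoded form gives no protection inside a script string.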