URL Encoding

URL encoding, also known as percent-encoding, is a method used to encode special characters in Uniform Resource Identifiers (URIs) so they can be safely transmitted over the internet. Characters that are not allowed in URLs (such as spaces, reserved symbols used as data, or non-ASCII characters) are converted into a standard format: each byte of the character is written as a percent sign (%) followed by two hexadecimal digits giving that byte's value.

Why URL Encoding Is Important

Prevents parsing errors: Characters like ?, &, #, and / have special meanings in URLs, and spaces are not allowed at all. Encoding them when they appear as data avoids misinterpretation by browsers and servers.
Supports non-ASCII characters: URLs are restricted to ASCII, so non-ASCII characters (e.g., €, ©, ш) must first be converted to UTF-8 bytes, and each byte is then percent-encoded.
Enables data transmission: When sending data via query strings (e.g., in web forms), special characters must be encoded to ensure integrity and correct interpretation.
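
The parsing problem described above can be sketched with Python's standard library. The parameter name `title` and the sample value are made up for illustration; the sketch only compares how a query string parses with and without encoding.

```python
from urllib.parse import parse_qs, quote

# Hypothetical form value that contains reserved characters.
value = "Tom & Jerry #1"

# Unencoded: the '&' is read as a parameter separator, so the value is cut short.
broken = "title=" + value
# Encoded: every reserved character becomes %XX, so the value stays in one piece.
safe = "title=" + quote(value, safe="")

broken_result = parse_qs(broken)
safe_result = parse_qs(safe)

print(safe)           # title=Tom%20%26%20Jerry%20%231
print(broken_result)  # {'title': ['Tom ']}
print(safe_result)    # {'title': ['Tom & Jerry #1']}
```

In a full URL the unencoded `#` would also be treated as the start of the fragment, so everything after it would never reach the server.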

Common Encoding Examples

Character	Purpose in URL	Encoded Form
Space	Not allowed unencoded	`%20` or `+` (in query strings)
`&`	Separates query parameters	`%26`
`?`	Starts query string	`%3F`
`#`	Indicates fragment/anchor	`%23`
`@`	Separates user/password from domain	`%40`
`+`	Represents space in query strings	`%2B`
`%`	Indicates encoding	`%25`

How It Works

Encoding: Replaces each unsafe or reserved character with one %XX sequence per byte of its UTF-8 encoding, where XX is that byte's value in hexadecimal. A multi-byte character such as € therefore becomes several sequences (%E2%82%AC).
Decoding: Reverses the process to restore the original character.
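
As a rough illustration of the encoding step, the sketch below encodes a string by hand and decodes it with the standard library. The helper name `percent_encode` is hypothetical, and the choice to encode everything outside the RFC 3986 unreserved set is an assumption made for simplicity; real encoders let some reserved characters through depending on context.

```python
from urllib.parse import quote, unquote

# RFC 3986 "unreserved" characters, which never need encoding.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

def percent_encode(text: str) -> str:
    """Illustrative helper: encode every character outside the unreserved set."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # One %XX group per UTF-8 byte; the euro sign is three bytes long.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

encoded = percent_encode("price=5 \u20ac")
decoded = unquote(encoded)

print(encoded)  # price%3D5%20%E2%82%AC
print(decoded == "price=5 \u20ac")  # True
```

For this input the hand-written helper produces the same string as `quote("price=5 \u20ac", safe="")`.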

Tools & Functions

Online Tools: Use free tools like meyerweb.com/eric/tools/dencoder/ or jam.dev/utilities/url-encoder to encode/decode URLs instantly.
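JavaScript: Use encodeURIComponent() for individual values (e.g., query parameters), or encodeURI() for entire URIs. encodeURI() leaves reserved characters such as ?, &, /, and # unencoded, so it is not suitable for individual values.
Python: Use urllib.parse.quote() for paths and general values (spaces become %20), or urllib.parse.quote_plus() for form-style query values (spaces become +).
PHP: Use rawurlencode(), which follows RFC 3986 and encodes spaces as %20. The older urlencode() encodes spaces as +.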
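
A short sketch of how the Python functions listed above differ. The sample string and the parameter names `q` and `lang` are invented for the example.

```python
from urllib.parse import quote, quote_plus, urlencode

text = "a b/c+d"

path_style = quote(text)            # '/' is treated as safe by default
strict = quote(text, safe="")       # encode '/' as well
form_style = quote_plus(text)       # space becomes '+', '/' and '+' are encoded
query = urlencode({"q": text, "lang": "en"})  # builds a whole query string

print(path_style)  # a%20b/c%2Bd
print(strict)      # a%20b%2Fc%2Bd
print(form_style)  # a+b%2Fc%2Bd
print(query)       # q=a+b%2Fc%2Bd&lang=en
```

Note that `quote()` keeps `/` unencoded unless `safe=""` is passed, which matters when the value is a single path segment rather than a whole path.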

⚠️ Note: + represents a space only in query strings that use the application/x-www-form-urlencoded format (the format HTML forms submit). %20 is valid in every part of a URL. Always use %20 in paths, where + is a literal plus sign, and whenever the decoder's handling of + is unknown.
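
The difference shows up on the decoding side as well. In the sketch below (the sample strings are made up), `unquote()` treats `+` as a literal plus sign, while `unquote_plus()` applies the form convention and turns it into a space.

```python
from urllib.parse import unquote, unquote_plus

# A form-style query value: '+' means space, '%2B' means a literal plus.
raw_query_value = "C%2B%2B+tips%20and+tricks"

as_generic = unquote(raw_query_value)    # '+' is kept as a plus sign
as_form = unquote_plus(raw_query_value)  # '+' is turned into a space

# A path segment: here '+' really is a plus sign.
raw_path = "/files/a+b.txt"
path_ok = unquote(raw_path)
path_wrong = unquote_plus(raw_path)

print(as_generic)  # C+++tips and+tricks
print(as_form)     # C++ tips and tricks
print(path_ok)     # /files/a+b.txt
print(path_wrong)  # /files/a b.txt
```

Picking the decoder that matches the URL component is what keeps a literal plus sign from silently becoming a space.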

Security Considerations

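Double encoding (encoding an already-encoded value again, so that / becomes %2F and then %252F) can bypass security filters that decode input only once while a later component decodes it a second time. It is exploited in attacks such as path traversal, XSS, and SQL injection.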
Examples include CVE-2001-0333 (IIS directory traversal) and CVE-2004-1939 (XSS via double-encoded slashes).
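
The bypass can be sketched as follows. Both `naive_filter` and `looks_double_encoded` are hypothetical helpers written for this illustration, not taken from any real product, and the detection check is only one possible heuristic: it would also flag a legitimately encoded literal such as %2541.

```python
from urllib.parse import quote, unquote

payload = "../etc/passwd"
once = quote(payload, safe="")   # '/' becomes %2F
twice = quote(once, safe="")     # '%' becomes %25, so %2F becomes %252F

def naive_filter(value: str) -> bool:
    """Hypothetical filter: decode once, then accept if no '../' is present."""
    return "../" not in unquote(value)

def looks_double_encoded(value: str) -> bool:
    """Heuristic: a second decode still changes the value."""
    first = unquote(value)
    return unquote(first) != first

passes_filter = naive_filter(twice)       # filter sees '..%2Fetc%2Fpasswd'
after_backend = unquote(unquote(twice))   # a second decode restores the payload

print(twice)          # ..%252Fetc%252Fpasswd
print(passes_filter)  # True
print(after_backend)  # ../etc/passwd
print(looks_double_encoded(twice), looks_double_encoded(once))  # True False
```

A common way to avoid the problem is to decode exactly once at a single, well-defined point and validate the result after that step, so no later component decodes the value again.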

Standards

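Percent-encoding is defined in RFC 3986 (URI). RFC 3987 extends URIs to Internationalized Resource Identifiers (IRIs), which may contain non-ASCII characters.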
UTF-8 is the standard character encoding used for URL encoding.
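
To see why agreeing on UTF-8 matters, the sketch below encodes the same word with two character encodings. The Latin-1 case is included only as a contrast; the sample word is arbitrary.

```python
from urllib.parse import quote, unquote

word = "caf\u00e9"  # 'café' with a precomposed e-acute

utf8_form = quote(word)                          # UTF-8 is the default
latin1_form = quote(word, encoding="latin-1")    # legacy single-byte encoding

# A UTF-8 decoder cannot interpret the Latin-1 byte and substitutes U+FFFD.
misread = unquote(latin1_form)
recovered = unquote(latin1_form, encoding="latin-1")

print(utf8_form)    # caf%C3%A9
print(latin1_form)  # caf%E9
print(misread == "caf\ufffd")  # True
print(recovered == word)       # True
```

The percent-encoded string carries no label saying which character encoding produced it, so sender and receiver have to assume the same one.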

In summary, URL encoding ensures that data is correctly interpreted across systems, enabling reliable communication over the web. Always use UTF-8 and %XX format for safe, standardized encoding.

