Difference between Url Encode and HTML encode

https://stackoverflow.com/questions/1812473

06-07-2019

Question

What’s the difference between URL encoding and HTML encoding?

Solution

HTML encoding escapes characters that have a special meaning in HTML so that they are not mistaken for markup. For example, it changes

"<hello>world</hello>"

to

"&lt;hello&gt;world&lt;/hello&gt;"
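As a rough illustration, the same conversion can be reproduced with Python's standard library. Python is only an assumption here (the question does not name a language), and the variable names are made up for the example:

```python
import html

# The raw text contains characters that HTML would treat as tag delimiters.
html_raw = "<hello>world</hello>"

# html.escape replaces &, < and > (and, by default, quotes) with entities.
html_encoded = html.escape(html_raw)
print(html_encoded)

# html.unescape reverses the conversion.
html_decoded = html.unescape(html_encoded)
print(html_decoded)
```

A browser given the encoded string displays the original text instead of trying to parse a hello element.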

URL encoding does a similar thing for string values in a URL. For example, it changes

"hello+world = hello world"

to

"hello%2Bworld+%3D+hello+world"
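A minimal sketch of the same URL example in Python, assuming the form-style encoding in which spaces become plus signs (urllib.parse.quote_plus). The variable names are illustrative only:

```python
from urllib.parse import quote_plus, unquote_plus

url_raw = "hello+world = hello world"

# quote_plus percent-encodes "+" and "=" and turns each space into "+".
url_encoded = quote_plus(url_raw)
print(url_encoded)

# unquote_plus reverses the conversion.
url_decoded = unquote_plus(url_encoded)
print(url_decoded)
```

The literal plus sign has to become %2B precisely because a bare + in the encoded form already stands for a space.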

OTHER TIPS

UrlEncode replaces special characters with escape sequences that web browsers and web servers can interpret unambiguously when addressing a resource, hence "URL". For instance, a space becomes %20 and an apostrophe (') becomes %27.


HtmlEncode replaces special characters with character strings (entities) that the HTML engine recognises when it renders the page: & becomes &amp;, < becomes &lt; and > becomes &gt;. This prevents the HTML engine from interpreting these characters as part of the markup, so they are rendered as literal text.

See this reference:

http://msdn.microsoft.com/en-us/library/ms525347.aspx

Both HTML and URLs are essentially very constrained languages. Each one gives special meaning to specific keywords or operators, and in both of them these keywords are almost always single characters. For example:

HTML: > and <
URL: / and :

When using either language, though, you may need one of these characters to appear without its special meaning. For instance, this post contains a > character. I do not want it to be interpreted as HTML, just as text.

This is where the Encode and Decode methods come into play. An Encode method takes a string and converts any characters that would otherwise be treated as keywords into an escaped form that is not interpreted as part of the language. The matching Decode method reverses that conversion.

For instance, passing > into HtmlEncode will return &gt;

HTMLEncode and URLEncode deal with characters that are invalid in HTML and URLs, or more accurately, characters that need to be written in a special way to be interpreted correctly. For example, in HTML the < and > characters are used to indicate tags. Thus, if you wanted to write a math formula, something like 1+1 < 2+2, the '<' would normally be interpreted as the beginning of a tag. HTML encoding turns this character into "&lt;", which is the encoded representation of the less-than sign. URL encoding does the same for URLs, where the set of special characters is different, although there is some overlap.
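To make the difference concrete, here is a small sketch that runs the formula from the paragraph above through both kinds of encoding. It uses Python's html and urllib.parse modules as stand-ins for HTMLEncode and URLEncode, which is an assumption, since other platforms may escape a slightly different set of characters:

```python
import html
from urllib.parse import quote_plus, unquote_plus

formula = "1+1 < 2+2"

# HTML encoding only touches characters that matter to the HTML parser.
formula_as_html = html.escape(formula)    # "+" and spaces are left alone
# URL encoding touches a different set: "+", "<" and spaces all change.
formula_as_url = quote_plus(formula)

print(formula_as_html)
print(formula_as_url)

# "&" is one of the overlapping characters: special in both languages,
# but escaped differently in each.
amp_as_html = html.escape("a&b")
amp_as_url = quote_plus("a&b")
print(amp_as_html, amp_as_url)

# Decoding with the matching function recovers the original text.
assert html.unescape(formula_as_html) == formula
assert unquote_plus(formula_as_url) == formula
```

The two encodings are not interchangeable: a value placed in a URL that is itself written into an HTML attribute generally needs URL encoding first and HTML encoding second.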

I don't know what language you are working in, but the PHP manual for example provides good explanations.

URLEncode

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the RFC 1738 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
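The distinction the manual draws between urlencode() and rawurlencode() has a close analogue in Python, sketched below. Treating quote_plus and quote as counterparts of the two PHP functions is an assumption made for illustration. Python also leaves ~ unescaped, so the output is not guaranteed to match PHP's byte for byte:

```python
from urllib.parse import quote, quote_plus

text = "hello world & more"

# Form style (application/x-www-form-urlencoded): spaces become "+".
form_style = quote_plus(text)

# Percent style: spaces become %20. safe="" also escapes "/",
# which quote() would otherwise leave untouched.
percent_style = quote(text, safe="")

print(form_style)
print(percent_style)

# Alphanumerics and the characters -_. pass through unchanged.
unreserved = quote_plus("a-b_c.d")
print(unreserved)
```

The form style is what browsers produce for form submissions, while the percent style is generally the safer choice for path segments, where a literal + is not read as a space.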


Licensed under: CC-BY-SA with attribution
