Question

I now develop websites and XML interfaces since 7 years, and never, ever came in a situation, where it was really necessary to use the &gt; for a >. All disambiguition could so far be handled by quoting <, &, " and ' alone.

Has anyone ever been in a situation (related to, e.g., SGML processing, browser issues, XSLT, ...) where you found it indespensable to escape the greater-than sign with &gt;?

Update: I just checked with the XML spec, where it says, for example, about character data in section 2.4:

Character Data

[14]      CharData       ::=      [^<&]* - ([^<&]* ']]>' [^<&]*)

So even there, the > isn't mentioned as something special, except from the ending sequence of a CDATA section.

This one single case, where the > is of any significance, would be the ending of a CDATA section, ]]>, but then again, if you'd quote it, the quote (i.e., the literal string ]]&gt;) would land literally in the output (since it's CDATA).

Was it helpful?

Solution

You don't need to absolutely because almost any XML interpreter will understand what you mean. But still you use a special character without any protection if you do so.

XML is all about semantic, and this is not really semantic compliant.

About your update, you forgot this part :

The right angle bracket (>) may be represented using the string " > ", and must, for compatibility, be escaped using either " &gt; " or a character reference when it appears in the string " ]]> " in content, when that string is not marking the end of a CDATA section.

The use case given in the documentation is more about something like this :

<xmlmarkup>
]]>
</xmlmarkup>

Here the ]]> part could be a problem with old SGML parsers, so it must be escaped into = ]]&gt; for compatibilities reasons.

OTHER TIPS

I used one not 19 hours ago to pass a strict xml validator. Another case is when you use them actually in html/xml content text (rather than attributes), like this: <.

Sure, a lax parser will accept most anything you throw at it, but if you're ever worried about XSS, &lt; is your friend.

Update: Here's an example where you need to escape > in Firefox:

<?xml version="1.0" encoding="utf-8" ?>
<test>
    ]]>
</test>

Granted, it still isn't an example of having to escape a lone >.

Not so much as an author of (x)html documents, but more as a user of sloppy written comments fields in websites, that "offer" you to insert html.

I mean if you do your site the right way, you wouldn't hardcode your content anyway, right? So your call to htmlentities or whatever (long time no see, php) would take care of replacing special characters for you. So sure, you wouldn't manually type &gt; but I hope you take measures so > is automatically replaced.

I just thought of another example, where you need to quote > in HTML5 (not XHTML5) documents: If you need it in attributes without quotes (which is something, that can be argued of course).

<img src=arrow.png alt=&gt;>

should be equivalent to XHTML

<img src="arrow.png" alt=">" />

But then again, (?<!X)HTML is not SGML.

Imagine that you have the following text this is a not a ]]> nice day and you decide to surround it by CDATA sections <![CDATA[this is a not a ]]> nice day]]>.

In order to avoid that (and for allowing parsing of SGML fragments with unterminated marked sections), clause 10.4 of ISO 8879:1986 declares that the occurrence of ]]> outside a marked section is an error.

Also, in the times of SGML marked sections were very popular, as they were not only used for CDATA (as in XML), but also for RCDATA (only entities and character references allowed) and IGNORE and INCLUDE (which allowed for recognition of markup inside them).

For instance, in SGML one could write:

 <!ENTITY %WHATTODO "INCLUDE">
 <![%WHATTODO;[<b>]]&gt;</b>]]>

Which is equivalent to:

 <b>]]&gt;</b>
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top