Is writing self closing tags for elements not traditionally empty bad practice?

https://stackoverflow.com/questions/348736

20-08-2019

Question

I have noticed that jQuery (or is it Firefox?) will turn some of my <span class="presentational"></span> into <span class="presentational" />.

Now my question is: is it okay to write my markup like this? Will any browsers choke on it?

Personally, I think it looks cleaner to do <span class="presentational" /> if it's going to be empty.

Solution

I'm assuming your question has to do with the red trailing slash on self-closing elements when you view source in Firefox. If so, you've stumbled into one of the most vehement, yet simultaneously passive-aggressive, debates in the browser maker vs. web developer wars. XHTML is NOT just about a document's markup. It's also about how documents are meant to be served over the web.

Before I begin: I'm trying hard not to take sides here.

The XHTML 1.1 spec says that a web server should serve XHTML with a Content-Type of application/xhtml+xml. Firefox is singling out those trailing slashes as invalid because your document is being served as text/html rather than application/xhtml+xml. Take these two examples; identical markup, one served as application/xhtml+xml, the other as text/html.

http://alanstorm.com/testbed/xhtml-as-html.php

http://alanstorm.com/testbed/xhtml-as-xhtml.php

Firefox flags the trailing slash in the meta tag as invalid for the document served with text/html, and valid for the document served with application/xhtml+xml.

Why this is Controversial

To a browser developer, the point of XHTML is that you can treat your document as XML, which means that if someone sends you something that's not well formed, the spec says you don't have to parse it. So, if a document is served as application/xhtml+xml and contains content that is not well formed, the developer is allowed to say "not my problem". You can see that in action here:

http://alanstorm.com/testbed/xhtml-not-valid.php
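The "not my problem" behaviour can be reproduced with any strict XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the two sample strings and the parse_as_xml helper are made up for illustration and are not taken from the test pages above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: the first string is well formed,
# the second leaves the <span> unclosed.
WELL_FORMED = '<p>An empty span<span class="presentational" /></p>'
NOT_WELL_FORMED = '<p>An unclosed span<span class="presentational"></p>'


def parse_as_xml(markup):
    """Return the root tag name, or None if the markup is not well formed."""
    try:
        return ET.fromstring(markup).tag
    except ET.ParseError:
        # A strict XML parser refuses the whole document on the first error.
        return None


print(parse_as_xml(WELL_FORMED))
print(parse_as_xml(NOT_WELL_FORMED))
```

The parser does not try to guess where the missing </span> belongs. It simply gives up, which is roughly what a browser does with a broken document served as application/xhtml+xml.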

When a document is served as text/html, Firefox treats it as a plain old HTML document and uses the forgiving, fix-it-for-you parsing routines:

http://alanstorm.com/testbed/xhtml-not-valid-as-html.php

So, to a browser maker, XHTML served as text/html is ludicrous, because it's never treated as XML by the browser's rendering engine.

A bunch of years ago, web developers looking to be more than tag monkeys (Disclaimer: I include myself as one of them) started looking for ways to develop best practices that didn't involve thrice nested tables, but still allowed a compelling design experience. They/We latched onto XHTML/CSS, because the W3C said this was the future, and the only other choice was a world where a single vendor (Microsoft) controlled the de facto markup spec. The real evil there being the single vendor, and not so much Microsoft. I swear.

So where's the controversy? There are two problems with application/xhtml+xml. The first is Internet Explorer. There's a legacy bug/feature in IE where content served as application/xhtml+xml will prompt the user to download the document. If you tried to visit the xhtml-as-xhtml.php listed above with IE, that's likely what happened. This means if you want to use application/xhtml+xml, you have to check the Accept header and only serve application/xhtml+xml to those browsers that accept it. This is not as trivial as it sounds to get right, and it also went against the "write once" principle that web developers were striving for.
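To give an idea of what checking the Accept header involves, here is a rough sketch in Python. The function names accepts_xhtml and choose_content_type are hypothetical, and the sample headers in the comments are only illustrative. A wildcard such as */* is deliberately not counted as support, on the assumption that a browser that cannot render XHTML may still send one.

```python
def accepts_xhtml(accept_header):
    """True if the header explicitly lists application/xhtml+xml with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value.strip())
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not accepted"
        return quality > 0
    return False


def choose_content_type(accept_header):
    """Pick the Content-Type to serve, falling back to text/html."""
    if accepts_xhtml(accept_header):
        return "application/xhtml+xml"
    return "text/html"


# A header that lists the XHTML media type explicitly:
print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# A header that only offers image types and a wildcard:
print(choose_content_type("image/gif, image/jpeg, */*"))
```

Even this simplified version has to deal with q values and malformed input, which hints at why getting it right is less trivial than it sounds. A real deployment would also need to send a Vary: Accept response header so that caches do not hand the wrong variant to the wrong browser.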

The second problem is the harshness of XML. This is, again, one of those flame-prone issues, but there are some people who think a single bad tag, or a single improperly encoded character, shouldn't result in a user not seeing the document they want to see. In other words, yes, the spec says you should stop processing XML if it's not well formed, but the user doesn't care about the spec; they care that their cat's website is broken.

Adding even more gasoline to the issue is that the XHTML 1.0 (not 1.1) spec says that XHTML documents may be served as text/html, assuming certain compatibility guidelines are followed. Things like the img tag being self-closing and the like. The key word here is may. In RFC speak, may means optional. Firefox has chosen NOT to treat documents served with an XHTML doctype but a content type of text/html as XHTML. However, the W3C validator will happily report these documents as valid.

I'll leave the reader to ponder the simultaneous wonder/horror of a culture that writes a document to define what they mean by the word may.

Moving Forward

Finally, this is what the whole HTML 5 thing is about. XHTML became such a political hot potato that a bunch of people who wanted to move the language forward decided to go in another direction. They produced a spec for HTML 5. This is currently being hashed out in the W3C, and expected to finish sometime in the next decade. In the meantime, browser vendors are picking and choosing features from the in-progress spec and implementing them.

Updates from the Comments

In the comments, Alex points out that if you're going to sniff for something, you should check the Accept header to see if application/xhtml+xml is accepted by the user agent.

This is absolutely correct. In general, if you're going to sniff, sniff for the feature, not for the browser.

OTHER TIPS

An addition to the other answers: in IE, having elements such as <span /> in your mark-up will cause all kinds of problems with DOM traversal methods in JavaScript. Have a look at the following XHTML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>Test</title>
    <script type="text/javascript">
        function show() {
            var span = document.getElementById("span");
            alert(span.innerHTML);
        }
    </script>
</head>
<body onload="show();">
<p id="p1">Paragraph containing some text followed by
           an empty span<span id="span"/></p>
<p id="p2">Second paragraph just containing text</p>
</body>
</html>

The idea is that when the page loads, the JavaScript will get a reference to the empty span and display its HTML contents. That will be an empty string, right? Not in IE it won't. In IE, you get all the content after the span in the whole document:

</P>
<P id=p2>Second paragraph just containing text</P>

Also, the second <p> shows up in the span's childNodes collection. That same <p> is also in the body's childNodes collection, meaning a node can effectively have multiple parents. This isn't terribly good news for scripts that rely on traversing the DOM.

I have also blogged about this.

Yes. It is. It'll cause problems in certain cases for old browsers.

<script type='text/javascript' src='script.js' />

In this case, the old browser might not understand that the <script> tag has ended.

Served as application/xhtml+xml, <span /> means create a span element with no content.

Served as text/html, <span /> means create a span element where the contents of the element follow this tag until the </span> tag is encountered, or another tag (or EOF) that implicitly closes the element is encountered. i.e. in this case <span /> means the same as <span>.
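The difference between the two readings can be sketched in Python. The XML side below uses the standard xml.etree.ElementTree parser. The text/html side is only a toy model built on html.parser that ignores the trailing slash on non-void elements; it is not a real browser parser, and the TagSoupModel class, the helper functions and the sample markup are all invented for this illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: some text before and after a "self-closed" span.
MARKUP = '<p>before<span id="s" />after</p>'


class TagSoupModel(HTMLParser):
    """Toy model of text/html parsing: '/>' does not close a non-void element."""

    VOID = {"br", "hr", "img", "input", "link", "meta"}  # partial list

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.span_text = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Ignore the slash and treat <span /> exactly like <span>.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # An end tag implicitly closes everything opened inside that element.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if "span" in self.open_tags:
            self.span_text.append(data)


def span_text_as_xml(markup):
    """Text inside the span when the markup is parsed as XML."""
    span = ET.fromstring(markup).find("span")
    return "".join(span.itertext())


def span_text_as_html(markup):
    """Text inside the span under the toy text/html model."""
    model = TagSoupModel()
    model.feed(markup)
    model.close()
    return "".join(model.span_text)


print(repr(span_text_as_xml(MARKUP)))   # the span is empty
print(repr(span_text_as_html(MARKUP)))  # the span swallowed the text after it
```

Under the XML reading the span has no content, while under the toy HTML reading the text that follows the tag ends up inside the span until the enclosing </p> closes it.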

Aside: HTML 5 defines both HTML and XHTML serializations, so it doesn't affect this issue one way or another. It does require, like XHTML 1.1, that XHTML be served as application/xhtml+xml, unlike XHTML 1.0. In effect though, this changes nothing as all browsers treat any version of XHTML served as text/html as tag soup.

Also worth noting is that an <?xml ...?> declaration before the doctype throws IE into quirks mode.

See the note on the subject from the XHTML Working Group: http://www.w3.org/TR/xhtml-media-types/

In short: it is fine if your XHTML is going to be treated as XHTML. If you are going to pretend it is HTML (which you need to do if you want it to be loaded by Internet Explorer, including version 8, the latest at the time of writing), then you have to jump through hoops.

The hoops are sufficiently annoying that I would recommend most people stick to HTML 4.01.

Generally it's not a problem to use shorthand for empty elements, but there are some exceptions where it can cause problems.

<script> is an important one that needs to be closed with </script> to avoid issues.

Another is <meta>, which works much better with spiders when written as <meta></meta> instead of <meta />.

Not exactly the question, but related: in terms of formatting, versions of IE have problems with completely empty elements such as <div></div> or <div />. In this case, <div> </div> is required to maintain the formatting.

It should be said explicitly that there are no self-closed tags in HTML, so whenever a browser decides to treat your XHTML as HTML, it will not recognize that the tag is closed. That is not a problem for tags that don't have to be closed in HTML, like <img>, but it is obviously bad for tags like <span>.

Licensed under: CC-BY-SA with attribution
