Question

Background

I have recently taken up a new job as a developer at a company where I have inherited an in-house CMS built approximately 5 years ago. The CMS constructs pages in XML which are then output to the page as HTML. Some of the markup is generated within the back-end PHP, whereas some is exposed to the user in the administration layer, to enable editing. So far I have observed that none of the markup is semantic: 95% of it is plain text wrapped in tags, like so:

<div>
    This is a heading
</div>
<div>
    This is a subheading
</div>
<div>
    This is a paragraph.
    <div>
        <a href='#'>This is a button</a>
    </div>
</div>

(The markup has nested divs where required and utilises classes, but the above is intended as a demonstration that semantic tags are not in use.)
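
For comparison, here is a rough sketch of how the same fragment might look with semantic elements. The heading levels are assumptions, since the sample does not say where it sits in the page, and the inner wrapper is moved out of the paragraph because a p element cannot contain block-level children:

```html
<h1>This is a heading</h1>
<h2>This is a subheading</h2>
<p>This is a paragraph.</p>
<p>
    <a href="#">This is a button</a>
</p>
```

If the link triggers an action rather than navigating somewhere, a button element would arguably be the closer semantic match.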

I am the only developer responsible for front-end work at this company: the previous front-end developer has left. I was informed in the handover that semantic markup should never be used, and that I should maintain the markup style going forward. I was unable to elicit any technical explanations for this structural decision, and the other employees at the company do not know either.

For now I am going to maintain the system consistently, but in the meantime I am investigating why this structural choice might have been made. Originally the platform would have supported IE8, but in the present day we support IE9 and above. Experiments with the templating system show that the CMS can handle semantic HTML, and my testing demonstrates that the generated webpages function as expected. I can detect no problems caused by the introduction of HTML5 semantic elements in any modern browser (though I have not yet attempted to test in IE8).

Question

What are possible technical reasons for deliberately excluding the use of semantic markup from a web project?

NB. Some companies have atypical in-house rules, and working at those companies entails following the rules, regardless of opinion. My question is not whether I should maintain the status quo or change it, but what reasons might have affected the design choices in the first place.

Solution

Using semantic markup has a number of advantages:

  • We get sensible default styles for many things like emphasis or tables or links.
  • We get a number of user-interface elements with pre-built appearance and behaviour, like <form>, <a>, <input>, <select>, or <details>.
  • We get a base level of accessibility since assistive technologies can interpret the HTML tags.
  • Semantic markup may help search engines and other user agents to better understand your content (e.g. services like Pocket or Instapaper or the built-in readability view of some browsers try to extract relevant content for a better user experience).

But these might also be unnecessary, or may be drawbacks.

  • Default styles are inconsistent between browsers. You more or less have to reset them first before you can build upon them with custom styles.
  • Default styles don't help much if you aren't writing text. In fact, they tend to be a hindrance when doing layout. You pretty much end up with <div class="component"> regardless of your intentions.
  • It is difficult to style some built-in elements. If you're trying to build a consistent user experience, it might be worthwhile to develop your own user interface elements. E.g. you can't build a rich-text editor from a <textarea>, and you can't consistently style scrollbars.
  • You might not need accessibility (unlikely in many regulatory environments), or may prefer to implement the necessary accessibility features yourself through CSS, role, and aria-* attributes.
  • You might not need to be search-engine friendly. Also, there's some evidence Google is more interested in the styled appearance of your page than in your page source.
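
To illustrate the "implement accessibility yourself" option, here is a minimal sketch of a toggle control built from a div. The class name, label, and function name are hypothetical, and the keyboard handling is simplified compared with what a native button does:

```html
<!-- Hypothetical toggle control built from a div; class name and label are illustrative. -->
<div class="toggle" role="button" tabindex="0" aria-pressed="false">Bold</div>

<script>
  var toggle = document.querySelector('.toggle');

  // Flip the pressed state that assistive technologies announce.
  function flip() {
    var pressed = toggle.getAttribute('aria-pressed') === 'true';
    toggle.setAttribute('aria-pressed', pressed ? 'false' : 'true');
  }

  toggle.addEventListener('click', flip);
  toggle.addEventListener('keydown', function (event) {
    // A div has no built-in keyboard activation: 13 = Enter, 32 = Space.
    if (event.keyCode === 13 || event.keyCode === 32) {
      event.preventDefault(); // stop Space from scrolling the page
      flip();
    }
  });
</script>
```

The role, the tabindex, and both event handlers all have to be written and maintained by hand here, whereas a button element with aria-pressed would provide focus and keyboard behaviour by default.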

But that assumes the original developer considered these cases and made a conscious decision for or against them. This may not have been the case, and the actual cause may be that:

  • it was considered too difficult to educate users of the admin interface about available tags and their proper use (my “favourite” abuse: <br><br> as a paragraph separator because people don't understand <p>…).
  • there was some FUD (fear, uncertainty, and doubt) about browser support for some semantic tags, so it was considered safer to use none of them.
  • additional tags would have complicated the XML→HTML transformation too much.
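
On the browser-support point, there is one concrete issue that may have applied when IE8 was a target: IE8 and older do not apply CSS to elements they do not recognise unless the element name has first been registered through script. The sketch below shows the usual workaround under that assumption; the list of element names is illustrative, not exhaustive:

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>HTML5 elements in old IE</title>
  <!--[if lt IE 9]>
  <script>
    // IE8 and older only style an unknown element if its name has been
    // passed to document.createElement before the parser reaches it.
    var names = ['article', 'aside', 'footer', 'header', 'main', 'nav', 'section'];
    for (var i = 0; i < names.length; i++) {
      document.createElement(names[i]);
    }
  </script>
  <![endif]-->
  <style>
    /* Older browsers ship no default styles for these elements. */
    article, aside, footer, header, main, nav, section { display: block; }
  </style>
</head>
<body>
  <section>
    <h1>This is a heading</h1>
    <p>This is a paragraph.</p>
  </section>
</body>
</html>
```

Because the workaround depends on JavaScript being enabled, a team supporting IE8 could reasonably have decided that plain divs were the lower-risk choice.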

As the original developer has left, they have no say in future development, and you have no way of learning their reasoning behind the decision to use only <div>s. If none of the above ideas make sense in the context of your CMS, you could assume that no real reason exists. In that case, and if semantic HTML would provide value to the project, it would be good to suggest the appropriate changes.

OTHER TIPS

The div element is a generic one that can be used for anything. With styling, one can produce the same visual output as any other element. Browsers do come with their own default styling, and replacing it with your own duplicates that effort, but chances are you are writing your own styles anyway, at least to some extent, to create a visual style.

With the div element being so generic, it's easier to use it than to think about the semantics of each element or how it applies to the HTML outline, so it makes writing code easier. Some would call that lazy, but some of the new elements are open to a variety of interpretations, which causes confusion.

In HTML5, the specification has changed: it now advises using the div element only as a last resort, and suggests the newer elements, such as section, article, and so on, to give better semantic meaning to the HTML outline of the document. In the spec's words:

"Authors are strongly encouraged to view the div element as an element of last resort, for when no other element is suitable. Use of more appropriate elements instead of the div element leads to better accessibility for readers and easier maintainability for authors."

Therefore, better accessibility and maintainability might be considered a technical reason.
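
As a small illustration of the newer sectioning elements mentioned above, a page fragment might be structured like this. The content and nesting are placeholders, not a recommendation for any particular CMS template:

```html
<article>
  <header>
    <h1>Article title</h1>
  </header>
  <section>
    <h2>First topic</h2>
    <p>Body text.</p>
  </section>
  <footer>
    <p>Author and date.</p>
  </footer>
</article>
```

Each wrapper here could be swapped for a div without changing the rendering, which is why the choice mostly matters to assistive technologies and to whoever maintains the markup.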

As far as SEO goes, no search engine gives any weight to the new HTML5 semantic elements. No search engine uses the specified HTML outline either. Google has stated they won't give consideration to HTML5 elements until more people use them but, going forward, this could change and probably will.

"...I definitely wouldn't want to stand in the way of your implementing parts of your site with HTML5, but I also wouldn't expect to see special treatment of your content due to the HTML5 markup at the moment."

Note that the above quote is six years old but I have not seen anything from Google that says differently.

Licensed under: CC-BY-SA with attribution