Why does XML have such verbose closing tags? [closed]

https://stackoverflow.com/questions/4370488

09-10-2019

Question

Yes, this probably shouldn't bug me.

But it does!!

Why does XML have such verbose closing tags? Not only does it make documents uglier for humans, it needlessly introduces the risk of mismatched (or misspelled!) opening and closing tags.

Even if we wanted to require closing tags, why do we need to include the name of the opening tag inside the closing tag? There is never any ambiguity in XML, because innermost tags must be closed before closing outer tags!

For example:

<thisIsSomewhatLong>
    Hello, world!
</thisIsSomewhatLong>

...is so much more verbose than:

<thisIsSomewhatLong>
    Hello, world!
</>

And it doesn't resolve any ambiguity, either for humans or for computers.

Does anyone know what the rationale is for this rule? What risks are avoided by disallowing empty closing tags?

Solution

Because it improves readability. XML was not designed to be efficient or concise, just to be easy to work with. And if you think having </> wouldn't create ambiguities, that is only because you are indenting the code. Leave out the indentation (which is a much weaker constraint than having the name in the closing tag) and it becomes a mess.

A simple example?

<A><B><C><D>foo</><D>bar</></><H>baz</></></>

Do you think that is readable? It's hard to tell where <H> sits without counting closing tags.
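To see where <H> really sits in the example above, the counting can be done mechanically. The following is only a minimal sketch: `expand_anonymous_closes` is a hypothetical helper, not part of any XML library, and it assumes bare tags without attributes, comments, or CDATA. It keeps a stack of open element names and rewrites each </> as the named closing tag of the innermost open element.

```python
import re

# Matches either an anonymous close "</>" or a bare open tag "<name>".
# Assumption: no attributes, comments, CDATA, or self-closing tags.
TOKEN = re.compile(r"<(?:(/)|([A-Za-z_][\w.-]*))>")

def expand_anonymous_closes(text):
    """Rewrite each "</>" as the named close of the innermost open element."""
    open_tags = []  # stack of currently open element names
    out = []
    pos = 0
    for match in TOKEN.finditer(text):
        out.append(text[pos:match.start()])  # copy character data unchanged
        pos = match.end()
        if match.group(1):  # anonymous close "</>"
            if not open_tags:
                raise ValueError(f"unmatched </> at offset {match.start()}")
            out.append(f"</{open_tags.pop()}>")
        else:  # open tag "<name>"
            open_tags.append(match.group(2))
            out.append(match.group(0))
    out.append(text[pos:])
    if open_tags:
        raise ValueError(f"unclosed elements: {open_tags}")
    return "".join(out)

print(expand_anonymous_closes("<A><B><C><D>foo</><D>bar</></><H>baz</></></>"))
# prints: <A><B><C><D>foo</D><D>bar</D></C><H>baz</H></B></A>
```

So <H> turns out to be a child of <B>, a sibling of <C>. A program recovers that easily, but a human reader has to do the same stack bookkeeping by eye.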

OTHER TIPS

I can see one big advantage: missing closing tags are caught (by the human or computer) right away, rather than getting an error like Insufficient closing tags provided; please read through your 1000 line file and figure out where it happened.
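As a rough illustration of that early detection, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The helper name `first_error_position` and the sample document are made up for this example. The <c> element is never closed, and the parser complains as soon as it reaches the first closing tag that doesn't match, not at the end of the file.

```python
import xml.etree.ElementTree as ET

def first_error_position(xml_text):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position  # (line, column) reported by the parser
    return None

# <c> is opened on line 3 and never closed; </b> on line 4 does not match it.
broken = "<a>\n  <b>\n    <c>text\n  </b>\n</a>"

position = first_error_position(broken)
print(position[0])  # prints: 4 (the line holding the mismatched </b>)
```

The error is pinned to line 4, one line after the element that was left open, which is usually close enough to spot the mistake.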

What you suggest amounts to S-expressions. You know, the thing all Lisp is written in, e.g. (thisIsSomewhatLong Hello, world!). There are indeed some who argue that this is better, because it is far less verbose. They are right, it is less verbose. But like it or not, this verbosity also has advantages. Most notably, it allows early error detection. With S-expressions or similar, a missing close paren or one too many just means "there are mismatched parens, good luck finding them". And that is if you're lucky: if you make such a mistake twice, the errors cancel out and can silently scramble the whole markup. (The result may of course fail to conform to the schema, assuming you have one, which allows slightly better error reporting.)
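The "evens out" case can be shown with a toy reader. This is a minimal sketch, not a real Lisp reader: `parse_sexpr` is a hypothetical helper that handles only parentheses and whitespace-separated atoms. One paren dropped in one place and one added in another still parses cleanly, but yields a different tree.

```python
def parse_sexpr(text):
    """Parse parenthesized atoms into nested lists (toy reader, no strings or quotes)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[0] collects the top-level forms
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unexpected ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        # Only discovered at the very end of the input.
        raise ValueError(f"{len(stack) - 1} unclosed '('")
    return stack[0]

intended = "(a (b c) (d e))"
two_typos = "(a (b c (d e)))"  # one ')' dropped after c, one extra ')' at the end

print(parse_sexpr(intended))   # prints: [['a', ['b', 'c'], ['d', 'e']]]
print(parse_sexpr(two_typos))  # prints: [['a', ['b', 'c', ['d', 'e']]]]
```

Both inputs are accepted. In the second, (d e) has silently become a child of b instead of its sibling. With only one of the two typos, the reader does raise an error, but only once it reaches the end of the input.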

Also see "XML is not S-Expressions".

Although you might read otherwise on the net, XML is primarily meant to be machine-readable, and therefore uses matching opening and closing tags for validity checking.

It is somewhat human readable; it is efficient for storing data that will be used by many applications, but ultimately, these tags exist so a parser can read that data, check if tags match and do something meaningful with it.

Many people don't like XML's verbosity, so if you don't either, don't worry. You're not alone.

I suppose it is for readability, as mentioned above. However, it violates the DRY principle and thus introduces a source of errors, and of course it bloats your document size, which doubly sucks if you're passing it around over a network, which is a common thing to do these days.

True, you don't need to count closing tags, but that's offset by the risk of errors like this:

<color>red</colour>

Redundant definitions that must always be kept in sync = stress. That's why I pretty much boycott XML (when possible) and choose YAML, which does not suffer from this problem and is otherwise every bit as expressive as XML (minus the DTDs, which in all my years have yet to demonstrate any value to me).

Another alternative is JSON, which similarly avoids this redundancy problem, but JSON lacks internal references, and in any case YAML (as of version 1.2) is designed to be a superset of JSON.

The risk is getting lost in:

    ...
    ...
    ...
    </>
   </>
  </>
 </>
</>

BTW, it can be validated fine without end-tag names.

Licensed under: CC-BY-SA with attribution
