Question

What is the difference between encoding and entity references in XML?

Solution

Encoding refers to the way a character is represented by a sequence of bytes. It happens at a pretty low level in the processing chain: you read in the bytes and use the encoding to convert to a stream of characters. ASCII, Latin-1, and UTF-8 are all examples of encodings.
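As a rough illustration of the byte-level step, here is a minimal Python sketch. The sample character and the variable names are just assumptions picked for the example; the point is only that the same character becomes different bytes under different encodings, and that decoding with the wrong encoding gives the wrong characters.

```python
# One character, three encodings.
text = "é"

utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # one byte: 0xE9

# ASCII has no representation for this character at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Decoding with the encoding that produced the bytes recovers the character.
round_trip = utf8_bytes.decode("utf-8")

# Decoding UTF-8 bytes as Latin-1 does not fail, it just yields the
# wrong characters (each byte is read as a separate character).
wrong = utf8_bytes.decode("latin-1")

print(utf8_bytes, latin1_bytes, ascii_ok, round_trip, wrong)
```

None of this involves XML yet: it is what has to happen before a parser can even look for tags.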

Entity references are handled by the XML parser itself. A sequence of characters, starting with & and ending with ;, is used to represent a different sequence of characters (usually just one). This happens at a fairly high level, conceptually "after" the XML parser has determined where tags are. This is why &lt; turns into a plain old less-than sign, not the beginning of a tag.
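To see the parser-level step, here is a small sketch using Python's standard xml.etree.ElementTree module. The document string and variable names are made up for the example. It also includes a numeric character reference (&#233;), which is strictly a character reference rather than an entity reference but has the same & ... ; shape.

```python
import xml.etree.ElementTree as ET

# The source text contains only ASCII characters; the references are
# expanded by the parser, after it has already worked out where the tags are.
doc = "<note>a &lt;b&gt; c &amp; caf&#233;</note>"

root = ET.fromstring(doc)

note_text = root.text    # references replaced by the characters they stand for
child_count = len(root)  # &lt;b&gt; did not become a <b> child element

print(note_text, child_count)
```

Here the text comes out as "a <b> c & café" and the element has no children, so the expanded < never got a chance to start a tag.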

Licensed under: CC-BY-SA with attribution