Parse XHTML document with undefined entity

Question 1

Entity resolution is done by the underlying parser, which here is a standard XmlReader (or XmlTextReader).

Officially, you're supposed to declare entities in DTDs (see Oleg's answer here: Problem with XHTML entities), or load DTDs dynamically into your documents. There are some examples here on SO like this: How do I resolve entities when loading into an XDocument?
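As a rough illustration of the "load DTDs" route, here is a minimal sketch that assumes the document carries a standard XHTML 1.0 DOCTYPE and that you are on .NET 4.0 or later, where XmlPreloadedResolver ships with cached copies of the XHTML DTDs. The class name, the LoadXhtml helper and the sample markup are made up for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Resolvers;

public static class PreloadedDtdExample
{
    // Hypothetical helper: parses XHTML text whose DOCTYPE is one of the
    // XHTML 1.0 DTDs, so entities such as &nbsp; are declared by the DTD.
    public static XDocument LoadXhtml(string xhtml)
    {
        var settings = new XmlReaderSettings
        {
            // DTD processing is prohibited by default for XmlReader.Create
            DtdProcessing = DtdProcessing.Parse,
            // Serves the XHTML 1.0 DTDs from memory instead of the network
            XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10)
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        string xhtml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" " +
            "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" +
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
            "<head><title>t</title></head>" +
            "<body><p>a&nbsp;b</p></body></html>";

        XDocument doc = LoadXhtml(xhtml);
        Console.WriteLine(doc.Root.Value);
    }
}
```

This only helps when the input really has an XHTML DOCTYPE; a document with no DOCTYPE at all still fails on the undefined entity.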

What you can also do is create a hacky XmlTextReader-derived class that returns Text nodes when entities are detected, based on a dictionary, as in the following sample code:

using (XmlTextReaderWithEntities reader = new XmlTextReaderWithEntities(MyXmlFile))
{
    reader.AddEntity("nbsp", "\u00A0");
    XDocument xdoc = XDocument.Load(reader);
}

...

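using System.Collections.Generic;
using System.Xml;

public class XmlTextReaderWithEntities : XmlTextReader
{
    // Entity name -> replacement text
    private readonly Dictionary<string, string> _entities = new Dictionary<string, string>();

    // Replacement queued by ResolveEntity(); surfaced by the next Read()
    private string _pendingEntity;

    // Replacement currently exposed as a Text node; cleared by the following Read()
    private string _currentEntity;

    // NOTE: override other constructors for completeness
    public XmlTextReaderWithEntities(string path)
        : base(path)
    {
    }

    public void AddEntity(string entity, string value)
    {
        _entities[entity] = value;
    }

    public override bool Read()
    {
        if (_pendingEntity != null)
        {
            // Expose the resolved entity as a synthetic Text node
            // without advancing the underlying reader
            _currentEntity = _pendingEntity;
            _pendingEntity = null;
            return true;
        }

        _currentEntity = null;
        return base.Read();
    }

    public override XmlNodeType NodeType
    {
        get
        {
            if (_currentEntity != null)
                return XmlNodeType.Text;

            return base.NodeType;
        }
    }

    public override string Value
    {
        get
        {
            // No side effects here, so Value can be read more than once
            if (_currentEntity != null)
                return _currentEntity;

            return base.Value;
        }
    }

    public override void ResolveEntity()
    {
        // LocalName is the entity name while positioned on an EntityReference node
        string value;
        if (!_entities.TryGetValue(LocalName, out value))
        {
            // if not found, return the string as is
            value = "&" + LocalName + ";";
        }
        _pendingEntity = value;
        // NOTE: we don't use base here. Depends on the scenario
    }
}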

This approach works in simple scenarios, but you may need to override other members (the remaining constructors, for example) for completeness.

PS: sorry it's in C#, you'll have to adapt to VB.NET :)

Question 2

I haven't done this, but you could create an XmlParserContext object with the required entity declarations as its internalSubset. Pass that context to the XmlTextReader constructor and create the XDocument object by loading the reader. MSDN already has a simple-looking VB code snippet that uses a predefined entity.
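A minimal C# sketch of that idea follows. Note that it hands the context to XmlReader.Create (with DTD processing enabled) rather than to the XmlTextReader constructor, which is a substitution on my part and not the exact call described above. The class name, the LoadWithEntities helper, the "html" doctype name and the single nbsp declaration are all assumptions for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class ParserContextExample
{
    // Hypothetical helper: parses XML text that has no DOCTYPE, supplying
    // the entity declarations through an XmlParserContext internal subset.
    public static XDocument LoadWithEntities(string xml)
    {
        // Assumed entity list; add one <!ENTITY ...> declaration per entity
        string internalSubset = "<!ENTITY nbsp \"&#160;\">";

        var context = new XmlParserContext(
            null,            // name table (a default one is created)
            null,            // namespace manager
            "html",          // doctype name (assumed)
            null,            // public id
            null,            // system id
            internalSubset,  // entity declarations
            "",              // base URI
            "",              // xml:lang
            XmlSpace.None);

        var settings = new XmlReaderSettings
        {
            // Needed so the declarations in the context are actually parsed
            DtdProcessing = DtdProcessing.Parse
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings, context))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        XDocument doc = LoadWithEntities("<p>a&nbsp;b</p>");
        Console.WriteLine(doc.Root.Value);
    }
}
```

Compared with the derived-reader hack, this keeps entity expansion inside the parser, so it should also cover entities that appear in attribute values.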

Question 3

In this case I suppose you're talking about a page on the web, so you could use the Html Agility Pack, which may meet your needs.

I use it with XPath, element access and more. It is very useful for searching within an HTML page, etc.

You can find the documentation here: htmlagilitypack
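For what it's worth, a small sketch of what that can look like, assuming the HtmlAgilityPack NuGet package is referenced. The class name, the FirstParagraphText helper and the sample markup are invented for the example; HtmlDocument.LoadHtml, SelectSingleNode and HtmlEntity.DeEntitize are the library calls it relies on.

```csharp
using System;
using HtmlAgilityPack;

public static class AgilityPackExample
{
    // Hypothetical helper: returns the text of the first <p> element,
    // with HTML entities such as &nbsp; decoded, or null if there is none.
    public static string FirstParagraphText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parser, no DTD or entity declarations needed

        HtmlNode paragraph = doc.DocumentNode.SelectSingleNode("//p");
        if (paragraph == null)
            return null;

        return HtmlEntity.DeEntitize(paragraph.InnerText);
    }

    public static void Main()
    {
        string text = FirstParagraphText("<html><body><p>a&nbsp;b</p></body></html>");
        Console.WriteLine(text);
    }
}
```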