Question

My code uses Tidy.NET to "clean" HTML documents. In some cases, the returned HTML is empty, and I don't know why.

The variable messages contains a message collection, and its property Count has the value 2 after processing the HTML. Despite knowing that, I don't know how to actually view the error messages.

This is the code:

        var tidy = new Tidy();

        var input = new MemoryStream();
        var output = new MemoryStream();

        byte[] byteArray = Encoding.UTF8.GetBytes(html);
        input.Write(byteArray, 0, byteArray.Length);
        input.Position = 0;

        var messages = new TidyMessageCollection();

        tidy.Parse(input, output, messages);

        html = Encoding.UTF8.GetString(output.ToArray());

What do I need to do to find out what's going on?

Solution

I found a way. You must iterate through the message collection. Info messages and warnings are also added to the list, so you have to check the Level property to get only errors (or warnings, whichever you want).

foreach (TidyMessage message in messages) 
{
    if (message.Level == MessageLevel.Error) 
    {
        // error handling here
    }
}
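
To actually see what Tidy reported, each message can be written out together with its level. This is a minimal sketch, not the library's own facility: it assumes the types live in the `TidyNet` namespace and uses only the `Level` and `Message` properties that appear in this thread. `TidyMessageDump` and `Describe` are hypothetical names.

```csharp
using System.Text;
using TidyNet; // assumed namespace of the Tidy.NET assembly

public static class TidyMessageDump // hypothetical helper, not part of Tidy.NET
{
    // Builds one line per message in the form "[<level>] <message text>".
    public static string Describe(TidyMessageCollection messages)
    {
        var report = new StringBuilder();
        foreach (TidyMessage message in messages)
        {
            report.AppendLine("[" + message.Level + "] " + message.Message);
        }
        return report.ToString();
    }
}
```

Calling `Console.WriteLine(TidyMessageDump.Describe(messages));` right after `tidy.Parse(input, output, messages);` prints everything the parser collected. Error-level entries are the natural first place to look when the output comes back empty.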

OTHER TIPS

I had the same problem today and solved it by modifying the source code of TidyNet.

In the class TidyMessageCollection I created a public property called MessageList that exposes the protected InnerList, which holds the parse messages:

In Tidy.TidyMessageCollection

public ArrayList MessageList
{
    get { return InnerList; }
}

Now you can read all the error messages after the Parse call, from outside the Tidy project, like this:

Tidy tidy = new Tidy();
TidyMessageCollection tmc = new TidyMessageCollection();
MemoryStream input = new MemoryStream();
MemoryStream output = new MemoryStream();

// Fill input with the HTML bytes, as in the question, before parsing
tidy.Parse(input, output, tmc);

// Same loop as in the solution above, but over the new MessageList property
foreach (TidyMessage message in tmc.MessageList)
{
    if (message.Level == MessageLevel.Error)
    {
        // error handling here
    }
}

I hit the same issue today but was not keen on modifying the source and maintaining a copy of it, so here is my solution in one line, where tmc is the TidyMessageCollection passed to Parse:

var tidyErrors = (from TidyMessage msg in tmc where msg.Level == MessageLevel.Error select msg.Message).ToList();
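
The query needs no change to the library because the explicitly typed range variable (`from TidyMessage msg`) casts each element of the non-generic collection. The same idea in method syntax is sketched here, assuming the `TidyNet` namespace and that `Message` is a string. `TidyMessageQueries` and `ErrorTexts` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using TidyNet; // assumed namespace of the Tidy.NET assembly

public static class TidyMessageQueries // hypothetical helper, not part of Tidy.NET
{
    // Returns the text of every Error-level message in the collection.
    public static List<string> ErrorTexts(TidyMessageCollection messages)
    {
        return messages
            .Cast<TidyMessage>() // non-generic IEnumerable -> IEnumerable<TidyMessage>
            .Where(m => m.Level == MessageLevel.Error)
            .Select(m => m.Message)
            .ToList();
    }
}
```

Changing the `Where` condition to `MessageLevel.Warning` collects warnings instead.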

I hope this helps someone else.

Simon

Licensed under: CC-BY-SA with attribution