Question

In my C# project, I have been given the task of parsing an SGML file. I tried, very naively, to use XmlReader, which led to some interesting revelations (for example, the difference between SGML and well-formed XML).
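To illustrate the kind of failure involved, here is a minimal sketch. The sample markup and the `IsWellFormedXml` helper are invented for illustration and are not taken from the original file. SGML allows omitted end tags and unquoted attribute values, both of which `XmlReader` rejects.

```csharp
using System;
using System.IO;
using System.Xml;

class XmlReaderOnSgml
{
    // Hypothetical helper: returns true if XmlReader can read the whole input.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            using (var reader = XmlReader.Create(new StringReader(markup)))
            {
                while (reader.Read()) { } // walk every node; throws on malformed input
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Invented SGML fragment: unquoted attribute value and omitted </para> end tags.
        string sgml = "<doc><para id=p1>First paragraph<para>Second paragraph</doc>";
        // The same content written as well-formed XML.
        string xml = "<doc><para id=\"p1\">First paragraph</para><para>Second paragraph</para></doc>";

        Console.WriteLine(IsWellFormedXml(sgml)); // False
        Console.WriteLine(IsWellFormedXml(xml));  // True
    }
}
```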

So I think I just need a good SGML parser that converts the file to XML, and I can go from there. In my search, I have found two SGML parsers that can integrate with my C# project:

Any other recommendations?

Solution

Apparently SgmlReader has been updated and is now available here:

https://github.com/MindTouch/SGMLReader
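A minimal sketch of the SGML-to-XML step with SgmlReader might look like the following. It assumes the library is referenced in the project and exposes the `Sgml.SgmlReader` class with the `DocType`, `CaseFolding`, `WhitespaceHandling`, and `InputStream` properties; the `SgmlToXml` wrapper class and the sample input are hypothetical. The built-in HTML DTD is used here because it needs no external files; for another SGML vocabulary you would presumably have to point the reader at that vocabulary's DTD (for example via its `SystemLiteral` property).

```csharp
using System;
using System.IO;
using System.Xml;
using Sgml; // namespace of the SgmlReader library (assumed to be referenced)

static class SgmlToXml
{
    // Hypothetical wrapper: reads SGML from 'input' and returns it as an XmlDocument.
    public static XmlDocument Convert(TextReader input, string docType)
    {
        var sgmlReader = new SgmlReader
        {
            DocType = docType,                            // "HTML" selects the built-in HTML DTD
            CaseFolding = CaseFolding.ToLower,            // normalize element names to lower case
            WhitespaceHandling = WhitespaceHandling.All,
            InputStream = input
        };

        // SgmlReader derives from XmlReader, so XmlDocument can load from it directly.
        var doc = new XmlDocument { PreserveWhitespace = true, XmlResolver = null };
        doc.Load(sgmlReader);
        return doc;
    }

    static void Main()
    {
        // Invented input with omitted </p> end tags, which plain XmlReader would reject.
        using (var input = new StringReader("<html><body><p>one<p>two</body></html>"))
        {
            XmlDocument doc = Convert(input, "HTML");
            Console.WriteLine(doc.OuterXml); // well-formed XML from here on
        }
    }
}
```

Once the content is in an `XmlDocument`, the usual System.Xml tooling (XPath, `XmlWriter`, LINQ to XML) applies.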

OTHER TIPS

HTML (through version 4) was defined as an application of SGML. If you want to parse HTML properly, you will need an SGML parser. SgmlReader appears to fit that need well, and I plan to use it myself. I would also suggest looking at HTML Tidy. It is a native application, but .NET bindings for it do exist. If you need entirely managed code, then SgmlReader is the way to go.

Licensed under: CC-BY-SA with attribution