Any parsers for RFC documents? [closed]
Question
RFCs (http://www.ietf.org/rfc.html) are usually published as text files.
- Are there any other formats that would make parsing the RFC content easier?
- Are there any parsers for the widely used RFC text documents?
Solution
A limited number of RFCs are offered as XML at http://xml.resource.org/public/rfc/xml/
You could also supplement the text data with the BibXML entries from http://xml.resource.org/public/rfc/bibxml/
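For the RFCs that do exist as XML, a standard XML library is probably enough. Below is a minimal sketch that pulls the title and the section outline out of such a file. It assumes the older xml2rfc layout, where each `<section>` carries its heading in a `title` attribute; the sample document and the `outline` helper are made up for illustration and not taken from a real RFC.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the xml2rfc style (not a real RFC).
SAMPLE = """<rfc number="9999">
  <front><title>Example Protocol</title></front>
  <middle>
    <section title="Introduction">
      <t>This document is a made-up sample.</t>
      <section title="Terminology"><t>Terms go here.</t></section>
    </section>
    <section title="Security Considerations"><t>None.</t></section>
  </middle>
</rfc>"""


def outline(xml_text):
    """Return (document title, [(depth, section title), ...])."""
    root = ET.fromstring(xml_text)
    title = root.findtext("front/title")
    sections = []

    def walk(node, depth):
        # findall("section") matches direct children only, so nesting
        # is tracked explicitly through the recursion depth.
        for sec in node.findall("section"):
            sections.append((depth, sec.get("title")))
            walk(sec, depth + 1)

    middle = root.find("middle")
    if middle is not None:
        walk(middle, 1)
    return title, sections


if __name__ == "__main__":
    doc_title, secs = outline(SAMPLE)
    print(doc_title)
    for depth, name in secs:
        print("  " * depth + str(name))
```

For a downloaded file you would pass its contents to `outline` instead of `SAMPLE`. Newer XML RFCs may put the heading in a child element rather than an attribute, so check the file you actually have.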
OTHER TIPS
The IETF maintains minimally marked-up RFCs in HTML, for example:
http://tools.ietf.org/html/rfc2616.html
The markup, however, consists mostly of anchors that implement a table of contents, and the main body is mostly wrapped in <pre> ... </pre>. Nevertheless, it might be possible to do some meaningful parsing on those RFCs.
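As a rough idea of what that parsing could look like, here is a sketch using only the standard library's `html.parser`. It collects the text inside `<pre>` blocks and the names of any anchors it sees. The sample HTML and the assumption that section anchors carry a `name` (or `id`) attribute such as `section-1` are mine, so verify them against the real page before relying on this.

```python
from html.parser import HTMLParser

# Hypothetical fragment imitating a minimally marked-up RFC page.
SAMPLE_HTML = """<html><body>
<pre>
<a name="section-1" href="#section-1">1</a>.  Introduction

   Body text of the made-up section.
</pre>
</body></html>"""


class PreExtractor(HTMLParser):
    """Collect text inside <pre> elements and the names of anchors."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0
        self.chunks = []
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
        elif tag == "a":
            attr_map = dict(attrs)
            name = attr_map.get("name") or attr_map.get("id")
            if name:
                self.anchors.append(name)

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1

    def handle_data(self, data):
        # Keep only text that appears inside a <pre> block.
        if self.pre_depth:
            self.chunks.append(data)


def extract_pre(html_text):
    parser = PreExtractor()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.chunks), parser.anchors


if __name__ == "__main__":
    text, anchors = extract_pre(SAMPLE_HTML)
    print(anchors)
    print(text)
```

What you get back is essentially the plain-text RFC again plus a list of anchor names, so any real structure (sections, paragraphs) still has to be recovered from the text itself.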
W3C has some HTMLized RFCs, for example:
http://www.w3.org/Protocols/rfc2616/rfc2616.html
in which the markup is somewhat richer in its semantics and so perhaps more amenable to parsing.
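If the headings in such a page are real heading elements, the section structure can be read off directly. The sketch below collects `(level, text)` pairs for `<h1>` to `<h6>`. The sample markup is invented, and I am assuming (not asserting) that the W3C pages mark section titles this way.

```python
import re
from html.parser import HTMLParser

# Hypothetical fragment with semantic headings (not copied from the W3C page).
SAMPLE_RICH = (
    '<h2><a id="sec1">1</a> Introduction</h2><p>Text.</p>'
    '<h3><a id="sec1.1">1.1</a> Purpose</h3><p>More text.</p>'
)


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1..h6 element."""

    def __init__(self):
        super().__init__()
        self.level = None  # level of the heading currently open, if any
        self.buf = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.level = int(tag[1])
            self.buf = []

    def handle_endtag(self, tag):
        if self.level is not None and tag == f"h{self.level}":
            # Join the pieces and normalise runs of whitespace.
            text = " ".join("".join(self.buf).split())
            self.headings.append((self.level, text))
            self.level = None

    def handle_data(self, data):
        if self.level is not None:
            self.buf.append(data)


def headings(html_text):
    parser = HeadingCollector()
    parser.feed(html_text)
    parser.close()
    return parser.headings


if __name__ == "__main__":
    for level, text in headings(SAMPLE_RICH):
        print(level, text)
```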