Question

I'm building a website that requires very basic markup capabilities. I can't use any third-party plugins, so I just need a simple way to convert markup to HTML. I might allow a total of three tags.

What is the best way to convert ==Heading== to <h2>Heading</h2>, or --bold-- to <b>bold</b>? Can this be done simply with Regex, or does somebody have a simple function?

I'm writing this in C#, but examples from other languages would probably work.

Follow-up: This is such a small part of my website that I liked the simplicity of a plain Regex replace. I made this work in C# with the following code:

string html = Regex.Replace("==This will be inside h2==", "==([^=]*)==", "<h2>$1</h2>");

In replacement strings, .NET refers to capture groups with $1 rather than the \1 notation used in some other languages.
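To cover both tags from the question, the same call can be repeated once per pattern. The following is only a minimal sketch: the names SimpleMarkup and ToHtml are made up for illustration, and it assumes the markers never nest and that the text between them contains no = or - characters (so --well-known-- would not match).

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimpleMarkup
{
    // Hypothetical helper: converts ==Heading== and --bold-- to HTML.
    public static string ToHtml(string input)
    {
        // $1 is the first capture group in a .NET replacement string.
        string html = Regex.Replace(input, "==([^=]*)==", "<h2>$1</h2>");
        // Same idea for bold; [^-]* stops at the first '-' it meets.
        html = Regex.Replace(html, "--([^-]*)--", "<b>$1</b>");
        return html;
    }
}
```

With this sketch, SimpleMarkup.ToHtml("==Title== some --bold-- text") returns <h2>Title</h2> some <b>bold</b> text. It does no escaping of the input, so it is not safe for untrusted text on its own.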


Solution

It's not really a simple problem: if you're going to display the result back to users, you also need to sanitise the input so that you don't create any cross-site scripting vulnerabilities.

That said, for something as simple as you describe, a regular expression replacement is probably the easiest approach.

For example, replace the pattern ==([^=]*)== with <h2>\1</h2> (or <h2>$1</h2> in .NET).
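One way to combine the two points above is to HTML-encode the input first and only then apply the replacement, so that any tags the user typed are displayed as text while the generated <h2> stays real markup. This is a hedged sketch, not a complete sanitiser: SafeMarkup and ToSafeHtml are hypothetical names, and it relies on WebUtility.HtmlEncode leaving the = character untouched so the marker still matches after encoding.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class SafeMarkup
{
    // Hypothetical helper: encode first, then convert the markup.
    public static string ToSafeHtml(string userInput)
    {
        // Turn <, >, & and quotes into entities so raw HTML is shown, not run.
        string encoded = WebUtility.HtmlEncode(userInput);
        // '=' is not changed by encoding, so ==...== can still be matched.
        return Regex.Replace(encoded, "==([^=]*)==", "<h2>$1</h2>");
    }
}
```

For the input ==Hi <script>alert(1)</script>== this returns <h2>Hi &lt;script&gt;alert(1)&lt;/script&gt;</h2>. Any other allowed tags would follow the same pattern: encode once, then run one replacement per tag.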

OTHER TIPS

There is also a Perl module and a PHP project that do this. The source code of either could be useful in developing your own solution.

I use Markdown (the same lightweight markup language used on this site). For C#, there is a very good implementation available here. It fully supports Markdown, although it doesn't appear to be maintained. For the time being it works really well, and it is free and open source.

The best part is all the work is done for you if you include this source with your project. It's very small; basically a single method call to transform a chunk of text into HTML.

This really depends on the Wiki syntax you're using as there are several different ones. Obviously the wiki software has this functionality somewhere; if you can't find a software package that does this for you, you could start looking for the relevant code in your wiki software.

Maybe this is what you need.

This page is a compilation of links, descriptions, and status reports of the various alternative MediaWiki parsers — that is, programs and projects, other than MediaWiki itself, which are able or intended to translate MediaWiki's text markup syntax into something else.

Probably overkill for your 3 tags, but if it grows into a fully fledged markup language and the regexes are beginning to look scary, you might want to consider ANTLR.

As Joseph said, Markdown is the best solution to the text-to-HTML problem.

MarkdownSharp is lightweight, easy to use, and well tested, as it is the Stack Overflow implementation!

new Markdown().Transform("**markdown text**");

http://blog.stackoverflow.com/2009/12/introducing-markdownsharp/

More about Markdown syntax - http://en.wikipedia.org/wiki/Markdown

Licensed under: CC-BY-SA with attribution