Is it necessary to add missing closing tags as part of sanitizing the HTML to prevent XSS attacks?

StackOverflow https://stackoverflow.com/questions/7069848

Question

I'm using the Sanitize gem to disallow HTML that could be used for an XSS attack. As a side effect, the HTML also gets cleaned up: missing closing tags get added. This would normally be fine, but in many cases it changes the formatting of the content. Ultimately, I would like to clean up the HTML entirely, but I don't want to have to do that as part of securing the site against XSS.

So, are missing end tags (e.g. </font>) a potential XSS exploit? If not, how do I stop Sanitize from trying to clean up the HTML too?

Solution

Sanitize is built on top of Nokogiri:

Because it’s based on Nokogiri, a full-fledged HTML parser, rather than a bunch of fragile regular expressions, Sanitize has no trouble dealing with malformed or maliciously-formed HTML, and will always output valid HTML or XHTML.

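Note the last clause: Sanitize "will always output valid HTML or XHTML". So the answer is "no": you can't stop Sanitize from cleaning up the markup, so you have to fix your broken HTML.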

Nokogiri has to fix the HTML so that it can be properly interpreted and a DOM can be built, then Sanitize will modify the DOM that Nokogiri builds, and finally that modified DOM will be serialized to get the HTML that you get to store.

If you scan through the Sanitize source, you'll see that everything ends up going through clean! and that will use Nokogiri's to_html or to_xhtml methods:

if @config[:output] == :xhtml
  output_method = fragment.method(:to_xhtml)
  output_method_params[:save_with] = Nokogiri::XML::Node::SaveOptions::AS_XHTML
elsif @config[:output] == :html
  output_method = fragment.method(:to_html)
else
  raise Error, "unsupported output format: #{@config[:output]}"
end

result = output_method.call(output_method_params)

So you get Nokogiri's version of the HTML, not simply your HTML with the bad parts removed.
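
The parse, modify, serialize pipeline described above can be illustrated with a toy model in plain Ruby. This is only a sketch and not how Nokogiri or Sanitize are implemented: Node, toy_parse, toy_serialize and toy_clean are hypothetical names, and the sketch ignores void elements, attribute filtering and escaping. It shows why closing tags appear as a side effect: once the input has become a tree, the serializer writes an end tag for every element it emits, whether or not the input had one.

```ruby
# Toy tree node: element name, raw attribute string, child nodes/strings.
Node = Struct.new(:name, :attrs, :children)

# Parse a tiny subset of HTML into a tree. Elements that are never closed
# simply stay open until the end of the input.
def toy_parse(html)
  root = Node.new("#root", "", [])
  stack = [root]
  html.scan(%r{</(\w+)>|<(\w+)([^>]*)>|([^<]+)}) do |close, open, attrs, text|
    if close
      # Pop back to the nearest matching open element, if there is one.
      idx = stack.rindex { |n| n.name == close }
      stack.slice!(idx..-1) if idx
    elsif open
      node = Node.new(open, attrs, [])
      stack.last.children << node
      stack.push(node)
    elsif text
      stack.last.children << text
    end
  end
  root
end

# Serialize the tree. Disallowed elements are dropped but their text is
# kept; every emitted element gets an end tag.
def toy_serialize(node, allowed)
  node.children.map do |child|
    next child if child.is_a?(String)
    inner = toy_serialize(child, allowed)
    if allowed.include?(child.name)
      "<#{child.name}#{child.attrs}>#{inner}</#{child.name}>"
    else
      inner
    end
  end.join
end

def toy_clean(html, allowed)
  toy_serialize(toy_parse(html), allowed)
end

puts toy_clean('<font color="red">Hello <b>world', %w[font b])
# => <font color="red">Hello <b>world</b></font>
```

The original string is never edited in place; the output is regenerated from the tree, which is why the "cleanup" cannot be separated from the filtering in this kind of design.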

Other answers

Perhaps you can configure Sanitize as demonstrated in the documentation:

By default, Sanitize removes all HTML. You can use one of the built-in configs to tell Sanitize to allow certain attributes and elements:

Sanitize.clean(html, Sanitize::Config::RESTRICTED)
# => '<b>foo</b>'

Sanitize.clean(html, Sanitize::Config::BASIC)
# => '<b><a href="http://foo.com/" rel="nofollow">foo</a></b>'

Sanitize.clean(html, Sanitize::Config::RELAXED)
# => '<b><a href="http://foo.com/">foo</a></b><img src="http://foo.com/bar.jpg" />'

Or, if you’d like more control over what’s allowed, you can provide your own custom configuration:

Sanitize.clean(html, :elements => ['a', 'span'],
    :attributes => {'a' => ['href', 'title'], 'span' => ['class']},
    :protocols => {'a' => {'href' => ['http', 'https', 'mailto']}})

Quoted from wonko.com
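
To make the :protocols entry in the custom configuration more concrete, here is a minimal sketch of what a scheme allowlist check amounts to, using only Ruby's standard URI library. It is not Sanitize's own code: ALLOWED_SCHEMES and allowed_href? are hypothetical names, and this version rejects relative URLs outright, which is an assumption made for simplicity.

```ruby
require "uri"

# Mirrors the scheme list from the custom configuration above.
ALLOWED_SCHEMES = %w[http https mailto].freeze

# Hypothetical helper: true only when the URL has a scheme and that
# scheme is on the allowlist. Unparseable URLs are rejected.
def allowed_href?(href)
  scheme = URI.parse(href.strip).scheme
  !scheme.nil? && ALLOWED_SCHEMES.include?(scheme.downcase)
rescue URI::InvalidURIError
  false
end

puts allowed_href?("http://foo.com/")      # => true
puts allowed_href?("javascript:alert(1)")  # => false
```

An allowlist is used rather than a blocklist because any scheme that is not explicitly permitted, including ones the author never thought of, is rejected by default.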

Licensed under: CC-BY-SA with attribution