Question

I am investigating an error that we receive about once an hour in a production app. It looks like a crawler is sending us a malformed utf8 parameter. This is the exception we are seeing:

ArgumentError: invalid byte sequence in UTF-8

This is the user agent we are seeing on every request:

Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)

We are receiving the following parameters as reported by Airbrake:

{
  "action": "index",
  "controller": "cards",
  "format": "",
  "utf8": "â?/u201C"
}

Backtrace:

/usr/local/rvm/rubies/ruby-1.9.3-p392/lib/ruby/1.9.1/cgi/util.rb:7 in "gsub"
/usr/local/rvm/rubies/ruby-1.9.3-p392/lib/ruby/1.9.1/cgi/util.rb:7 in "escape"
/gems/activesupport-3.2.12/lib/active_support/core_ext/object/to_query.rb:10 in "to_query"
/gems/activesupport-3.2.12/lib/active_support/core_ext/object/to_param.rb:52 in "block in to_param"
/gems/activesupport-3.2.12/lib/active_support/core_ext/object/to_param.rb:51 in "each"
/gems/activesupport-3.2.12/lib/active_support/core_ext/object/to_param.rb:51 in "collect"
/gems/activesupport-3.2.12/lib/active_support/core_ext/object/to_param.rb:51 in "to_param"
/gems/actionpack-3.2.12/lib/action_dispatch/http/url.rb:47 in "url_for"
/gems/actionpack-3.2.12/lib/action_dispatch/routing/route_set.rb:591 in "url_for"
/gems/actionpack-3.2.12/lib/action_dispatch/routing/url_for.rb:148 in "url_for"
/gems/actionpack-3.2.12/lib/action_controller/caching/actions.rb:172 in "initialize"
/gems/actionpack-3.2.12/lib/action_controller/caching/actions.rb:141 in "new"
/gems/actionpack-3.2.12/lib/action_controller/caching/actions.rb:141 in "filter"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:321 in "around"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:310 in "_callback_around_375"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:214 in "_conditional_callback_around_1195"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:414 in "_run__3684532395506755200__process_action__4186607910431134588__callbacks"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:405 in "__run_callback"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:385 in "_run_process_action_callbacks"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:81 in "run_callbacks"
/gems/actionpack-3.2.12/lib/abstract_controller/callbacks.rb:17 in "process_action"
/gems/actionpack-3.2.12/lib/action_controller/metal/rescue.rb:29 in "process_action"
/gems/actionpack-3.2.12/lib/action_controller/metal/instrumentation.rb:30 in "block in process_action"
/gems/activesupport-3.2.12/lib/active_support/notifications.rb:123 in "block in instrument"
/gems/activesupport-3.2.12/lib/active_support/notifications/instrumenter.rb:20 in "instrument"
/gems/activesupport-3.2.12/lib/active_support/notifications.rb:123 in "instrument"
/gems/actionpack-3.2.12/lib/action_controller/metal/instrumentation.rb:29 in "process_action"
/gems/actionpack-3.2.12/lib/action_controller/metal/params_wrapper.rb:207 in "process_action"
/gems/activerecord-3.2.12/lib/active_record/railties/controller_runtime.rb:18 in "process_action"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/instrumentation/rails3/action_controller.rb:38 in "block in process_action"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/instrumentation/controller_instrumentation.rb:272 in "block in perform_action_with_newrelic_trace"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/method_tracer.rb:235 in "trace_execution_scoped"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/instrumentation/controller_instrumentation.rb:267 in "perform_action_with_newrelic_trace"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/instrumentation/rails3/action_controller.rb:37 in "process_action"
/gems/actionpack-3.2.12/lib/abstract_controller/base.rb:121 in "process"
/gems/actionpack-3.2.12/lib/abstract_controller/rendering.rb:45 in "process"
/gems/actionpack-3.2.12/lib/action_controller/metal.rb:203 in "dispatch"
/gems/actionpack-3.2.12/lib/action_controller/metal/rack_delegation.rb:14 in "dispatch"
/gems/actionpack-3.2.12/lib/action_controller/metal.rb:246 in "block in action"
/gems/actionpack-3.2.12/lib/action_dispatch/routing/route_set.rb:73 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/routing/route_set.rb:73 in "dispatch"
/gems/actionpack-3.2.12/lib/action_dispatch/routing/route_set.rb:36 in "call"
/gems/journey-1.0.4/lib/journey/router.rb:68 in "block in call"
/gems/journey-1.0.4/lib/journey/router.rb:56 in "each"
/gems/journey-1.0.4/lib/journey/router.rb:56 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/routing/route_set.rb:601 in "call"
/gems/rack-pjax-0.7.0/lib/rack/pjax.rb:12 in "call"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/rack/error_collector.rb:12 in "call"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/rack/error_collector.rb:12 in "call"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/rack/agent_hooks.rb:18 in "call"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/rack/browser_monitoring.rb:16 in "call"
/gems/warden-1.2.1/lib/warden/manager.rb:35 in "block in call"
/gems/warden-1.2.1/lib/warden/manager.rb:34 in "catch"
/gems/warden-1.2.1/lib/warden/manager.rb:34 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/best_standards_support.rb:17 in "call"
/gems/rack-1.4.5/lib/rack/etag.rb:23 in "call"
/gems/rack-1.4.5/lib/rack/conditionalget.rb:25 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/head.rb:14 in "call"
/gems/remotipart-1.0.5/lib/remotipart/middleware.rb:30 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/params_parser.rb:21 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/flash.rb:242 in "call"
/gems/rack-1.4.5/lib/rack/session/abstract/id.rb:210 in "context"
/gems/rack-1.4.5/lib/rack/session/abstract/id.rb:205 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/cookies.rb:341 in "call"
/gems/activerecord-3.2.12/lib/active_record/query_cache.rb:64 in "call"
/gems/activerecord-3.2.12/lib/active_record/connection_adapters/abstract/connection_pool.rb:479 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/callbacks.rb:28 in "block in call"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:405 in "_run__4006189938838080721__call__2271109139271149174__callbacks"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:405 in "__run_callback"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:385 in "_run_call_callbacks"
/gems/activesupport-3.2.12/lib/active_support/callbacks.rb:81 in "run_callbacks"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/callbacks.rb:27 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/remote_ip.rb:31 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/debug_exceptions.rb:16 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/show_exceptions.rb:56 in "call"
/gems/railties-3.2.12/lib/rails/rack/logger.rb:32 in "call_app"
/gems/railties-3.2.12/lib/rails/rack/logger.rb:16 in "block in call"
/gems/activesupport-3.2.12/lib/active_support/tagged_logging.rb:22 in "tagged"
/gems/railties-3.2.12/lib/rails/rack/logger.rb:16 in "call"
/gems/actionpack-3.2.12/lib/action_dispatch/middleware/request_id.rb:22 in "call"
/gems/rack-1.4.5/lib/rack/methodoverride.rb:21 in "call"
/gems/rack-1.4.5/lib/rack/runtime.rb:17 in "call"
/gems/activesupport-3.2.12/lib/active_support/cache/strategy/local_cache.rb:72 in "call"
/gems/rack-1.4.5/lib/rack/lock.rb:15 in "call"
/gems/rack-cache-1.2/lib/rack/cache/context.rb:136 in "forward"
/gems/rack-cache-1.2/lib/rack/cache/context.rb:143 in "pass"
/gems/rack-cache-1.2/lib/rack/cache/context.rb:172 in "rescue in lookup"
/gems/rack-cache-1.2/lib/rack/cache/context.rb:168 in "lookup"
/gems/rack-cache-1.2/lib/rack/cache/context.rb:66 in "call!"
/gems/rack-cache-1.2/lib/rack/cache/context.rb:51 in "call"
/gems/utf8-cleaner-0.0.6/lib/utf8-cleaner/middleware.rb:18 in "call"
/gems/railties-3.2.12/lib/rails/engine.rb:479 in "call"
/gems/railties-3.2.12/lib/rails/application.rb:223 in "call"
/gems/railties-3.2.12/lib/rails/railtie/configurable.rb:30 in "method_missing"
/gems/unicorn-4.6.2/lib/unicorn/http_server.rb:552 in "process_client"
/gems/unicorn-4.6.2/lib/unicorn/http_server.rb:632 in "worker_loop"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/instrumentation/unicorn_instrumentation.rb:22 in "call"
/gems/newrelic_rpm-3.6.0.78/lib/new_relic/agent/instrumentation/unicorn_instrumentation.rb:22 in "block (4 levels) in <top (required)>"
/gems/unicorn-4.6.2/lib/unicorn/http_server.rb:500 in "spawn_missing_workers"
/gems/unicorn-4.6.2/lib/unicorn/http_server.rb:142 in "start"
/gems/unicorn-4.6.2/bin/unicorn:126 in "<top (required)>"
/bin/unicorn:23 in "load"
/bin/unicorn:23 in "<main>"

Despite my efforts, I have been unable to reproduce the issue. Has anyone else run into this before, and is there anything that can be done to sanitize the request before Rack errors out?


Solution

There is the utf8-cleaner gem; however, it only handles incorrectly %-encoded strings. It will not clean strings that contain UTF-8 characters which are not %-encoded.

So I forked the gem (https://github.com/lulalala/utf8-cleaner) and changed its behavior to respond with a 400 error if the request URL contains UTF-8 characters that are not %-encoded. You could fork my version in turn if you would rather sanitize the request than respond with 400.

(In my opinion, though, a sanitized request is never useful; it is better to respond with 400.)
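
To illustrate the "respond with 400" idea, here is a minimal sketch of a Rack middleware. It is not the code from either gem: the class name RejectInvalidUtf8 and the decision to check only PATH_INFO and QUERY_STRING are assumptions made for this example. It percent-decodes the raw values itself and rejects the request if the resulting bytes are not valid UTF-8, which covers both badly %-encoded input and raw invalid bytes. It assumes Ruby 2.0 or later for String#b.

```ruby
# Hypothetical middleware (name and scope are assumptions, not taken from
# utf8-cleaner or its fork): answer 400 when the path or query string does
# not decode to valid UTF-8.
class RejectInvalidUtf8
  def initialize(app)
    @app = app
  end

  def call(env)
    if valid_utf8?(env["PATH_INFO"]) && valid_utf8?(env["QUERY_STRING"])
      @app.call(env)
    else
      # Build a fresh response each time so later middleware may mutate it.
      [400, { "Content-Type" => "text/plain" }, ["Bad Request"]]
    end
  end

  private

  # Percent-decode on a binary copy (so no regexp ever runs against a
  # broken UTF-8 string), then check the decoded bytes as UTF-8.
  def valid_utf8?(raw)
    return true if raw.nil? || raw.empty?

    bytes = raw.b.gsub(/%(\h\h)/) { [Regexp.last_match(1)].pack("H2") }
    bytes.force_encoding(Encoding::UTF_8).valid_encoding?
  end
end

# Usage with a stand-in downstream app:
app = RejectInvalidUtf8.new(->(_env) { [200, {}, ["ok"]] })
app.call({ "PATH_INFO" => "/cards", "QUERY_STRING" => "utf8=%E2%9C%93" }).first # => 200
app.call({ "PATH_INFO" => "/cards", "QUERY_STRING" => "utf8=%E2%3F" }).first    # => 400
```

In a Rails app, a class like this would presumably be registered near the top of the middleware stack (for example with config.middleware.insert_before 0, RejectInvalidUtf8 in config/application.rb) so that it runs before anything that parses the parameters.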

OTHER TIPS

I managed to fix it (on a Rails 3.2.18 app) as described in this gist:

https://gist.github.com/joost/ca4eda8f31655cf6095a

It also returns a 400 error, and it only requires adding one small middleware file to your Rails app.

Licensed under: CC-BY-SA with attribution