@npinti's and @Victor's answers are good from a pure regex perspective. However, they don't fully cover the actual topic, which is using a regex to extract the locale from HTTP_ACCEPT_LANGUAGE in Rails. To handle both the 2-character (e.g. "en") and 5-character (e.g. "en-US") formats properly in Rails:
# accept_language should be something like
# "en-US,en;q=0.8,zh-TW;q=0.6,zh;q=0.4" (as sent by Chrome).
# However, it is nil if the client doesn't send an Accept-Language header.
accept_language = request.env['HTTP_ACCEPT_LANGUAGE'] || ""
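# Use "match" instead of "scan": because the pattern contains a capture group,
# "scan" returns the captured groups (e.g. [["-US"]]) rather than the whole match.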
match_data = accept_language.match(/\A[a-z]{2}(-[A-Z]{2})?/)
I18n.locale = match_data ? match_data[0] : I18n.default_locale
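
One thing to watch out for: the tag you extract may not be a locale your app actually supports, and if available locales are enforced, assigning an unsupported one to I18n.locale can raise I18n::InvalidLocale. Here is a minimal sketch in plain Ruby (no Rails needed to try it) that pulls out the first tag and falls back to the bare language, then to a default. The helper names `first_locale_from` and `pick_locale` and the `available`/`default` parameters are made up for illustration; they are not part of Rails or the i18n gem.

```ruby
# Hypothetical helper: returns the first language tag in the header,
# or nil when the header is missing or does not start with a tag.
def first_locale_from(accept_language)
  match_data = accept_language.to_s.match(/\A[a-z]{2}(-[A-Z]{2})?/)
  match_data && match_data[0]
end

# Hypothetical helper: prefer the full tag ("en-US"), then the bare
# language ("en"), then the default, using only supported locales.
def pick_locale(accept_language, available, default)
  tag = first_locale_from(accept_language)
  return default unless tag

  candidates = [tag, tag[0, 2]]
  candidates.find { |c| available.include?(c) } || default
end

header = "en-US,en;q=0.8,zh-TW;q=0.6,zh;q=0.4"
first_locale_from(header)                   # "en-US"
pick_locale(header, ["en", "zh-TW"], "en")  # "en" (no "en-US", so the bare language wins)
pick_locale(nil, ["en", "zh-TW"], "en")     # "en" (missing header, default is used)
```

In a controller you could then pass something like `I18n.available_locales.map(&:to_s)` and `I18n.default_locale.to_s` as the last two arguments. Note this only looks at the first entry of the header and ignores the q-values, same as the snippet above.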