As stated above, ASCII is a subset of Unicode, so the question doesn't quite make sense as-is. If you really want to remove all codepoints below U+0080
from the string, that's easy:
re.sub(r"[\x00-\x7f]+", "", mystring)
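As a quick illustration of what that substitution does, here is a minimal sketch assuming Python 3; the helper name `strip_ascii` and the sample string are made up for this example:

```python
import re

def strip_ascii(text):
    """Remove every codepoint in the range U+0000 to U+007F."""
    return re.sub(r"[\x00-\x7f]+", "", text)

sample = "abc हिन्दी 123"  # hypothetical input
# Spaces, digits and punctuation are ASCII as well, so they are removed too
print(strip_ascii(sample))
```

Note that this strips whitespace along with letters, which is rarely what you want for running text.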
If you want to keep only certain "whitelisted" characters, you need to specify precisely which codepoints to keep.
For example, to keep Devanagari codepoints (used for writing Hindi), you can use
re.sub(r"[^\u0900-\u097F]+", "", mystring)
or, on Python 2, where the pattern has to be a Unicode raw literal (thanks @bobince for the heads-up!):
re.sub(ur"[^\u0900-\u097F]+", "", mystring)
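Because the whitelist is exact, anything you do not list is dropped, including spaces. The sketch below (Python 3) extends the character class with `\s` so that words stay separated; keeping whitespace is an assumption made for this example, and `keep_devanagari` is a hypothetical helper name:

```python
import re

# Whitelist: the Devanagari block (U+0900 to U+097F) plus whitespace.
# Keeping whitespace is an assumption made here so that words stay separated.
KEEP_DEVANAGARI_AND_SPACE = re.compile(r"[^\u0900-\u097F\s]+")

def keep_devanagari(text):
    """Delete every run of characters that is not on the whitelist."""
    return KEEP_DEVANAGARI_AND_SPACE.sub("", text)

sample = "Hello नमस्ते world दुनिया!"  # hypothetical input
# Split on the surviving whitespace to get the individual Hindi words
print(keep_devanagari(sample).split())
```

The same pattern can be widened with further ranges if other characters need to survive.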
You do need to make sure that you're working on a Unicode string, so don't forget to decode your input before matching and to encode the result again for output:
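import re
import urllib2

url = 'http://www.bhaskar.com/'
# Decode the raw bytes to a Unicode string; "utf-8-sig" also drops a leading BOM
data = urllib2.urlopen(url).read().decode("utf-8-sig")
regex = re.compile(ur"[^\u0900-\u097F]+")
hindionly = regex.sub(u"", data)  # remove everything outside the Devanagari block
print hindionly.encode("utf-8")  # encode back to bytes for output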
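For Python 3, the same decode-then-filter step might look roughly like the sketch below. To keep it runnable without network access, it uses a hand-built byte string as a stand-in for what `urllib.request.urlopen(url).read()` would return; the helper name `hindi_only` and the sample bytes are assumptions for illustration only:

```python
import re

DEVANAGARI_ONLY = re.compile(r"[^\u0900-\u097F]+")

def hindi_only(raw_bytes):
    """Decode UTF-8 bytes and keep only Devanagari codepoints."""
    # "utf-8-sig" decodes UTF-8 and drops a leading byte order mark if present
    text = raw_bytes.decode("utf-8-sig")
    return DEVANAGARI_ONLY.sub("", text)

# Stand-in for the bytes returned by urllib.request.urlopen(url).read()
raw = b"\xef\xbb\xbf<p>" + "समाचार".encode("utf-8") + b"</p>"
print(hindi_only(raw))
```

In Python 3, `print` accepts the resulting `str` directly, so the explicit encode step is only needed when writing bytes to a file or socket.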