I would do some analysis across these sources:
- head > title
- the Open Graph meta tag of the page: $('meta[property="og:title"]').attr('content')
- hN (walking down from h1, take the first heading level that occurs exactly once on the page)
Then the "analysis" could be as basic as:
- trimming each candidate
- taking the longest string sequence common to all 3 choices
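
A minimal sketch of that basic analysis, in plain JavaScript. The helper names `longestCommonSubstring` and `guessTitle` are made up for illustration, and I'm assuming you have already pulled the three candidate strings out of the page (with jQuery, cheerio, or whatever you use). The comparison is case-sensitive and brute force, which should be fine for strings as short as titles.

```javascript
// Hypothetical helper: longest substring contained in every string of the array.
// Brute force: try every substring of the shortest string, longest first.
function longestCommonSubstring(strings) {
  if (strings.length === 0) return '';
  const shortest = strings.reduce((a, b) => (b.length < a.length ? b : a));
  for (let len = shortest.length; len > 0; len--) {
    for (let start = 0; start + len <= shortest.length; start++) {
      const candidate = shortest.slice(start, start + len);
      if (strings.every((s) => s.includes(candidate))) return candidate;
    }
  }
  return '';
}

// Hypothetical helper: normalize whitespace, drop missing or empty candidates,
// then keep the part that all remaining candidates share.
function guessTitle(candidates) {
  const cleaned = candidates
    .map((s) => (s || '').replace(/\s+/g, ' ').trim())
    .filter((s) => s.length > 0);
  return longestCommonSubstring(cleaned).trim();
}

// Example with made-up values for <title>, og:title and the heading:
console.log(guessTitle(['My Post | Example Blog', 'My Post', '  My Post \n']));
// prints "My Post"
```

Missing candidates (for example a page without og:title) are simply skipped, so a single remaining candidate is returned as-is. If the candidates share nothing, you get an empty string and may want to fall back to one of them.
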
Or, if you don't mind relying on a SaaS web service, you could have a look at http://www.diffbot.com/.