Algorithm or existing API to clip/summarize article for sharing online (identify title, images and text)? [closed]

StackOverflow https://stackoverflow.com/questions/9249706

Question

I would like to do something similar to what Facebook does when you add or share an article.

For example, when you type the URL of an article or story, Facebook automatically identifies the title, image, etc.

Is there an existing algorithm or standard for this? Is there any commercial or open API that does this?

A related SO question, "How to grab the title + images of a link when posting sharing a link", has a great suggestion: simply find the first <h1> and <img>. However, I was wondering if there is an API that handles situations where the HTML author is not that friendly (e.g. the image is a background image, or the title is not in an <h1> but in an <h2>, or is marked only by a style class). I will test how Facebook handles such pages and update the question.

Solution

Is there an existing Algorithm or standard for this? Is there any commercial or open API that does this?

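Yes. Facebook follows the Open Graph protocol: the page author declares the title, image, and other metadata in <meta> tags in the page's <head>, and Facebook reads those tags when the URL is shared. Read more about the Open Graph protocol at http://ogp.me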
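
As a rough illustration of what reading Open Graph tags could look like, here is a minimal sketch using only the Python standard library. The names OpenGraphParser, extract_open_graph and SAMPLE_HTML are made up for this example, and the sample page is invented; it is not how Facebook itself implements link previews, and it does not cover pages that lack Open Graph tags.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from an HTML page.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep only the first value seen for each property.
            self.properties.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict mapping og:* property names to their content values."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


# Invented sample page; a real page would be downloaded first.
SAMPLE_HTML = """
<html><head>
<title>Example Article | Example Site</title>
<meta property="og:title" content="Example Article" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://example.com/article" />
<meta property="og:image" content="https://example.com/cover.jpg" />
</head><body><h2>Example Article</h2></body></html>
"""

if __name__ == "__main__":
    for key, value in extract_open_graph(SAMPLE_HTML).items():
        print(key, "=", value)
```

For a page without any og:* tags the function returns an empty dict, in which case a heuristic such as the first <h1> and <img> from the related question would still be needed as a fallback.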

Licensed under: CC-BY-SA with attribution