Question

I have code in my page that only runs when this check is true (I can output its value in a comment in the header): isset($_GET['_escaped_fragment_'])

I'm looking at the source under 'what scraper sees' in the Facebook debugger: https://developers.facebook.com/tools/debug

My URL has a #! (shebang) in it.

Still, one of the sites I'm testing receives the _escaped_fragment_ (Facebook visits using ?_escaped_fragment_= in the URL), while on another it doesn't.

I don't think it has anything to do with what's on the page (the og metas), since the crawler has to decide whether or not to rewrite #! to ?_escaped_fragment_= before it even loads the URL.
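For reference, the rewrite in question can be sketched roughly as follows. This is only an illustration of the #! to ?_escaped_fragment_= mapping as described in Google's AJAX crawling scheme, not Facebook's actual implementation. The function name to_escaped_fragment_url and the example.com URLs are made up for the example, and the escaping covers only a handful of characters.

```php
<?php
// Hypothetical helper (name is an assumption, not part of any library):
// maps a #! URL to its ?_escaped_fragment_= form.
function to_escaped_fragment_url(string $url): string
{
    $pos = strpos($url, '#!');
    if ($pos === false) {
        return $url; // no shebang, nothing to rewrite
    }

    $base = substr($url, 0, $pos);       // everything before "#!"
    $fragment = substr($url, $pos + 2);  // everything after "#!"

    // Append to an existing query string with "&", otherwise start one with "?".
    $separator = (strpos($base, '?') === false) ? '?' : '&';

    // Escape a few characters that would otherwise break the query string.
    // strtr() does not re-scan its own output, so "%" is not double-escaped.
    $escaped = strtr($fragment, [
        '%' => '%25',
        '#' => '%23',
        '&' => '%26',
        '+' => '%2B',
        ' ' => '%20',
    ]);

    return $base . $separator . '_escaped_fragment_=' . $escaped;
}
```

With this sketch, to_escaped_fragment_url('http://example.com/page#!/about') gives 'http://example.com/page?_escaped_fragment_=/about', which is the kind of request the server-side isset() check is waiting for.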

Can someone enlighten me as to what's required to make this feature work?


Solution

It IS because of the og:url meta tag / link rel=canonical. I've found that Facebook's 'what scraper sees' view shows you the final result, not the 'first crawl' that you'd expect. The FB crawler loads the page and sees a meta tag with og:url or, most importantly, the link rel=canonical. It then stops crawling that page and goes to the URL specified there instead. What it presents to you is the source of that URL, and that one doesn't have the shebang in it, so no _escaped_fragment_ is ever sent.

It's all logical, but I hadn't counted on it doing this 'hidden redirect' or bounce behind the scenes. The solution is to filter out / remove the og:url meta tag and the link rel=canonical from the head, that's about it. Several WP plugins add these, by the way.
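One way to do that filtering could look like the sketch below. It is a rough illustration rather than a tested drop-in: strip_canonical_tags is a hypothetical helper name, the regular expressions only cover the usual attribute layouts, and it assumes the snippet runs before any output is sent so that output buffering can catch the head.

```php
<?php
// Hypothetical helper (name is an assumption): removes
// <meta property="og:url" ...> and <link rel="canonical" ...> from HTML.
function strip_canonical_tags(string $html): string
{
    $patterns = [
        // <meta ... property="og:url" ...>
        '/<meta\b[^>]*\bproperty\s*=\s*(["\'])og:url\1[^>]*>\s*/i',
        // <link ... rel="canonical" ...>
        '/<link\b[^>]*\brel\s*=\s*(["\'])canonical\1[^>]*>\s*/i',
    ];

    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace($patterns, '', $html) ?? $html;
}

// Only filter the page when it is requested in its escaped-fragment form.
if (isset($_GET['_escaped_fragment_'])) {
    ob_start('strip_canonical_tags');
}
```

On WordPress, the canonical link that core itself prints can also be dropped with remove_action('wp_head', 'rel_canonical'), but tags added by plugins usually have their own hooks, which is why the sketch filters the final output instead.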

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow