Question

Can search engines such as Google index JavaScript-generated web pages? When you right-click and select View Source on a page that is generated by JavaScript (e.g. using GWT), you do not see the dynamically generated HTML. I suppose that if a search engine also cannot see the generated HTML, then there is not much to index, right?

Solution

Your suspicion is correct: JS-generated content cannot be relied on to be visible to search bots. It also can't be seen by anyone with JS turned off. The last time I added tests for this to a site I was working on (a large, mainstream-audience site with hundreds of thousands of unique visitors per month), approximately 10% of users were not running JavaScript in any form. That includes search bots, PC browsers with JS disabled, many mobile devices, blind people using screen readers, etc.

This is why content generated via JS (with no fallback option) is a Really Bad Idea.

Back to basics. First, create your site using bare-bones (X)HTML, on REST-like principles (at least to the extent of requiring POST requests for state changes). Use simple semantic markup, and forget about CSS and JavaScript for now.

Step one is to get that right, and have your entire site (or as much of it as makes sense) working nicely this way for search bots and Lynx-like user agents.
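
For example, here is a minimal sketch of that first layer (the page, form action, and field names are hypothetical): a plain HTML page whose only state change goes through an ordinary POST form, so it needs no CSS or JavaScript to work.

    <!DOCTYPE html>
    <html>
      <head>
        <title>Add a comment</title>
      </head>
      <body>
        <h1>Add a comment</h1>
        <!-- Plain semantic markup: the state change goes through POST, no script required -->
        <form id="comment-form" action="/comments" method="post">
          <label for="comment">Your comment</label>
          <textarea id="comment" name="comment"></textarea>
          <input type="submit" value="Post comment">
        </form>
      </body>
    </html>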

Then add a visual layer: CSS/graphics/media for visual polish, but don't significantly change your original (X)HTML markup; allow the original text-only site to stay intact and functioning. Keep your markup clean!

Third, add a behavioural layer: JavaScript (Ajax). Offer things that make the experience faster, smoother, and nicer for users/browsers with Ajax-capable JS... but only for those users. Users without JavaScript are still welcome, and so are search bots, the visually impaired, many mobile devices, etc.
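
As a minimal sketch of that behavioural layer, assuming the hypothetical comment form from the earlier example: plain JavaScript intercepts the submit and sends it via Ajax, while users without JavaScript simply fall through to the normal full-page POST.

    // Runs only if JavaScript is available; without it the form posts normally.
    var form = document.getElementById('comment-form');
    if (form && window.XMLHttpRequest) {
      form.onsubmit = function () {
        var xhr = new XMLHttpRequest();
        xhr.open('POST', form.action, true);
        xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
        xhr.onreadystatechange = function () {
          if (xhr.readyState === 4 && xhr.status === 200) {
            // Show an in-page confirmation instead of reloading the page.
            var note = document.createElement('p');
            note.appendChild(document.createTextNode('Comment posted.'));
            form.parentNode.insertBefore(note, form);
          }
        };
        xhr.send('comment=' + encodeURIComponent(form.comment.value));
        return false; // cancel the normal submit only when the Ajax path is taken
      };
    }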

This is called progressive enhancement in web design circles. Do it this way and your site works, in some reasonable form, for everyone.

OTHER TIPS

if a search engine also cannot see the generated HTML then there is not much to index

That about sums it up. Technically nothing is stopping a search engine from implementing a JavaScript engine for its bot/spider, but it's just not normally done. They could, but they won't.

On the other hand, you can sniff a search engine's user agent and serve it something more readable. But search engines don't usually like this (it is known as cloaking) and will penalize you pretty severely if they detect differences from what you send to a normal browser.
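
For illustration only, and bearing in mind the cloaking penalty just mentioned, a server could inspect the User-Agent header and hand known crawlers a pre-rendered HTML snapshot. A minimal Node.js sketch follows; the bot pattern and file paths are assumptions, not a recommendation:

    // Serve a static snapshot to known crawlers, the JS-driven page to everyone else.
    // If the snapshot differs from what real users see, this is cloaking and
    // search engines may penalize the site for it.
    var http = require('http');
    var fs = require('fs');

    var BOT_PATTERN = /googlebot|bingbot|slurp/i; // hypothetical bot list

    http.createServer(function (req, res) {
      var ua = req.headers['user-agent'] || '';
      var file = BOT_PATTERN.test(ua) ? 'snapshots/index.html' // pre-rendered HTML
                                      : 'app/index.html';      // JavaScript-driven page
      fs.readFile(file, function (err, html) {
        res.writeHead(err ? 500 : 200, { 'Content-Type': 'text/html' });
        res.end(err ? 'Internal error' : html);
      });
    }).listen(8080);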

A good rule of thumb: if you can see it in Lynx, it can be indexed by Google.

Lynx is an excellent test because it also gives you an idea of how screen readers for the blind will see your page.
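
A quick way to run that check from the command line is Lynx's dump mode, which prints the text rendering of a page to standard output (the URL is a placeholder):

    lynx -dump http://www.example.com/

If nothing useful comes out, there is probably nothing for a crawler to index either.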

Yes, Google (and most likely Bing) will index dynamically generated HTML. See more details here: http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157.

Google is working on executing simple JavaScript to uncover some content, but it certainly doesn't execute full scripts. If you are worried about SEO, then you need to consider providing static versions of your pages.

There are a few ways to handle this in GWT; this is a great discussion on the subject. It seems like the best option is to serve up static SEO content when the user agent is a bot, as long as the SEO content is identical to what is served via the GWT route. This can be a lot of work, but if you really want a fully rich GWT app that is optimized for search engines, it may be worth it.

Take a look at the Single Page Interface Manifesto for how an SPI (Ajax-intensive) application can get indexed by Google and other crawlers. How hard it is depends on the web framework used.

Even if crawlers execute basic JavaScript, most websites use libraries and frameworks loaded from separate JS files. I don't think a bot like Googlebot or any other spider will also download the JS files linked from the web page, and without loading them the JS code will produce errors.

(Correct me if I am wrong.)