Question

I'm developing a RAP web application and want to make it accessible to robots like Google (I'm only referring to Google here, but I think other search engines work similarly). The web app contains dynamic content which I load from a database, depending on what the user is searching for. How do I make this content available to Google? I've read Google's guide for AJAX crawling, but don't know how to apply it to RAP:

  • RAP makes the AJAX calls 'internally'. Can I use them for Google, and if so, how?
  • RAP is a single-page application; how should I provide an XML sitemap to Google?

Thanks in advance!


Solution

The idea of Ajax is that an application doesn't load new pages all the time, but loads new pieces of content using Ajax requests in the background. To provide “deep links” into your application anyway, you need URLs that contain a fragment part, such as example.com/myapp#mystate. This trick is used because a browser does not reload the page when only the fragment part of the URL changes.

This is no different with RAP. To deal with this kind of URL, RWT provides a browser history API. When the state of your application changes, e.g. when the user selects a tab or starts a search, you can add a new entry to the browser history, which effectively changes the fragment part of the URL in the browser:

RWT.getBrowserHistory().createEntry( "!mystate", "Example" );

This changes the URL to example.com/app/entrypoint#!mystate (the “deep link” to this state) and adds an entry named "Example" to the browser history, so the user can later return to this state with the browser's back button.
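As a minimal sketch of where such a call could be made, assuming a hypothetical search button and a showSearchResults() helper (neither is part of RAP's API; they just stand in for your own UI code):

searchButton.addSelectionListener( new SelectionAdapter() {
  @Override
  public void widgetSelected( SelectionEvent e ) {
    showSearchResults();  // hypothetical: update the UI to the new state first
    // then record the state, which updates the URL fragment to #!search
    RWT.getBrowserHistory().createEntry( "!search", "Search" );
  }
} );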

To be able to react to changes of the URL, you have to add a listener to the browser history. This listener is notified every time the fragment part changes, including when the application is started with a fragment (i.e. someone follows a deep link). Your application is then responsible for restoring the state that is represented by this fragment.

RWT.getBrowserHistory().addBrowserHistoryListener( new BrowserHistoryListener() {
  public void navigated( BrowserHistoryEvent event ) {
    // show state represented by event.entryId
  }
} );
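Inside the listener, you would typically dispatch on the entry id. A sketch with hypothetical state ids and helper methods:

public void navigated( BrowserHistoryEvent event ) {
  // event.entryId holds the fragment without the '#', e.g. "!search"
  if( "!search".equals( event.entryId ) ) {
    showSearchResults();  // hypothetical helper
  } else if( "!mystate".equals( event.entryId ) ) {
    showMyState();        // hypothetical helper
  }
}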

An example of a RAP application that uses fragment URLs for different “sub-pages” is the RAP examples demo.

The rest of the story is explained in Google's AJAX crawling guide. Your entry ids have to start with a ! to produce hash-bang URLs with a fragment like #!mystate, which marks the state as crawlable. These full URLs (e.g. example.com/app/entrypoint#!mystate) are the ones you should add to your sitemap. To feed the crawlers, you could implement a servlet filter that catches requests matching the URL pattern ?_escaped_fragment_=mystate (the crawler's rewritten form of #!mystate) and returns an HTML representation of that particular state.
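A hedged sketch of such a filter, assuming snapshots are pre-rendered HTML files under /snapshots/ (the class name, the snapshot location, and the sanitizing rule are all assumptions, not part of RAP or the original answer):

import java.io.IOException;
import javax.servlet.*;

public class EscapedFragmentFilter implements Filter {

  public void doFilter( ServletRequest request, ServletResponse response, FilterChain chain )
      throws IOException, ServletException {
    String fragment = request.getParameter( "_escaped_fragment_" );
    if( fragment != null ) {
      // The crawler rewrote #!mystate to ?_escaped_fragment_=mystate.
      // Serve a static HTML snapshot of that state instead of the RAP client.
      String safe = fragment.replaceAll( "[^a-zA-Z0-9_-]", "" ); // avoid path injection
      request.getRequestDispatcher( "/snapshots/" + safe + ".html" ).forward( request, response );
    } else {
      chain.doFilter( request, response );
    }
  }

  public void init( FilterConfig config ) { }
  public void destroy() { }
}

You would register this filter (e.g. in web.xml) so that it runs in front of the RAP entry point; how the snapshots themselves are produced (pre-rendered files, a headless browser such as HtmlUnit, etc.) is up to you.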
