Question

After setting up my Google Webmaster account and verifying my website, I was unable to add my sitemap to it. It reported the following error:

[Screenshot of the error message; image not included]

I tried to do the following:

  1. I removed robots.txt, but it still didn't work.
  2. I validated my sitemap at http://www.validome.org/google/validate, and it was reported as valid.
  3. I checked the sitemap and my URL several times for errors, and everything seemed to be all right.

For Reference:

My sitemap.xml is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
  <loc>http://www.myDomain.com/</loc>
  <changefreq>daily</changefreq>
  <priority>1.00</priority>
</url>
<url>
  <loc>http://www.myDomain.com/about/</loc>
  <changefreq>daily</changefreq>
  <priority>0.90</priority>
</url>
<url>
  <loc>http://www.myDomain.com/help.php</loc>
  <changefreq>daily</changefreq>
  <priority>0.90</priority>
</url>
</urlset>
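
A quick local sanity check of a sitemap like the one above can complement an online validator. The following is only a rough sketch, not a full schema validation: the function name `check_sitemap`, the `SAMPLE` string, and the particular checks (root element, absolute `<loc>`, allowed `<changefreq>` values, `<priority>` between 0.0 and 1.0) are illustrative choices based on the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}

# Shortened copy of the sitemap from the question (illustrative sample data)
SAMPLE = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
  <loc>http://www.myDomain.com/</loc>
  <changefreq>daily</changefreq>
  <priority>1.00</priority>
</url>
<url>
  <loc>http://www.myDomain.com/about/</loc>
  <changefreq>daily</changefreq>
  <priority>0.90</priority>
</url>
</urlset>
"""


def check_sitemap(xml_text):
    """Return a list of problems found in a sitemap document (empty if none)."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    for i, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        # <loc> is required and must be an absolute URL
        loc = (url.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        if not loc.startswith(("http://", "https://")):
            problems.append(f"entry {i}: missing or non-absolute <loc>")

        # <changefreq> is optional, but limited to a fixed set of values
        freq = url.findtext(f"{{{SITEMAP_NS}}}changefreq")
        if freq is not None and freq.strip() not in VALID_CHANGEFREQ:
            problems.append(f"entry {i}: invalid <changefreq> {freq!r}")

        # <priority> is optional and must lie between 0.0 and 1.0
        prio = url.findtext(f"{{{SITEMAP_NS}}}priority")
        if prio is not None:
            try:
                value = float(prio)
            except ValueError:
                value = -1.0
            if not 0.0 <= value <= 1.0:
                problems.append(f"entry {i}: invalid <priority> {prio!r}")

    return problems


if __name__ == "__main__":
    print(check_sitemap(SAMPLE) or "no problems found")
```

An empty result only means the file passes these few checks; it says nothing about whether Google can fetch it.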

My robots.txt is as follows:

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /

User-agent: *
Disallow: /cgi-bin/

Sitemap: http://www.myDomain.com/sitemap.xml
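
To rule out robots.txt as the cause, the rules above can be evaluated locally. This is a small sketch using Python's standard `urllib.robotparser`; the helper name `load_rules` and the user-agent string `Googlebot` used for the check are assumptions for illustration, and the parser's matching is not guaranteed to be identical to Google's.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question, copied verbatim
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /

User-agent: *
Disallow: /cgi-bin/

Sitemap: http://www.myDomain.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    sitemap_url = "http://www.myDomain.com/sitemap.xml"
    # The sitemap is not under /cgi-bin/, so the catch-all group should allow it
    print("Googlebot may fetch sitemap:", rules.can_fetch("Googlebot", sitemap_url))
    # site_maps() requires Python 3.8 or later
    print("Declared sitemaps:", rules.site_maps())
```

If the first check comes back true, these rules are not what keeps a general crawler away from the sitemap.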

Solution 3

It turned out the problem was on Google's servers. I didn't change anything; I just left the whole thing alone for a week and tried again. Everything seems to be working fine now.

Sometimes, because of major updates or busy Google servers, accepting a sitemap takes longer than usual. If you are facing a similar problem, wait a few days and try again before complaining in the Google forums and elsewhere.

OTHER TIPS

Here are some useful references on .htaccess rules for blocking bots:

http://www.widexl.com/tutorials/htaccess.html

http://www.howtoforge.com/forums/showthread.php?t=27809

If you want something really effective for your needs:

  • Replace your robots.txt rules with these:

    User-agent: *
    Allow: /
    Sitemap: http://www.myDomain.com/sitemap.xml
    
  • And add this to your .htaccess:

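    # Block ia_archiver and duggmirror (requires mod_rewrite)
        RewriteEngine On
        RewriteCond %{HTTP_USER_AGENT} (ia_archiver|duggmirror) [NC]
        RewriteRule .* - [F]
    
    
    # Block direct access to php.ini, php.cgi, php5.ini and php5.cgi
    # (Order/Deny/Allow is Apache 2.2 syntax; Apache 2.4 needs mod_access_compat for it)
        <FilesMatch "^php5?\.(ini|cgi)$">
            Order Deny,Allow
            Deny from All
            Allow from env=REDIRECT_STATUS
        </FilesMatch>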
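
A user-agent condition like the one in the rewrite rule can be tried out before it is deployed. The sketch below uses Python's `re` module as a stand-in for Apache's regex engine, which is a reasonable approximation only for a pattern this simple. It assumes the intent is to match either bot name anywhere in the User-Agent header, ignoring case as the `[NC]` flag does; `BLOCKED_AGENTS` and `is_blocked` are hypothetical names.

```python
import re

# Assumed intent: either bot name, anywhere in the header, case-insensitive ([NC])
BLOCKED_AGENTS = re.compile(r"ia_archiver|duggmirror", re.IGNORECASE)


def is_blocked(user_agent):
    """Return True if the User-Agent header would trigger the forbidden ([F]) rule."""
    return BLOCKED_AGENTS.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "ia_archiver (+http://www.alexa.com/site/help/webmasters)",
        "DuggMirror",
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    ]
    for agent in samples:
        print(f"{agent!r}: blocked={is_blocked(agent)}")
```

The point of the check is to confirm that Googlebot's user agent does not match, so the rule cannot be what keeps Google from reading the sitemap.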
    
Licensed under: CC-BY-SA with attribution