Bingbot ignoring robots.txt and attempting to retrieve a trafficbasedsspsitemap.xml [closed]

StackOverflow https://stackoverflow.com/questions/15796035

  •  01-04-2022

Question

I have an app whose content should not be publicly indexed. I've therefore disallowed access to all crawlers.

robots.txt:

# Robots shouldn't index a private app.
User-agent: *
Disallow: /

However, Bing has been ignoring this and daily requests a /trafficbasedsspsitemap.xml file, which I have no need to create.

I also have no need to receive daily 404 error notifications for this file. I'd like to just make the bingbot go away, so what do I need to do to forbid it from making requests?

Solution

According to this answer, this is Bingbot checking for an XML sitemap generated by the Bing Sitemap Plugin for IIS and Apache. It apparently cannot be blocked by robots.txt.

OTHER TIPS

For those coming from Google:

You could block bots with Apache user-agent detection and rewrite directives, which would let you keep Bingbot out entirely. See https://superuser.com/questions/330671/wildcard-blocking-of-bots-in-apache
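A minimal sketch of that approach, assuming the app is served by Apache with mod_rewrite enabled and that the crawler identifies itself with "bingbot" somewhere in its User-Agent header (neither assumption is confirmed by the question). It could go in the virtual host configuration or in an .htaccess file at the site root:

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match any request whose User-Agent contains "bingbot" (case-insensitive).
    RewriteCond %{HTTP_USER_AGENT} bingbot [NC]

    # Leave the URL untouched ("-") and answer 403 Forbidden ([F]).
    RewriteRule ^ - [F]
</IfModule>
```

This does not stop the bot from sending requests. It only changes the response from 404 to 403, so whether the daily notifications go away depends on how your error reporting treats 403 responses.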

Block all bots/crawlers/spiders for a special directory with htaccess

etc.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow