Bingbot ignoring robots.txt and attempting to retrieve a trafficbasedsspsitemap.xml [closed]

https://stackoverflow.com/questions/15796035

01-04-2022

Question

I have an app whose content should not be publicly indexed. I've therefore disallowed access to all crawlers.

robots.txt:

# Robots shouldn't index a private app.
User-agent: *
Disallow: /

However, Bing has been ignoring this and requests a /trafficbasedsspsitemap.xml file every day, which I have no need to create.

I also have no need to receive daily 404 error notifications for this file. I'd like to just make the bingbot go away, so what do I need to do to forbid it from making requests?

Solution

According to this answer, this is Bingbot checking for an XML sitemap generated by the Bing Sitemap Plugin for IIS and Apache. It apparently cannot be blocked by robots.txt.

Other tips

For those coming from Google:

You could block bots with Apache user-agent detection and rewrite directives, which would let you keep Bingbot out entirely. See https://superuser.com/questions/330671/wildcard-blocking-of-bots-in-apache and the related question "Block all bots/crawlers/spiders for a special directory with htaccess".
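
A minimal sketch of the user-agent approach, assuming Apache with mod_rewrite enabled and assuming that a case-insensitive match on the substring "bingbot" in the User-Agent header is enough to identify the crawler (neither assumption is confirmed in this thread). Placed in an .htaccess file or a virtual host, it would answer matching requests with 403 Forbidden:

```apache
# Assumption: mod_rewrite is available; the IfModule guard skips the rules otherwise.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match any request whose User-Agent contains "bingbot" (NC = case-insensitive).
    RewriteCond %{HTTP_USER_AGENT} bingbot [NC]

    # Leave the URL unchanged ("-") and refuse the request with 403 Forbidden.
    RewriteRule ^ - [F]
</IfModule>
```

Note that this does not stop Bingbot from sending requests; it only changes the response from 404 to 403, which may be enough to silence notifications that are triggered by 404 errors.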

Licensed under: CC-BY-SA with attribution

Not affiliated with Stack Overflow