The * character in the Disallow statement of the Robots.txt File

https://stackoverflow.com/questions/12143241

28-06-2021

Question

How do different search bots interpret the * character in the Disallow statement of the robots.txt file? Do all of them treat it as "zero, one, or more than one character"?

Let's take the following example:

User-agent: *
Disallow: /back-end*/*

What does the above code mean? Does it mean that any directory whose name contains "back-end" won't be indexed, even if "back-end" is followed by any set of characters? And what about the * after the /? Is it good practice to write it?

Generally speaking, my question is about the usage of * in the Disallow statement and whether all search engine crawlers treat it the same way.

Answer

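The Robots Exclusion Standard does not mention the * character in the Disallow: statement. Some crawlers, such as Googlebot and Slurp, recognize strings containing * as wildcard patterns, while MSNbot and Teoma interpret it in different ways. You therefore cannot assume that every crawler will read a rule like the one in the question the same way.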
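To make the difference concrete, here is a minimal sketch of the two readings a crawler might apply to the rule in the question. It is not taken from any crawler's source. The function names are hypothetical, and the wildcard behaviour (each * matches zero or more characters, and the rule is matched as a prefix of the URL path) is an assumption modelled on how wildcard-aware crawlers are commonly described. The literal version follows the original standard, which only says that a URL starting with the Disallow value is not retrieved.

```python
import re


def is_disallowed_wildcard(path: str, pattern: str) -> bool:
    """Assumed wildcard reading: '*' matches zero or more characters."""
    # Escape every literal piece, then join the pieces with '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # re.match anchors at the start only, which gives prefix matching.
    return re.match(regex, path) is not None


def is_disallowed_literal(path: str, pattern: str) -> bool:
    """Original-standard reading: plain prefix match, '*' is an ordinary character."""
    return path.startswith(pattern)


if __name__ == "__main__":
    rule = "/back-end*/*"
    for path in ["/back-end/index.php", "/back-end-old/x", "/back-end", "/front/back-end/"]:
        print(path, is_disallowed_wildcard(path, rule), is_disallowed_literal(path, rule))
```

Under the wildcard reading, /back-end/index.php and /back-end-old/x are blocked, while /back-end (no trailing slash) and /front/back-end/ are not. Under the literal reading none of them are blocked, because no path actually contains the characters "*/*". In this sketch the trailing * changes nothing: /back-end*/ blocks exactly the same paths as /back-end*/*, since the rule is already matched as a prefix.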

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow