Question

I am trying to serve a robots.txt using the Perl Dancer web framework. I thought having a route that just returned the text would work, but it seems to be wrapping it in html and body tags. I'm assuming this won't be interpreted properly as a robots.txt file by crawlers.

Any idea how to do this properly?

Here is how I have the route written:

get '/robots.txt' => sub { return "User-agent: *\nDisallow: /"; };

Thanks in advance!

Solution

What makes you think it's being wrapped in HTML and BODY elements?

use Dancer;

get '/robots.txt' => sub {
   return "User-agent: *\nDisallow: /\n";
};

dance;

Output:

>lwp-request -e http://127.0.0.1:3000/robots.txt
200 OK
Server: Perl Dancer 1.3112
Content-Length: 26
Content-Type: text/html
Client-Date: Mon, 29 Apr 2013 05:05:32 GMT
Client-Peer: 127.0.0.1:3000
Client-Response-Num: 1
X-Powered-By: Perl Dancer 1.3112

User-agent: *
Disallow: /

I bet you're viewing it with a client whose renderer adds those elements when it sees the Content-Type header of text/html. Setting the content type to text/plain would be more appropriate, and the file will then display as plain text in the client you are using to view it.

get '/robots.txt' => sub {
   content_type 'text/plain';
   return "User-agent: *\nDisallow: /\n";
};

Ultimately, though, the content type shouldn't have any effect on crawlers: the response body is the same either way and contains no HTML tags.
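
If you want to confirm what the route really sends without a browser in the way, a small test script is one option. The following is only a sketch: it assumes Dancer 1.x with its bundled Dancer::Test module, and it redefines the route inside the test file for illustration rather than loading your real app.

```perl
use strict;
use warnings;
use Test::More;

# ':tests' imports the Dancer syntax without starting a server.
use Dancer ':tests';
use Dancer::Test;

# Same route as above, repeated here so the script is self-contained.
get '/robots.txt' => sub {
    content_type 'text/plain';
    return "User-agent: *\nDisallow: /\n";
};

my $req = [ GET => '/robots.txt' ];

# The route responds successfully.
response_status_is $req, 200;

# The body is exactly the text returned by the route.
response_content_is $req, "User-agent: *\nDisallow: /\n";

# No HTML wrapper is present in the raw response.
response_content_unlike $req, qr/<html|<body/i;

# The header is what content_type set.
response_headers_include $req, [ 'Content-Type' => 'text/plain' ];

done_testing;
```

Run it with perl or prove. If the content checks pass, any html and body elements you see are coming from the viewing client, not from Dancer.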

OTHER TIPS

The other option for sending robots.txt would be to not define a route for it and instead put an actual robots.txt file into the public/ subdirectory under your main Dancer app directory. Dancer will then serve it automatically as a regular file without passing it through the route handlers, templates, etc.
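
As a rough sketch of that layout, the script below writes the file into public/ using only core Perl modules. The $app_dir variable is an assumption for illustration: it should point at the directory that holds your Dancer app, and here it defaults to the current directory.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;

# Hypothetical: set this to your Dancer app's root directory.
my $app_dir = '.';

# Static files live in the public/ subdirectory of the app.
my $public = File::Spec->catdir($app_dir, 'public');
make_path($public);

my $file = File::Spec->catfile($public, 'robots.txt');

# Write the same rules the route was returning.
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} "User-agent: *\nDisallow: /\n";
close $fh or die "Cannot close $file: $!";

print "Wrote $file\n";
```

With the file in place, remove the '/robots.txt' route so that requests fall through to the static file.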

You are serving the response as text/html (the default). The elements are being inserted by the browser as part of the normal process of parsing HTML (and you are looking at a representation of the live DOM rather than the source code).

Set the correct content-type header.

get '/robots.txt' => sub {
  content_type "text/plain";
  return "User-agent: *\nDisallow: /";
};
Licensed under: CC-BY-SA with attribution