Question

I'm running a WordPress blog on Nginx and Varnish. I'm using the following configuration for Varnish:

# This is a basic VCL configuration file for varnish.  See the vcl(7)
# man page for details on VCL syntax and semantics.
# 
# Default backend definition.  Set this to point to your content
# server.
# 
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .connect_timeout = 600s;
    .first_byte_timeout = 600s;
    .between_bytes_timeout = 600s;
    .max_connections = 800;
}


acl purge {
        "localhost";
}

sub vcl_recv {
    set req.grace = 2m;

    # Set X-Forwarded-For header for logging in nginx
    remove req.http.X-Forwarded-For;
    set    req.http.X-Forwarded-For = client.ip;

    # Remove has_js and CloudFlare/Google Analytics __* cookies.
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_[_a-z]+|has_js)=[^;]*", "");
    # Remove a ";" prefix, if present.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

    # Either the admin pages or the login
    if (req.url ~ "/wp-(login|admin|cron)") {
        # Don't cache, pass to backend
        return (pass);
    }

    # Remove the wp-settings-1 cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", "");

    # Remove the wp-settings-time-1 cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", "");

    # Remove the wp test cookie
    set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", "");

    # Static content unique to the theme can be cached (so no user uploaded images)
    # The reason I don't take the wp-content/uploads is because of cache size on bigger blogs
    # that would fill up with all those files getting pushed into cache
    if (req.url ~ "wp-content/themes/" && req.url ~ "\.(css|js|png|gif|jp(e)?g)") {
        unset req.http.cookie;
    }

    # Even if no cookies are present, I don't want my "uploads" to be cached due to their potential size
    if (req.url ~ "/wp-content/uploads/") {
        return (pass);
    }

    # Check the cookies for wordpress-specific items
    if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") {
        # A wordpress specific cookie has been set
        return (pass);
    }

    # allow PURGE from localhost
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }

    # Bypass the cache (pass) if the request is a no-cache request from the client
    if (req.http.Cache-Control ~ "no-cache") {
        return (pass);
    }

    # Try a cache-lookup
    return (lookup);
}

sub vcl_fetch {
    #set obj.grace = 5m;
    set beresp.grace = 2m;

}

sub vcl_hit {
        if (req.request == "PURGE") {
                purge;
                error 200 "Purged.";
        }
}

sub vcl_miss {
        if (req.request == "PURGE") {
                purge;
                error 200 "Purged.";
        }
}

I've followed the tutorial mentioned here

Everything works fine. I'm using the Yoast SEO plugin to regenerate the sitemap dynamically after every new post. It generates a sitemap index named sitemap_index.xml that references the other sitemaps (for posts, pages, authors, etc.). This also works fine.

  1. How can I prevent Varnish from caching my sitemaps?
  2. How can I prevent Varnish from interfering with Google Analytics? It shouldn't stop GA from giving me correct reports.

I'm new to Varnish. Can someone please guide me on how to modify the config? :( Please help.

UPDATE:

Will it work if I include the following in sub vcl_recv?

if (req.url ~ "\.xml(\.gz)?$") {
   return (pass);
}

Solution

PLEASE remove these lines!!

if (req.url ~ "\.xml(\.gz)?$") {
   return (pass);
}

Returning (pass) is a workaround, but it's not how you want to use Varnish. Varnish is there to cache pages and content like sitemap_index.xml.

You have already implemented a PURGE mechanism in your VCL, so the simplest way to handle your sitemap_index.xml issue is to PURGE it!

The basic principle is that sitemap_index.xml should stay cached as long as no new post has been published. Then, every time a new post is created, you have to tell Varnish that sitemap_index.xml is no longer valid by sending the HTTP request below (pasted from the official documentation (1)):

PURGE /sitemap_index.xml HTTP/1.0
Host: example.com

So I guess you have a choice: either edit your module manually, or use the Varnish HTTP Purge WordPress plugin (2), which you will probably have to hack manually as well.

  1. https://www.varnish-cache.org/docs/3.0/tutorial/purging.html#http-purges

  2. http://wordpress.org/plugins/varnish-http-purge/
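
As a complement to purging, a TTL cap can limit how long a stale sitemap survives if a PURGE request is ever missed. This is only a minimal sketch in Varnish 3 syntax, to match the config in the question. The URL pattern and the one-hour value are assumptions, not part of the original answer, and the `if` block would need to be merged into the existing vcl_fetch.

```vcl
sub vcl_fetch {
    set beresp.grace = 2m;

    # Assumption: every sitemap file name contains "sitemap" and ends
    # in .xml or .xml.gz (e.g. sitemap_index.xml).
    # Hypothetical safety net: keep sitemaps cached, but for at most
    # one hour, so a missed PURGE only leaves a stale copy for a
    # bounded time.
    if (req.url ~ "sitemap[^/]*\.xml(\.gz)?$") {
        set beresp.ttl = 1h;
    }
}
```

Purging stays the primary mechanism here; the TTL only bounds the damage when a purge does not arrive.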

OTHER TIPS

Will it work if I include the following into sub vcl_recv

if (req.url ~ "\.xml(\.gz)?$") {
   return (pass);
}

This will work. Place it near the top of the function. Keep in mind, though, that it will prevent caching of all .xml and .xml.gz files. Granted, most of the .xml and .xml.gz files you serve are probably sitemaps, but it is still worth considering in case some are not.
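
If only the sitemaps should bypass the cache, the match can be narrowed. The following is a rough sketch in Varnish 3 syntax, assuming that the sitemap file names all contain "sitemap" (as sitemap_index.xml does) and that they are requested without a query string; check the pattern against the URLs your plugin actually generates.

```vcl
sub vcl_recv {
    # Assumption: sitemap files are named like sitemap_index.xml or
    # <something>-sitemap.xml, optionally gzipped. Other .xml files
    # fall through and remain cacheable.
    if (req.url ~ "sitemap[^/]*\.xml(\.gz)?$") {
        return (pass);
    }

    # ... rest of the existing vcl_recv logic ...
}
```

Because the pattern is anchored with `$`, a request such as /sitemap_index.xml?foo=bar would not match and would go through the normal lookup path.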

I can't give you the exact syntax, but you should pipe* the request for the sitemap.

*pipe: match the request in your VCL and always direct it to the server, so it is fetched from there every time.
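
For reference, a sketch of what that could look like in Varnish 3 syntax. The URL pattern is an assumption, not something given in the answer. Note that `return (pipe)` makes Varnish shuffle bytes between client and backend without looking at the response, whereas `return (pass)` fetches from the backend as a normal HTTP request without caching it, which is closer to the behaviour described above.

```vcl
sub vcl_recv {
    # Assumption: sitemap file names contain "sitemap" and end in
    # .xml or .xml.gz.
    if (req.url ~ "sitemap[^/]*\.xml(\.gz)?$") {
        # Hand the connection straight to the backend.
        # Swap in "return (pass);" to bypass the cache while still
        # letting Varnish process the request and response.
        return (pipe);
    }
}
```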

Licensed under: CC-BY-SA with attribution