April 12, 2010

The EPiServer CMS Output Cache Explained

A while back I wrote a post explaining how EPiServer CMS caches PageData objects. Following that post I’ve gotten a few questions regarding how EPiServer CMS’s output cache works. That’s something I hadn’t given much thought to before, so I decided to investigate it. First of all, let’s define what we’re talking about. EPiServer CMS offers two types of caching to speed up sites and lower the load on the database server: object caching and output caching.

Output caching and object caching

By object caching, or data caching as some like to call it, I mean caching of objects, such as instances of PageData, PageType and just about any other class. On EPiServer sites our primary concern is how EPiServer caches calls made to methods in the DataFactory class, such as GetPage() and GetChildren(). To read up on that you should take a look at the aforementioned post. By default, when installing a new EPiServer CMS site with the public templates, this type of caching is turned on.

By output caching I mean caching of fully processed pages or user controls. In other words the HTML content that an ASPX or ASCX has generated once it has been rendered. When installing a new EPiServer CMS site with the public templates this type of caching is turned off.

ASP.NET’s caching functionality

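ASP.NET has built-in support for both types of caching. For object caching we can programmatically cache objects using the Cache class. To output cache an ASPX page we can add an OutputCache directive to it. The line below will ensure that the output from the page is cached for 60 seconds (the Duration attribute is specified in seconds). The VaryByParam attribute is required when the directive is used on an ASPX page, so it is set to “None” here.

<%@ OutputCache Duration="60" VaryByParam="None" %>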
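For the object caching half of the paragraph above, a minimal sketch of using the plain ASP.NET Cache class could look like the following. The class name, cache key, five minute lifetime and the BuildGreeting method are made up for illustration. This is plain ASP.NET, not EPiServer’s own caching API.

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class ObjectCacheSketch
{
    // Hypothetical cache key, chosen only for this example.
    private const string Key = "Sketch.Greeting";

    public static string GetGreeting()
    {
        // Try the cache first; the cast yields null on a miss.
        string cached = HttpRuntime.Cache[Key] as string;
        if (cached != null)
        {
            return cached;
        }

        // Cache miss: build the value and store it for five minutes.
        string value = BuildGreeting();
        HttpRuntime.Cache.Insert(
            Key,
            value,
            null,                           // no cache dependency
            DateTime.UtcNow.AddMinutes(5),  // absolute expiration
            Cache.NoSlidingExpiration);
        return value;
    }

    // Stand-in for an expensive operation such as a database call.
    private static string BuildGreeting()
    {
        return "Generated at " + DateTime.Now.ToString("HH:mm:ss");
    }
}
```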

There are some limitations to the built-in caching in ASP.NET though, as it doesn’t know the specifics of our application. Therefore EPiServer CMS has its own caching functionality that extends ASP.NET’s caching functionality. Again, for an explanation of object caching, which I think is the most important type, see the aforementioned post, as this post focuses on output caching.

The limitations of ASP.NET’s output caching

When we use EPiServer CMS we usually use a single ASPX page to render the unique content of many different pages (PageData objects). If we just used the OutputCache directive shown above, our ASPX would always render the same content until the cache expires. To fix that we could tell it to consider certain query string parameters as unique keys using the VaryByParam attribute.

<%@ OutputCache Duration="60" VaryByParam="id,epslanguage" %>

Since the unique identifier for a specific version of a page in a specific language is the combination of the id and epslanguage query string parameters, the above line, which still just uses the built-in caching functionality in ASP.NET, would make sure that each individual EPiServer CMS page rendered by our ASPX is cached separately.

But what happens if an editor updates a page? Or if a page is unpublished before the cache expires? ASP.NET’s cache has no way of knowing about that, and the old content, or the content of the unpublished page, will remain in the cache and be displayed to visitors until the cache expires. For ASPX pages (but not for user controls) EPiServer CMS offers a solution to this problem.

EPiServer CMS’s output caching

If we create an ASPX page that inherits from EPiServer’s PageBase class or any of its subclasses, such as EditPage or TemplatePage, a method named SetCachePolicy() in PageBase will be invoked on the Init event. The SetCachePolicy method, which is virtual so that we can easily override it, will programmatically tell the ASP.NET runtime to cache the output of the page, just like the OutputCache directive would have. It will also instruct the runtime to vary the cache depending on query string parameters, just like the VaryByParam attribute. We can configure which parameters it will vary by, as well as the cache duration, with the httpCacheVaryByParams and httpCacheExpiration attributes of the configuration/episerver/sites/site/siteSettings node. That node lives in web.config for an EPiServer CMS 5 site and in the episerver.config file for an EPiServer CMS 6 site. By default httpCacheVaryByParams is set to “id,epslanguage” and httpCacheExpiration is set to “00:00:00” (disabling output caching). Like this:

<configuration>  
  <episerver xmlns="http://EPiServer.Configuration.EPiServerSection">
    <sites>
      <site description="Example Site" siteId="unknown">
        <siteSettings 
          httpCacheVaryByParams="id,epslanguage" 
          httpCacheExpiration="00:00:00"/>
      </site>
    </sites>
  </episerver>
</configuration>
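To turn output caching on we would set httpCacheExpiration to something other than zero. As a sketch, assuming the value follows the same hours:minutes:seconds pattern as the default, the element below would cache output for ten minutes. The ten minute value is arbitrary and the rest of the file is omitted.

```xml
<siteSettings
  httpCacheVaryByParams="id,epslanguage"
  httpCacheExpiration="00:10:00" />
```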

So far this isn’t anything we couldn’t have done using the OutputCache directive. But the SetCachePolicy method also does a few other things. First of all, it only caches the output if the request is a GET request and if the request isn’t made by a user who is logged in. Further, it creates a dependency from the output cached content to a cache key that is cleared from the cache whenever a page (a PageData object) is updated or removed. This ensures that the output cache is cleared, for all pages, whenever an editor publishes a new version or removes a page. Last but not least, the SetCachePolicy method also checks the CurrentPage property of the ASPX page being cached to see if the PageData object in it has a date set for when it should be unpublished. If it does, the SetCachePolicy method will ensure that the cached output expires no later than the time when the page is unpublished.
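To make the steps above concrete, here is a rough sketch of the same logic written against plain ASP.NET APIs only. It is not EPiServer’s actual implementation. The class name, the hard-coded settings, the PageVersionKey cache key, the StopPublish property and the choice of HttpCacheability.Public are all assumptions made for illustration.

```csharp
using System;
using System.Web;
using System.Web.UI;

public class CachedPageSketch : Page
{
    // Hypothetical stand-ins for the values configured in siteSettings.
    private static readonly TimeSpan CacheExpiration = TimeSpan.FromMinutes(10);
    private static readonly string[] VaryByParams = { "id", "epslanguage" };

    // Hypothetical cache key that some other code removes or replaces
    // whenever a page is updated or deleted. It must exist in the cache
    // at the time of the request, otherwise the output will not be cached.
    private const string PageVersionKey = "Sketch.PageVersionKey";

    // Hypothetical stop publish date of the page being rendered, if any.
    protected virtual DateTime? StopPublish
    {
        get { return null; }
    }

    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        SetCachePolicySketch();
    }

    protected virtual void SetCachePolicySketch()
    {
        bool isGet = string.Equals(
            Request.HttpMethod, "GET", StringComparison.OrdinalIgnoreCase);

        // Only cache anonymous GET requests, and only if caching is enabled.
        if (!isGet || Request.IsAuthenticated || CacheExpiration <= TimeSpan.Zero)
        {
            return;
        }

        // Never let the cached output outlive the page's stop publish date.
        DateTime expires = DateTime.Now.Add(CacheExpiration);
        DateTime? stopPublish = StopPublish;
        if (stopPublish.HasValue && stopPublish.Value < expires)
        {
            expires = stopPublish.Value;
        }

        Response.Cache.SetExpires(expires);
        Response.Cache.SetCacheability(HttpCacheability.Public);
        Response.Cache.SetValidUntilExpires(true);

        // Cache separate output for each combination of these parameters.
        foreach (string parameter in VaryByParams)
        {
            Response.Cache.VaryByParams[parameter] = true;
        }

        // Evict the cached output when the shared key changes or is removed.
        Response.AddCacheItemDependency(PageVersionKey);
    }
}
```

Because every cached page depends on the same key, a single change to any page evicts the output for all pages, which matches the behaviour described above.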

In other words, thanks to the SetCachePolicy method we can use output caching and trust that the cache is cleared when it needs to be. With a few exceptions.

A couple of limitations

While the output caching that EPiServer CMS along with ASP.NET offers is pretty intelligent and aware of how our CMS based application works there are a couple of limitations that we should be aware of.

While the SetCachePolicy method will ensure that the output cache is cleared if the currently viewed page is unpublished, we might be displaying content from other pages as well, such as their names and URLs in menus or as teasers. If such a page is unpublished because we have moved beyond its stop publish date, its content will still be in the cache. While this probably isn’t a major problem in most situations, we should be aware of it and think about what consequences it may have if we use output caching.

Another, often more important, limitation of the output cache is that it isn’t aware of third party data sources, such as databases other than EPiServer CMS’s, search engines, etc. It therefore doesn’t know which query string parameters to vary by, or when to clear the cache when data in those sources is added, updated or removed.

Sometimes, such as in the case where we are creating a search page and still want to use output caching but want it to cache different output depending on a query string parameter with a search query in it, we can handle that by specifying more parameters to the httpCacheVaryByParams attribute in episerver.config.
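For the search page case, a sketch of the changed attribute could look like this, assuming the search query is passed in a query string parameter named “q” (a made-up name for this example):

```xml
<siteSettings
  httpCacheVaryByParams="id,epslanguage,q" />
```

Keep in mind that this setting applies to the whole site, so every template will vary its cached output by the extra parameter.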

In other cases however we might have to override the SetCachePolicy method to disable output caching for a specific page, like this:

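protected override void SetCachePolicy()
{
  // Deliberately skip base.SetCachePolicy() so the default caching is never set up.
  // Mark the response as already expired...
  Response.Cache.SetExpires(DateTime.Now);
  // ...and allow only the visitor's own browser, not shared or server caches, to store it.
  Response.Cache.SetCacheability(HttpCacheability.Private);
}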
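An override can also go the other way and keep caching while adjusting it for a single template. The sketch below assumes a search page template and a query string parameter named “q”, both made up for this example, and simply adds one more vary-by parameter on top of whatever the base method has set up. I haven’t verified it against every EPiServer CMS version, so treat it as a starting point.

```csharp
using EPiServer;

// Hypothetical search page template.
public class SearchPageSketch : TemplatePage
{
    protected override void SetCachePolicy()
    {
        // Let the base implementation apply the site-wide settings first.
        base.SetCachePolicy();

        // Cache separate output per search query, for this template only.
        Response.Cache.VaryByParams["q"] = true;
    }
}
```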

Alternatives to EPiServer CMS’s output caching

Output caching can be a very powerful tool to ease the strain on the web server(s). The server does have to keep all that markup in memory though. An alternative, where the majority of the requests don’t even reach the web server, is to use a web accelerator such as Varnish.

If we are dealing with a site that has a lot of traffic and we do use output caching on the web servers, we are faced with the problem of what happens when the cache expires for a page while hundreds or thousands of visitors are requesting it almost simultaneously. In that case each request that comes in before the output has been cached again will force the web server to render all of the content. In a normal scenario this won’t be a problem, but in very high traffic scenarios it is. Then the standard output caching functionality won’t cut it and we’ll need to implement something more advanced that deals with this problem. Unfortunately that’s beyond the scope of this post :-)

When to use output caching

Whether we should use output caching is very dependent on context. Given that we have efficient and working object caching, output caching primarily lowers the load on the web servers, as they won’t have to render each individual page over and over again. While lowering the load on the web servers for rendering pages might seem like a great idea, in reality rendering often isn’t a very hard job for the web servers and we can get by fine without output caching.

Some experienced developers might tell you that output caching should always be turned on for EPiServer CMS sites. While that might have been true with older versions, when the object cache wasn’t as efficient as it is now, I disagree with that based on my experience so far. I’ve built several highly responsive web sites that can handle high loads of traffic without using output caching. Using output caching adds extra complexity and another possible source of bugs, a price I’m not willing to pay unless I really have to. Further, I think output caching can hide problems with faulty object caching, which you will still suffer from when the output hasn’t yet been cached, making such bugs, and sometimes other types of bugs as well, harder to find. And last but not least, I like to spare precious memory on the web servers for object caching.

With that said, output caching can be very effective in three situations: for web sites with extreme amounts of traffic, for web sites with huge amounts of content on some pages (the front page of a newspaper, for instance, can often contain several hundred teasers), making rendering time on the web server a real issue, or when we need to make a site with faulty object caching work while we fix the real problem.

My recommendation, if any, would be to try both with and without output caching. If you can make major gains in responsiveness or performance with output caching then use it, otherwise don’t.

Further reading

This post dealt with output caching. In most cases the more important type of caching when working with EPiServer is object caching (also known as data caching). That’s a topic that I address in the post How EPiServer CMS caches PageData objects.

In the post Cache objects in EPiServer with page dependencies by Ted Nyberg you can learn more about working with cache dependencies and why you should use EPiServer’s CacheManager class instead of the HttpRuntime.Cache.


Joel Abrahamsson

I'm a passionate web developer and systems architect living in Stockholm, Sweden. I work as CTO for a large media site and enjoy developing with all technologies, especially .NET, Node.js, and ElasticSearch.
