Caching


Content Author

Dan Skaggs

dskaggs

Reviewed/Revised By

Matthew Clemente

mjclemente84

As applications grow and scale, you'll often encounter the need to improve performance. The list of things that can slow down your application is long: larger recordsets, more complex queries, heavier server load, slow external resources, expensive object creation, etc. Coldfusion's caching functionality is one of your best tools to combat these, helping to scale and accelerate your application.

What do we mean by caching? At its most basic, caching means doing work one time and then reusing the result if it's needed again. Said slightly more technically, caching refers to storing the result of an operation (data) in memory. While that data is cached, your application is able to retrieve it quickly, rather than needing to complete the operation again.

ColdFusion provides a variety of tools for caching that can be divided into two broad categories:

  1. Programmatic caching: The caching functionality you control within your application's codebase.
  2. Application server caching: Managed at the server level, these caching tools related to how ColdFusion compiles and reuses your code.

Programmatic Caching

ColdFusion provides functions, tags, and attributes that give you a high degree of control over what content within your application gets cached, and for how long. These are the tools you will use when optimizing application performance. Some of the most common are cachePut/cacheGet, cfcache, and the cachedWithin attribute for cfquery. Which of these you'll use most is largely going to depend on the structure of your application.

Manual Data Caching

Business logic is typically written in CFScript, making cachePut and cacheGet convenient choices for managing your cached data. At their most basic, here's how they work:

<cfscript>
    data = { name = 'foo', key = 'bar' };
    cachePut( 'id_for_cached_data', data, );
    data = cacheGet( 'id_for_cached_data' );
</cfscript>

In this example, you have some arbitrary data that you need cached. Using cachePut, you add the data the default cache, using a unique identifier. Then, any time you need that data within your application, you're able to retrieve it using the unique identifier, via cacheGet.

A Closer Look at cachePut

The cachePut function takes two optional arguments that provide fine-grained control over how long an object exists in the cache:

<cfscript>
    cachePut( id, value, [timeSpan], [idleTime] );
</cfscript>

Here's how they work:

  • timeSpan: The length of time the object should persist in the cache before being automatically flushed. The easiest way to set it is with createTimeSpan
  • idleTime: If the object hasn't been retrieved from the cache for this length of time, it is automatically flushed. Again, createTimeSpan is the easiest way to set it.

Here's what using these arguments might look like:

<cfscript>
    cachePut( 'id_for_cached_data', data, createTimeSpan( 0, 0, 30, 0 ), createTimeSpan( 0, 0, 15, 0 ) );
</cfscript>

In this case, we're instruction ColdFusion to store the data in the cache for 30 minutes, but to flush it after 15 minutes if it hasn't been accessed. Note that for the idleTime argument to be of use, it should be less than the timeSpan.

You should know that cachePut takes two optional arguments: region and throwOnError - these are explained further in the next article.

Using cacheGet

When it comes to retrieving data from the cache, there's one primary concern as a developer - is it actually there? Data may not be in the cache for a variety of reasons: you haven't put it there yet, it was manually evicted using cacheRemove, or automatically evicted due to cache settings. You'll need to account for this possibility in your code.

When cacheGet is passed an identifier that isn't present in the cache, the function returns null. Consequently, nearly any time that you use cacheGet, you'll need to examine the result using isNull, to confirm that it contains data. Here's a slightly contrived example of how that might work within a function:

<cfscript>
    function getData() {
        var key = 'id_for_cached_data';
        var data = cacheGet( key );
        if( isNull( data ) ) {
            var data = { name = 'foo', key = 'bar' };
            cachePut( key, data, );
        }
        return data;
    }
</cfscript>

The first time this function is called, cacheGet will not find the requested data in the cache so isNull will return true. The data is then created and put it into the cache, so that it will be present next time the function is called. Finally, the data is returned. If the data took more time to process than in this trivial example, this would mean that while the first request would be slow, subsequent calls to this function would be considerably faster.

A Concrete Example

The examples used for cachePut and cacheGet have been abstract and trivial - that is, not particularly illustrative of why these functions are useful. Let's take a quick look at a slightly more concrete example.

Along with slow queries and complex data manipulation, network connections are a common processing bottleneck. Caching the results of these calls can considerably improve application performance. Let's update the earlier cacheGet example to involve an HTTP request:

<cfscript>
    function getData() {
        var key = 'coldfusion.adobe.com';
        var webpage = cacheGet( key );
        if( isNull( webpage ) ) {
            var req = '';
            cfhttp( url="https://coldfusion.adobe.com/", resolveurl=true, result="req" );
            var webpage = req.filecontent;
            cachePut( key, webpage );
        }
        return webpage;
    }
</cfscript>

Uncached, the call to retrieve the content from ColdFusion.Adobe.com takes approximately 500-1200 ms to complete. Once cached, it takes less than 1 ms to return the data. Measured across an entire application, improvements like this add up quickly; obviously, they're even more impressive relative to how slow the initial process or query is.

Manually Purging the Cache

There are times you'll want to evict cached data immediately, rather than waiting for automatic expiration - perhaps you know that the data has been changed, or deleted, or you just want to force a fresh version to be created. In these cases, the function of choice is cacheRemove.

At its most basic, cacheRemove functions as the counterpoint to cacheGet, also taking the identifier of cached data, but using it to evict that data from the cache:

<cfscript>
    cacheRemove( 'id_for_cached_data' );
</cfscript>

However, there's more to this function; the first nuance being that instead of a single identifier, it can take a comma separated list or array of identifiers to remove:

<cfscript>
    idsToBeRemove = [ 'foo_1', 'foo_2', 'foo_3' ];
    cacheRemove( idsToBeRemove );
</cfscript>

There are three optional arguments for cacheRemove:

<cfscript>
    cacheRemove( id, [throwOnError], [region], [exact] );
</cfscript>

We'll discuss the latter two in the next article; here's how throwOnError works. By default, ColdFusion does not care if the identifiers passed to cacheRemove exist in the cache - if they're present, they are removed, otherwise nothing happens. As the name suggestions, this argument overrides the default behavior and will throw an error if an identifier is not found in the cache. I have never found a need to use this functionality.

Getting Information About the Cache

At this point, we've put data into the cache, retrieved it, and purged it. But how do we examine the data that's in our cache at any given moment? For this, ColdFusion gives us cacheGetAllIds(), which returns an array of all the identifiers of objects stored in the cache, as you can see here:

<cfscript>
    cachePut( 'learncfinaweek', 7 );
    cachePut( 'learncfinaday', 1 );
    writeDump( var='#cacheGetAllIds()#' );
</cfscript>

This is helpful for programmatically examining and manipulating the cache. We're provided a further degree of insight into our cached data itself via cacheGetMetadata. When passed an identifier, this function returns a struct with information about the corresponding cached data, including the date/time it was created, the last time it was accessed, its size, cache hitcount, and more:

<cfscript>
    cachePut( 'learncfinaweek', 7, createTimespan( 2, 0, 0, 0) );
    cacheGet( 'learncfinaweek' );
    writeDump( var='#cacheGetMetaData( 'learncfinaweek' )#' );
</cfscript>

Used separately or together, these functions will let you know how your cache is performing, expose the data contained within it, and enable you to fine tune your caching strategy, if necessary. Not every application requires this degree of manual intervention in caching. By all means explore what's possible with these functions, but don't automatically assume that you will need them. Should you need this information while performance tuning, now you know how to get it.

Query Caching

Query caching stores the result of a queryExecute or cfquery action in memory so you can easily reuse the data without needing to retrieve it from the database. If a query's result set changes infrequently or it takes some time to execute, it's likely a good candidate for caching. If you're not already using query caching, it is one of the easiest and fastest ways to improve application performance - database activity is frequently one of the most costly processes on a server, so reducing it can have substantial benefits.

Consider a query that returns the countries a company ships its products to. The countries don't change often, so this query would be a good candidate to be cached by ColdFusion. The original code might look something like this:

<cfscript>
    sql = "SELECT countryId, countryName FROM countries";
    qryShipToCountries = queryExecute( sql );
</cfscript>

This code results in a call to the database every time you need the list of countries. By adding one query option, cachedWithin, to this function, you can dramatically reduce the number of times the query is run. In the following, updated code, the query will be cached in server memory for 1 hour. No matter how many times the application requires that country list, the query will be run a maximum of only 24 times per day.

<cfscript>
    sql = "SELECT countryId, countryName FROM countries";
    queryOptions = { cachedWithin = createTimeSpan( 0, 1, 0, 0 ) };
    qryShipToCountries = queryExecute( sql, {}, queryOptions );
</cfscript>

Using cfquery, the same cached query would look like this:

<cfquery name="qryShipToCountries" cachedWithin="#createTimeSpan( 0, 1, 0, 0 )#">
    SELECT countryId, countryName
    FROM countries
</cfquery>

The cachedWithin option/attribute instructs ColdFusion to execute this query ONLY if it does not have a result set from that exact query in memory already. The createTimeSpan() method is used to give ColdFusion a timespan during which to serve the cached results. Once the timespan has elapsed, it will execute the query again and cache the result of that execution. In this example, the timespan is 0 days, 1 hour, 0 minutes, and 0 seconds.

If you're new to query caching, you may be wondering about how dynamic queries are handled. The following code, which retrieves a user based on the userId variable, may help illustrate this concern:

<cfscript>
    sql = "SELECT userId, username, favoriteColor
        FROM users
        WHERE userId = :userId";
    params = {
        userId = { value = userId, cfsqltype = "cf_sql_integer" },
    };
    queryOptions = { cachedWithin = createTimeSpan( 0, 1, 0, 0 ) };
    qryGetUserById = queryExecute( sql, params, queryOptions );
</cfscript>

And again, for the sake of illustration, here is the code example using the cfquery tag:

<cfquery name="qryGetUserById" cachedWithin="#createTimeSpan( 0, 1, 0, 0 )#">
    SELECT userId, username, favoriteColor
    FROM users
    WHERE userId = <cfqueryparam cfsqltype="cf_sql_integer" value="#userId#">
</cfquery>

If an imaginary user, Bob, logs in and this query returns his information, will the cached query also return Bob's information when the next user logs in?

Fortunately, the answer is no. ColdFusion bases its query caching on more than just the query name. From the ColdFusion documentation: "To use cached data, the current query must use the same SQL statement, data source, query name, user name, and password." In the current example, because the userId value changes, the SQL statement is different - each user would get their own version of the query cached.

Caching Generated Content with cfcache

ColdFusion provides the cfcache tag for caching generated HTML output, either for an entire template or for HTML fragments. An example of this might be a page listing products that all belong to the same category. On high-traffic sites, caching the rendered output for this page, even for as little as 1-2 minutes, could cause the application to run significantly faster. To do this, simply place a cfcache tag at the start of the page:

<cfcache action="cache" timespan="#createtimespan(0,0,2,0)#" usequerystring="true">
    <html>
        <head></head>
        <body>
        ...page content here...
        </body>
    </html>
</cfcache>

Notice that in this example we included the attribute usequerystring. By default, URL parameters are ignored when saving generated templates or fragments to the cache. Setting this option to true instructs ColdFusion to cache each unique URL separately, because the content being cached depends on the URL parameters. So, for example, report.cfm?state=ca and report.cfm?state=nj would be stored as two separate cache entries.

While it has its place, caching an entire template like this isn't very flexible; it doesn't allow for dynamic elements (and it will even cache debugging output). For a more selective approach, you can cache HTML fragments by wrapping sections of output in a cfcache tag:

<cfoutput>Hello, it's #now().datetimeformat( 'long' )#.</cfoutput>
<cfcache timespan="#createTimeSpan( 1, 0, 0, 0 )#">
    <!--- Content to be cached --->
</cfcache>

In this example, the date/time would be output dynamically on every page load, while the content wrapped within the cfcache tags would be cached for a day.

Note that when caching fragments with cfcache, the generated HTML content which gets cached is not reusable on other pages within the application; it is limited to the template on which it was generated and ColdFusion handles storing and retrieving it from the cache.

The cfcache tag provides a manual option for removing content from the cache as well. This is accomplished by using action="flush" on the cfcache tag. Coupled with the expireURL attribute, you can flush the entire cache, a group of pages that match a particular URL pattern, or a single content item. For example:

<cfcache action="flush" expireURL="/report.cfm?state=">

This is useful when the rendered content that you're caching would be affected by changes to the database that drives it.

A final tip: the undocumented function getAllTemplateCacheIds() returns an array of keys for the templates that are currently cached. Programmatically there's not much you can do with this, but it can be helpful when you're trying to understand how ColdFusion's template cache works.

Application Server Caching

In addition to programmatic methods of caching, ColdFusion provides caching mechanisms and settings managed at the server level; that is, they apply to all ColdFusion applications being served by that particular server instance. These settings can be found in the ColdFusion Administrator web application, under "Server Settings" -> "Caching". Note that in more advanced use cases, these settings can can be managed via the Admin API.

For our purposes, the three most important settings here are Maximum Number of Cached Templates, Trusted Cache and Save Class Files. The latter two are caching mechanisms intended for use on production systems and generally are not recommended for use on a ColdFusion instance being used for development. We'll touch on some of the other options and settings available here, including Cache Template in Request - you can find all available settings in the Administrator documentation.

Before diving in, it's important to understand what is meant by the template cache. Before your ColdFusion code runs, it is converted to Java bytecode. To reduce the need to read from disk, the compiled code is stored in memory. This is the template cache - an in-memory cache of compiled Java bytecode that ColdFusion generates from your CFC and CFM files. Because code in memory is faster than when it needs to be read from disk, managing the template cache is key to performance tuning your applications.

Maximum Number of Cached Templates

This settings, which defaults to 1024, controls the number of templates that will be stored in the template cache. I'll just quote the text in the ColdFusion Administrator here, as it does a good job of explaining the setting: "If the cache is set to a small value, ColdFusion might re-process your templates. If your server has a sufficient amount of memory, you can achieve optimum performance by setting this value to the total number of all of your ColdFusion templates. Setting the cache to a high value does not automatically reduce available memory because ColdFusion caches templates incrementally." The tradeoff here is memory usage vs. performance - finding the balance is a matter of monitoring your server. Options for monitoring include the Performance Monitoring Toolset or a third-party tool, such as FusionReactor.

Trusted Cache

By default, during each request ColdFusion checks if the underlying files have been changed - if any have been modified, the Java bytecode is recompiled, otherwise the previously compiled bytecode is used. While this is ideal behavior during development, when code updates need to be reflected immediately, these checks for file modifications are not free of overhead. Obviously, the compiled code would run faster if it didn't need to check for changes.

On production systems, where changes to the underlying code are less frequent, you can eliminate the overhead of checking for code changes by enabling the Trusted Cache. Once the Java bytecode for a template is compiled, it is trusted to be up-to-date and is used for all future requests. Enabling this option can have significant performance benefits.

Be sure you understand this: when the Trusted Cache is enabled, changes to files will not automatically be reflected - you won't see your updates unless you manually clear the Template Cache. Clearing the template cache will force the Java bytecode to be recompiled and your updates will be reflected. This can be done both within the ColdFusion Administrator or programmatically, via the Admin API.

Save Class Files

Unlike the Trusted Cache, this option is enabled by default and the performance gains that it provides are typically minimal. When enabled, the Java class files generated by ColdFusion are saved to disk. During a restart, ColdFusion will read them back into memory, rather than recompiling the source code to Java bytecode again. Generally this is recommended for production, but disabled for development.

Cache Template in Request

By default, ColdFusion inspects a file for changes once during a request. During its first invocation, the file is scanned for changes, then compiled to Java bytecode. That same bytecode is used for each subsequent call to the file within that request.

If you disable the Cache Template in Request option, ColdFusion will check for changes every time a file is used during the same request. This increases file system overhead, so unless you are expecting your files to change mid-request, you should keep this option enabled.

Note that if you are using the Trusted Cache, you won't see any performance increases from also enabling Cache Template in Request, as Trusted Cache supersedes this option.

Further Caching Settings

Your system has a finite amount of memory, and caching done incorrectly can cause more trouble than you might initially think. There are a number of settings in the ColdFusion Administrator to help manage and control your server's cached data. These are all found in the ColdFusion Administrator -> Server Settings -> Caching.

  • Component Cache: This setting, which instructs ColdFusion to cache the path to components, is enabled by default. In most cases, this is optimal. If you disable it, Coldfusion will resolve the path to components during each request, which can result in unnecessary disk I/O.
  • Clear Component Cache: If you have enabled Component Caching, this button enables you to flush that cache, forcing ColdFusion to re-resolve the component paths.
  • Clear Template Cache Now / Clear Folder Specific Template Cache: You can manually clear the entire server's template cache by clicking the 'Clear Template Cache Now' button. For a more targeted approach, you can clear the template cache of a specific folder by selecting it. Clearing the template cache is a necessary step when deploying to a production server, if the Trusted Cache is enabled.
  • Maximum Number of Queries Cached: ColdFusion will let you place a limit on the number of cached queries in your server. As a cached query is requested, it moves to the top of the stack. Unused queries move toward the bottom of the stack. The query unused the longest is evicted from the Query Cache as the cached query count hits the limit you set here in the ColdFusion Administrator. You can clear the cached queries by clicking the 'Clear Query Cache Now' button.