Setting Up Distributed Infinispan Cache with Hibernate and Spring

A pretty typical setup – spring/hibernate application that requires a distributed cache. But it turns out not so trivial to setup.

You obviously need cache. There are options to do that with EhCache, Hazelcast, Infinispan, memcached, Redis, AWS’s elasticache and some others. However, EhCache supports only replicated and not distributed cache, and Hazelcast does not yet work with the latest version of Hibernate. Infinispan and Hazelcast support consistent hashing, so the entries live only on specific instance(s), rather than having a full copy of all the cache on the heap of each instances. Elasticache is AWS-specific, so Infinispann seems the most balanced option with the spring/hibernate setup.

So, let’s first setup the hibernate 2nd level cache. The official documentation for infinispan is not the top google result – it is usually either a very old documentaton, or just 2 versions old documentaton. You’d better open the latest one from the homepage.

Some of the options below are rather “hidden”, and I couldn’t find them easily in the documentation or in existing “how-to”s.

First, add the relevant dependencies to your dependency manager configuraton. You’d need infinispan-core, infinispan-spring and hibernate-infinispan. Then in your configuratoin file (whichever it is – in my case it is jpa.xml, a spring file that defines the JPA properties) configure the following:

<prop key="hibernate.cache.use_second_level_cache">true</prop>
<prop key="hibernate.cache.use_query_cache">true</prop>
<prop key="hibernate.cache.region.factory_class">org.hibernate.cache.infinispan.InfinispanRegionFactory</prop>
<prop key="hibernate.cache.inifinispan.statistics">true</prop>
<prop key="hibernate.cache.infinispan.cfg">infinispan.xml</prop>
<prop key="hibernate.cache.infinispan.query.cfg">distributed-query</prop>

These settings enable 2nd level cache and query cache, using the default region factory (we’ll see why that may need to be changed to a custom one later), enable statistics, point to an infinispan.xml configuraton file and change the default name for the query cache in order to be able to use a distributed one (by default it’s “local-cache”). Of course, you can externalize all these to a .properties file.

Then, at the root of your classpath (src/main/resources) create infinispan.xml:

<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:infinispan:config:8.1 http://www.infinispan.org/schemas/infinispan-config-8.1.xsd
                            urn:infinispan:config:store:jdbc:8.0 http://www.infinispan.org/schemas/infinispan-cachestore-jpa-config-8.0.xsd"
    xmlns="urn:infinispan:config:8.1">
    <jgroups>
        <stack-file name="external-file" path="${jgroups.config.path:jgroups-defaults.xml}" />    
    </jgroups>
    <cache-container default-cache="default" statistics="true">
        <transport stack="external-file" />
        <distributed-cache-configuration name="entity" statistics="true" />
        <distributed-cache-configuration name="distributed-query" statistics="true" />
    </cache-container>
</infinispan>

This expects -Djgroups.config.path to be passed to the JVM to point to a jgroups configuration. Depending on whether you use your own setup or AWS, there are multiple options. Here you can find config files for EC2, Google cloud, and basic UDP and TCP mechanism. These should be placed outside the project itself, because locally you most likely don’t want to use S3_PING (S3 based mechanism for node detection), and values may vary between environments.

If you need statistics (and it’s good to have them) you have to enable them both at cache-container level and at cache-level. I actually have no idea what the statistics option in the hibernate properties is doing – it didn’t change anything for me.

Then you define each of your caches. Your entities should be annotated with something like

 
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE, region = "user")
public class User { .. }

And then Infinispan creates caches automatically. They can all share some default settings, and these defaults are defined for the cache named “entity”. Took me a while to find that out, and finally got an answer on stackoverflow. The last thing is the query cache (using the name we defined in the hibernate properties). Note the “distributed-cache-configuration” elements – that way you you explicitly say “this (or all) cache(s) must be distributed” (they will use the transport mechanism specified in the jgroups file). You can configure defaults in a jgroups-defaults.xml and point to it as shown in the above example, if you don’t want to force developers specify the jvm arguments.

You can define entity-specific properties using <distributed-cache-configuration name="user" /> for example (check the autocomplete from the XSD to see what configuration options you have (and XML is a pretty convenient config DSL, isn’t it?).

So far, so good. Now our cache will work both locally and on AWS (EC2, S3), provided we configure the right access keys, and locally. Technically, it may be a good idea to have different infinispan.xml files for local and production, and to define by default <local-cache>, rather than a distributed one, because with the TCP or UDP settings, you may end up in a cluster with other teammates in the same network (though I’m not sure about that, it may present some unexpected issues).

Now, spring. If you were to only setup spring, you’d create a bean with a SpringEmbeddedCacheManagerFactoryBean, pass classpath:infinispan.xml as resource location, and it would work. And you can still do that, if you want completely separated cache managers. But Cache managers are tricky. I’ve given an outline of the problems with EhCache, and here we have to do some workarounds in order to have a cache manager shared between hibernate and spring. Whether that’s a good idea – it depends. But even if you need separate cache managers, you may need a reference to the hibernate underlying cache manager, so part of the steps below are still needed. A problem with using separate caches is the JMX name they get registered under, but that I guess can be configured as well.

So, if we want a shared cache manager, we have to create subclasses of the two factory classes:

/**
 * A region factory that exposes the created cache manager as a static variable, so that
 * it can be reused in other places (e.g. as spring cache)
 * 
 * @author bozho
 *
 */
public class SharedInfinispanRegionFactory extends InfinispanRegionFactory {

	private static final long serialVersionUID = 1126940233087656551L;

	private static EmbeddedCacheManager cacheManager;
	
	public static EmbeddedCacheManager getSharedCacheManager() {
		return cacheManager;
	}
	
	@Override
	protected EmbeddedCacheManager createCacheManager(ConfigurationBuilderHolder holder) {
		EmbeddedCacheManager manager = super.createCacheManager(holder);
		cacheManager = manager;
		return manager;
	}
	
	@Override
	protected EmbeddedCacheManager createCacheManager(Properties properties, ServiceRegistry serviceRegistry)
			throws CacheException {
		EmbeddedCacheManager manager = super.createCacheManager(properties, serviceRegistry);
		cacheManager = manager;
		return manager;
	}
}

Yup, a static variable. Tricky, I know, so be careful.

Then we reuse that for spring:

/**
 * A spring cache factory bean that reuses a previously instantiated infinispan embedded cache manager
 * @author bozho
 *
 */
public class SharedInfinispanCacheManagerFactoryBean extends SpringEmbeddedCacheManagerFactoryBean {
        private static final Logger logger = ...;
	@Override
	protected EmbeddedCacheManager createBackingEmbeddedCacheManager() throws IOException {
		EmbeddedCacheManager sharedManager = SharedInfinispanRegionFactory.getSharedCacheManager();
		if (sharedManager == null) {
			logger.warn("No shared EmbeddedCacheManager found. Make sure the hibernate 2nd level "
					+ "cache provider is configured and instantiated.");
			return super.createBackingEmbeddedCacheManager();
		}
		
		return sharedManager;
	}
}

Then we change the hibernate.cache.region.factory_class property in the hibernate configuration to our new custom class, and in our spring configuration file we do:

<bean id="cacheManager" class="com.yourcompany.util.SharedInfinispanCacheManagerFactoryBean" />
<cache:annotation-driven />

The spring cache is used with a mehtod-level @Cacheable annotation that allows us to cache method calls, and we can also access the CacheManager via simple injection.

Then the “last” part is to check if it works. Even if your application starts ok and looks to be working fine, you should run your integration or selenium test suite and check the statistics via JMX. You may even have tests that use the MBeans to fetch certain stats data about the caches to make sure they are being used. And/or you can write an integraton test that injects the CacheManager and uses the StandardCacheEntryImpl it turns to compare the version property after subsequent operations, to see if the cache is properly updated.

Overall, it shouldn’t take much time to set the whole thing up, and then later even to replace with another implementation if necessary.

Share Button

4 thoughts on “Setting Up Distributed Infinispan Cache with Hibernate and Spring”

  1. We’ve been using Infinispan for four years now. It’s got an impressive array of features, some of them quite useful, depending on your data needs of course. Our strategy has been to stick to tried and tested stuff though, for safety reasons.

    Operating an Infinispan cluster in production can be a bit tricky. Customers occasionally report unexpected exceptions when trying to add nodes, or when a node fails / gets disconnected. This stuff can be bloody hard to diagnose and recover from. In the worst case you may have to restart the entire cluster, which is not nice. I pray such troubles do not befall you 🙂

    If anyone is wants to run Infinispan on AWS – I maintain a short guide on that:

    http://connect2id.com/blog/how-to-configure-infinispan-for-aws-s3ping-discovery

    For setting up Infinispan on AWS with S3-bucket based discovery you can check out the following guide:

    http://connect2id.com/blog/how-to-configure-infinispan-for-aws-s3ping-discovery

  2. PS: If you’ve got tests with Infinispan that don’t need to touch its clustering capabilities – always do them with local caches. Otherwise you may get spurious exceptions or timeouts.

    On a dev machine Infinispan you can also have trouble starting up – for no apparent reason – if more than one networking interface is available, e.g. LAN and wireless.

Leave a Reply

Your email address will not be published. Required fields are marked *