We use Apache Ignite 2.2 as the Hibernate second-level cache in a Grails application. We have a 4-node cluster with 10 GB of RAM per node. The first node starts fine, but one of the subsequent nodes hangs on startup: sometimes the 2nd, sometimes the 3rd or 4th. Successful startups do happen, but very rarely. The application always hangs in the same place:
"host-startStop-1" #45 daemon prio=5 os_prio=0 tid=0x00007f7cac004800 nid=0x3d44 waiting on condition [0x00007f7cfdd81000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:216)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:158)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:150)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.onKernalStart(GridCachePartitionExchangeManager.java:551)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStart(GridCacheProcessor.java:843)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1040)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1896)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1648)
    - locked <0x00000007890a1198> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1076)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:596)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:520)
    at org.apache.ignite.Ignition.start(Ignition.java:322)
All other nodes are locked during this process. Configuration:
IgniteConfiguration configuration = new IgniteConfiguration()

// Hibernate entity/collection region caches (transactional)
List<CacheConfiguration> cacheConfigurations = []
for (String name : caches) {
    CacheConfiguration cacheConfiguration = new CacheConfiguration<>()
    cacheConfiguration.setCacheMode(CacheMode.REPLICATED)
    cacheConfiguration.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)
    cacheConfiguration.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_ASYNC)
    cacheConfiguration.setName(name)
    cacheConfiguration.onheapCacheEnabled = true
    cacheConfiguration.evictionPolicy = new LruEvictionPolicy()
    cacheConfiguration.memoryPolicyName = MEMORY_POLICY
    cacheConfigurations.add(cacheConfiguration)
}

// Hibernate timestamp and query caches (atomic)
for (String name : ['org.hibernate.cache.spi.UpdateTimestampsCache',
                    'org.hibernate.cache.internal.StandardQueryCache']) {
    CacheConfiguration cacheConfiguration = new CacheConfiguration<>()
    cacheConfiguration.setCacheMode(CacheMode.REPLICATED)
    cacheConfiguration.setAtomicityMode(CacheAtomicityMode.ATOMIC)
    cacheConfiguration.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_ASYNC)
    cacheConfiguration.setName(name)
    cacheConfiguration.onheapCacheEnabled = true
    cacheConfiguration.evictionPolicy = new LruEvictionPolicy()
    cacheConfiguration.memoryPolicyName = MEMORY_POLICY
    cacheConfigurations.add(cacheConfiguration)
}
configuration.setCacheConfiguration(cacheConfigurations.toArray(new CacheConfiguration[cacheConfigurations.size()]))

configuration.peerClassLoadingEnabled = true
configuration.igniteInstanceName = Constants.IGNITE_GRID
configuration.gridLogger = new Slf4jLogger()

// Memory: 1 GB default policy, 4 GB policy for the L2 cache
MemoryConfiguration memoryConfiguration = new MemoryConfiguration()
memoryConfiguration.defaultMemoryPolicySize = 1 * 1024 * 1024 * 1024L
MemoryPolicyConfiguration l2CachePolicy = new MemoryPolicyConfiguration()
l2CachePolicy.name = MEMORY_POLICY
l2CachePolicy.setMaxSize(4 * 1024 * 1024 * 1024L)
l2CachePolicy.pageEvictionMode = DataPageEvictionMode.RANDOM_LRU
memoryConfiguration.setMemoryPolicies(l2CachePolicy)
configuration.memoryConfiguration = memoryConfiguration

// Listen for node failure events
int[] eventTypes = new int[1]
eventTypes[0] = EventType.EVT_NODE_FAILED
configuration.includeEventTypes = eventTypes
Map<IgnitePredicate<? extends Event>, int[]> listeners = new HashedMap()
listeners.put(new NodeFailedEventListener(), eventTypes)
configuration.localEventListeners = listeners

TcpCommunicationSpi commSpi = new TcpCommunicationSpi()
commSpi.slowClientQueueLimit = 1000
commSpi.messageQueueLimit = 5000
configuration.communicationSpi = commSpi

// Discovery: S3 IP finder when AWS credentials are configured, otherwise localhost
TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi()
configuration.discoverySpi = discoverySpi
if (grailsApplication.config.grails?.plugin?.awssdk?.accessKey && Env.igniteS3Bucket) {
    TcpDiscoveryS3IpFinder awsIpFinder = new TcpDiscoveryS3IpFinder()
    awsIpFinder.setBucketName(Env.igniteS3Bucket)
    AWSCredentials awsCredentials = new BasicAWSCredentials(grailsApplication.config.grails.plugin.awssdk.accessKey,
            grailsApplication.config.grails.plugin.awssdk.secretKey)
    awsIpFinder.setAwsCredentials(awsCredentials)
    discoverySpi.ipFinder = awsIpFinder
} else {
    TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder()
    ipFinder.setAddresses(["127.0.0.1:47500"])
    discoverySpi.ipFinder = ipFinder
}

configuration.classLoader = grailsApplication.classLoader
ignite = Ignition.start(configuration)
EDIT
Full thread dump of the node that hung
Full thread dump of a node that started successfully
If you want to run more than one node on a single physical machine, I would recommend configuring MemoryConfiguration explicitly (by default, in version 2.2, Ignite will require 80% of physical RAM for one node), or updating to version 2.3 (where the default value was reduced to 20%).
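As a rough sketch of the first option, the memory available to a node can be capped by defining a memory policy with explicit sizes and making it the default. The policy name and the sizes below are assumptions chosen for illustration, not values taken from the question; pick a budget so that all nodes on the machine, plus their JVM heaps, fit into physical RAM.

```groovy
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.configuration.MemoryConfiguration
import org.apache.ignite.configuration.MemoryPolicyConfiguration

// Assumed per-node off-heap budget (2 GB); adjust to what the host can spare.
long perNodeBudget = 2L * 1024 * 1024 * 1024

// Memory policy with explicit initial and maximum sizes.
MemoryPolicyConfiguration cappedPolicy = new MemoryPolicyConfiguration()
cappedPolicy.setName('capped_policy')            // hypothetical policy name
cappedPolicy.setInitialSize(256L * 1024 * 1024)  // assumed 256 MB initial size
cappedPolicy.setMaxSize(perNodeBudget)

// Register the policy and use it as the default, so caches that do not
// name a policy are also bound by the cap.
MemoryConfiguration memoryConfiguration = new MemoryConfiguration()
memoryConfiguration.setMemoryPolicies(cappedPolicy)
memoryConfiguration.setDefaultMemoryPolicyName('capped_policy')

IgniteConfiguration configuration = new IgniteConfiguration()
configuration.setMemoryConfiguration(memoryConfiguration)
```

Caches that should use a different policy can still reference it by name through their own memoryPolicyName setting.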