Of all the products shipped in the ALUI line, other than the portal itself, the most likely candidate for load balancing/failover/clustering is the Collaboration server. Here's a brief explanation of how Collaboration clustering works, and what you can do to watch it in action.
The clustering mechanism itself is built on JGroups, an open source group-communication library (JChannel is its main channel class). This lets most of the mechanism be abstracted away, even from the product code, leaving the implementation details to someone else. That said, if you want to know how it works, take a look at the cluster.xml file under $PT_HOME/ptcollab/4.2/settings/config (you can turn clustering on and off in config.xml).
By default, collab uses a multicast UDP approach to clustering. When something happens on one collaboration instance, it broadcasts this event to a multicast UDP address and continues on its merry way. It uses a custom JChannel implementation to ensure reliable delivery, find out which nodes in the cluster are alive, and order messages. If you are curious to know more, check out the lan-multicast-cluster element in cluster.xml. You can find some interesting documentation on this here.
Some networks, however, don't support multicast UDP as a transport mechanism. After all, it can be chatty, and it is often blocked at the router level. For those cases, JGroups can run the same protocol over unicast TCP or UDP. You can configure this in cluster.xml as well (although you will have to specify which hosts to send to). Take a look at the comments in this file to find out more.
Finally, there are going to be times when you are going to want to know what's going on with the cluster. Luckily, I recently stumbled across a built-in utility while digging around in the collaboration code. It can be run from a command line, and only requires that you extract the javagroups.jar from the collaboration.war before running it (thanks, Kenan, for the script):
SNOOP_HOME=$PT_HOME/ptcollab/4.2
export SNOOP_HOME
java -cp $SNOOP_HOME/lib/java/collab-core.jar:$SNOOP_HOME/webapp/temp/WEB-INF/lib/javagroups.jar com.plumtree.core.cluster.tool.ClusterSnoop $SNOOP_HOME/settings/config/cluster.xml
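If the javagroups.jar hasn't been extracted yet, I'd expect the java command to die with a class-loading error that doesn't tell you much. Here's a rough pre-flight check you could run first. It only assumes the two jar paths used in the script above; the function name snoop_classpath is my own invention and not part of the product.

```shell
#!/bin/sh
# Pre-flight check for ClusterSnoop: print the classpath only if both jars exist.
# The jar paths mirror the script above; snoop_classpath is a hypothetical helper name.
snoop_classpath() {
    home=$1
    core="$home/lib/java/collab-core.jar"
    jgroups="$home/webapp/temp/WEB-INF/lib/javagroups.jar"
    for jar in "$core" "$jgroups"; do
        if [ ! -f "$jar" ]; then
            # Report the first missing jar on stderr and signal failure.
            echo "missing jar: $jar" >&2
            return 1
        fi
    done
    # Both jars are present: emit them joined with ':' for java -cp.
    printf '%s:%s\n' "$core" "$jgroups"
}

# Example use (left commented out so sourcing this file has no side effects):
# CP=$(snoop_classpath "$PT_HOME/ptcollab/4.2") &&
#     java -cp "$CP" com.plumtree.core.cluster.tool.ClusterSnoop \
#         "$PT_HOME/ptcollab/4.2/settings/config/cluster.xml"
```

The point is just to fail early with the name of the missing file rather than a Java stack trace.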
Here is some sample output from the utility:
Message Source = localhost:51044 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51044 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51044 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.GadgetCacheInvalidateClusterMessage[projectID=00000,functionalArea=1]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.ProjectModifiedClusterMessage[projectID=00000]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.PreferenceInvalidateClusterMessage[preference=G_0_tcic]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.GadgetCacheInvalidateClusterMessage[projectID=00000,functionalArea=16]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.GadgetCacheInvalidateClusterMessage[projectID=00000,functionalArea=63]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.GadgetCacheInvalidateClusterMessage[projectID=00000,functionalArea=16]
Message Source = localhost:51044 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51026 Cluster Message = com.plumtree.core.cluster.message.PingClusterMessage[cluster ping]
Message Source = localhost:51044
Cool, huh?
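Once the pings start scrolling by, it gets hard to spot the interesting messages. If you redirect the utility's output to a file, a small shell function can tally how many of each message type showed up. This is only a sketch: summarize_snoop and snoop.log are names I made up, the only thing it relies on is the "Cluster Message = <class name>[...]" pattern visible in the sample output, and it needs a grep that supports -o (GNU and BSD grep both do).

```shell
#!/bin/sh
# Tally cluster message types from captured ClusterSnoop output read on stdin.
# summarize_snoop is a hypothetical helper; it relies only on the
# "Cluster Message = <fully.qualified.ClassName>[...]" pattern in the output.
summarize_snoop() {
    # 1. Pull out each "Cluster Message = <class name>" token, whatever the line layout.
    # 2. Drop the prefix, then the package, leaving the simple class name.
    # 3. Count occurrences and print "<ClassName> <count>", sorted by class name.
    grep -o 'Cluster Message = [A-Za-z0-9_.]*' |
        sed -e 's/^Cluster Message = //' -e 's/.*\.//' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Example use on a saved capture (snoop.log is an assumed file name):
# summarize_snoop < snoop.log
```

I'd run it against a saved file rather than piping the live utility into it, since sort can't print anything until its input ends.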