High Availability for Gateways
You can set up multiple RSRemote instances and configure them for failover using a load balancer. This High Availability setup is independent of the gateway type.
Resolve Actions Pro uses Apache Reverse Proxy to simulate a load balancer for the failover between RSRemote instances.
Failover and Failback
To achieve a High Availability cluster for Gateway Builder, Actions Pro provides an out-of-the-box failover and failback mechanism. A standard cluster setup consists of one RSRemote acting as the primary instance for Builder Gateway, another RSRemote acting as the secondary instance for Builder Gateway, and any number of RSRemotes acting as workers for Builder Gateway. The same RSRemote can act as a primary instance and a worker instance at the same time.
An example setup using Gateway Builder with three RSRemotes can be configured as follows:
- RSRemote 1 - acting as Builder Gateway primary instance and Builder Gateway worker instance;
- RSRemote 2 - acting as Builder Gateway secondary instance and Builder Gateway worker instance;
- RSRemote 3 - acting as Builder Gateway worker instance.
To configure the RSRemote(s) as shown above, configure the following blueprint properties on each of them:
//RSRemote1
rsremote.receive.builder.primary = true
rsremote.receive.builder.secondary = false
rsremote.receive.builder.worker = true
rsremote.receive.builder.queue = BUILDERCLUSTER
rsremote.receive.builder.heartbeat = 20
rsremote.receive.builder.failover = 20
//RSRemote2
rsremote.receive.builder.primary = false
rsremote.receive.builder.secondary = true
rsremote.receive.builder.worker = true
rsremote.receive.builder.queue = BUILDERCLUSTER
rsremote.receive.builder.heartbeat = 20
rsremote.receive.builder.failover = 20
//RSRemote3
rsremote.receive.builder.primary = false
rsremote.receive.builder.secondary = false
rsremote.receive.builder.worker = true
rsremote.receive.builder.queue = BUILDERCLUSTER
rsremote.receive.builder.heartbeat = 20
rsremote.receive.builder.failover = 20
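A layout like the one above can be sanity-checked before deployment by parsing each instance's role properties and confirming that the roles add up. The following Python sketch is illustrative only: `parse_properties` and `check_cluster` are hypothetical helpers, not part of Actions Pro, and the rule of exactly one primary, exactly one secondary, and a shared queue name is an assumption drawn from the description of a standard cluster.

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, skipping blank lines and // comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_cluster(instances, gateway="builder"):
    """Return a list of problems found in a {instance name: properties} mapping."""
    prefix = f"rsremote.receive.{gateway}."

    def names_with(role):
        # Instances whose role flag is set to the string "true".
        return [n for n, p in instances.items() if p.get(prefix + role) == "true"]

    problems = []
    if len(names_with("primary")) != 1:
        problems.append("expected exactly one primary")
    if len(names_with("secondary")) != 1:
        problems.append("expected exactly one secondary")
    if not names_with("worker"):
        problems.append("expected at least one worker")
    # Every instance is assumed to point at the same queue name.
    queues = {p.get(prefix + "queue") for p in instances.values()}
    if len(queues) != 1:
        problems.append("all instances should share one queue name")
    return problems
```

An empty list from `check_cluster` means the role flags and queue name are consistent with the three-instance example.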
The following items describe the basic functionality of failover and failback and all properties involved in the context of the example provided above.
The Builder primary instance runs the Groovy script that fetches events from the third-party system and adds them to the Gateway Builder queue. In the example above, RSRemote1 acts as primary: it retrieves events from the third-party system and sends them to the BUILDERCLUSTER queue.
All worker instances (in this case RSRemote1, RSRemote2, and RSRemote3) subscribe to the Gateway Builder queue, which is called BUILDERCLUSTER in this example. They pick up events passed into the queue and trigger the Gateway Script, Resolution Routing, and Runbooks to process the events.
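The division of labor between the primary and the workers can be pictured with a small in-memory model. This Python sketch is only an illustration: `run_cluster` is a hypothetical function, the real queue is not a Python `deque`, and the assumption that each event is handled by exactly one worker in round-robin order is ours, since the text does not say how events are distributed.

```python
from collections import deque
from itertools import cycle


def run_cluster(events, worker_names):
    """Model one primary feeding a shared queue that several workers drain."""
    queue = deque()
    # Primary role: fetch events and put them on the shared queue.
    for event in events:
        queue.append(event)
    # Worker role: workers take turns removing the next event (assumed policy).
    handled = {name: [] for name in worker_names}
    turns = cycle(worker_names)
    while queue:
        handled[next(turns)].append(queue.popleft())
    return handled
```

The point of the model is that only the primary talks to the third-party system, while any instance flagged as a worker can process what lands in the queue.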
The secondary instance (RSRemote2) receives a heartbeat from the primary instance (RSRemote1) at a configurable interval, which keeps the primary and secondary aware of each other. In this example, RSRemote2 receives a heartbeat from RSRemote1 every 20 seconds.
If the primary goes offline (it cannot pull events because RSRemote crashed, or because the server hosting it has network issues or is powered off), the secondary notices this after a certain time, becomes the primary, and starts listening for events. This process is called failover. In the example above, RSRemote2 takes over the role of RSRemote1 when RSRemote1 crashes or goes offline. The wait time is set by the failover property and is counted on the secondary, starting from the heartbeat it missed from the primary. In this case, RSRemote2 receives a heartbeat from RSRemote1 every 20 seconds. When it stops receiving the heartbeat, it waits 20 seconds plus an additional 60 seconds, and then starts performing the role of the primary instance.
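The takeover timing in this example works out to 80 seconds after the missed heartbeat. The Python sketch below spells out that arithmetic. It assumes that the 20-second wait is the value of the failover property and that the additional 60 seconds is a fixed grace period; the function and constant names are hypothetical.

```python
GRACE_SECONDS = 60  # additional wait described in the text; assumed to be fixed


def takeover_delay(failover, grace=GRACE_SECONDS):
    """Seconds the secondary waits after a missed heartbeat before taking over."""
    return failover + grace


def should_take_over(now, missed_at, failover, grace=GRACE_SECONDS):
    """True once the full wait has elapsed since the heartbeat was missed."""
    return now - missed_at >= takeover_delay(failover, grace)
```

With a failover value of 20, a heartbeat missed at second 100 leads to a takeover at second 180, and not before.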
When the primary comes back up, the secondary instance notices this and returns to the secondary role. This process is called failback.
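The role changes on the secondary can be modeled as a small state machine. This Python sketch is not the product's implementation: the class `SecondaryRole` and its methods are hypothetical, `takeover_delay` stands for the total configured wait before failover, and the idea that the secondary detects the primary's return through resumed heartbeats is an assumption.

```python
class SecondaryRole:
    """Tracks whether a secondary instance is currently acting as primary."""

    def __init__(self, takeover_delay):
        self.takeover_delay = takeover_delay  # seconds to wait before failover
        self.silent_since = None              # time of the first missed heartbeat
        self.acting_as_primary = False

    def heartbeat_received(self):
        # The primary is alive: clear the silence timer and fail back if needed.
        self.silent_since = None
        self.acting_as_primary = False

    def heartbeat_missed(self, now):
        # Remember only the first miss so the wait is not restarted.
        if self.silent_since is None:
            self.silent_since = now

    def tick(self, now):
        # Fail over once the primary has been silent for the full wait.
        if (self.silent_since is not None
                and now - self.silent_since >= self.takeover_delay):
            self.acting_as_primary = True
        return self.acting_as_primary
```

For example, with a delay of 80 seconds, a heartbeat missed at second 100 makes `tick(180)` return `True`, and a later `heartbeat_received()` call returns the instance to the secondary role.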
If you have multiple gateways, it is good practice to group the primary and secondary instances so that all primaries are on the same RSRemote, all secondaries are on the same RSRemote, and so forth. In the example above we have ServiceNow and Builder gateways, which illustrate this point. This ensures that the HA functionality of one gateway does not interfere with the cluster setup of other gateways.
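The grouping recommendation can be checked mechanically once the role assignments are written down. In this Python sketch the `layout` mapping and the `check_grouping` helper are hypothetical; the data is something you would compile by hand from each RSRemote's blueprint, not something Actions Pro exposes in this form.

```python
def check_grouping(layout):
    """layout maps a gateway name to {"primary": rsremote, "secondary": rsremote}.

    Returns a list of roles that are spread over more than one RSRemote.
    """
    issues = []
    for role in ("primary", "secondary"):
        hosts = {roles[role] for roles in layout.values()}
        if len(hosts) > 1:
            issues.append(f"{role} instances are spread across {sorted(hosts)}")
    return issues


# A layout that follows the recommendation: both gateways share the same hosts.
grouped = {
    "servicenow": {"primary": "RSRemote1", "secondary": "RSRemote2"},
    "builder": {"primary": "RSRemote1", "secondary": "RSRemote2"},
}
```

Calling `check_grouping(grouped)` returns an empty list, while moving one gateway's primary to a different RSRemote produces a message naming the hosts involved.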