<#import "/templates/guide.adoc" as tmpl>
|
||
|
<#import "/templates/links.adoc" as links>
|
||
|
|
||
|
<@tmpl.guide
|
||
|
title="Deploy an AWS Lambda to guard against Split-Brain"
|
||
|
summary="Building block for loadbalancer resilience"
|
||
|
tileVisible="false" >
|
||
|
|
||
|
This {section} explains how to reduce the impact when split-brain scenarios occur between two sites in a multi-site deployment.
|
||
|
|
||
|
This deployment is intended to be used with the setup described in the <@links.ha id="concepts-multi-site"/> {section}.
|
||
|
Use this deployment with the other building blocks outlined in the <@links.ha id="bblocks-multi-site"/> {section}.
|
||
|
|
||
|
include::partials/blueprint-disclaimer.adoc[]
|
||
|
|
||
|
== Architecture
|
||
|
In the event of a network communication failure between the two sites in a multi-site deployment, it is no
|
||
|
longer possible for the two sites to continue to replicate session state between themselves and the two sites
|
||
|
will become increasingly out-of-sync. As it is possible for subsequent Keycloak requests to be routed to different
|
||
|
sites, this may lead to unexpected behaviour as previous updates will not have been applied to both sites.
|
||
|
|
||
|
In such scenarios a quorum is commonly used to determine which sites are marked as online or offline, however as multi-site
|
||
|
deployments only consist of two sites, this is not possible. Instead, we leverage "`fencing`" to ensure that when one of the
|
||
|
sites is unable to connect to the other site, only one site remains in the loadbalancer configuration and hence only this
|
||
|
site is able to serve subsequent users requests.
|
||
|
|
||
|
As the state stored in {jdgserver_name} will be out-of-sync once the connectivity has been lost, a manual re-sync is necessary as described in <@links.ha id="operate-synchronize" />.
|
||
|
This is why a site which is removed via fencing will not be re-added automatically, but only after such a synchronisation using the mual procedure <@links.ha id="operate-site-online" />.
|
||
|
|
||
|
In this {section} we describe how to implement fencing using a combination of https://prometheus.io/docs/alerting/latest/overview/[Prometheus Alerts]
|
||
|
and AWS Lambda functions. A Prometheus Alert is triggered when split-brain is detected by the {jdgserver_name} server metrics,
|
||
|
which results in the Prometheus AlertManager calling the AWS Lambda based webhook. The triggered Lambda function inspects
|
||
|
the current Global Accelerator configuration and removes the site reported to be offline.
|
||
|
|
||
|
In a true split-brain scenario, where both sites are still up but network communication is down, it is possible that both
|
||
|
sites will trigger the webhook simultaneously. We guard against this by ensuring that only a single Lambda instance can be executed at
|
||
|
a given time.
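
The fencing action amounts to rewriting the Global Accelerator EndpointGroup so that only the surviving site's endpoint remains. The Lambda created later in this {section} performs the equivalent operations through the AWS SDK; the following AWS CLI sketch is purely illustrative and the ARNs shown are placeholders:

[source,bash]
----
# Global Accelerator is a global service; its API is always addressed via the us-west-2 region.
# List the endpoints currently registered for the accelerator's listener (placeholder ARN).
aws globalaccelerator list-endpoint-groups \
  --listener-arn <your-listener-arn> \
  --region us-west-2

# Remove the unreachable site by rewriting the endpoint list so that only the healthy
# site's load balancer remains (placeholder ARNs).
aws globalaccelerator update-endpoint-group \
  --endpoint-group-arn <your-endpoint-group-arn> \
  --endpoint-configurations EndpointId=<healthy-site-nlb-arn> \
  --region us-west-2
----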

== Prerequisites

* ROSA HCP based multi-site Keycloak deployment
* AWS CLI installed
* AWS Global Accelerator loadbalancer
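
Optionally, sanity-check these prerequisites from the shell before you start; a minimal sketch, assuming your AWS credentials and kubeconfig are already set up:

[source,bash]
----
aws --version
# Global Accelerator is a global service; its API is addressed via the us-west-2 region.
aws globalaccelerator list-accelerators --region us-west-2 --query 'Accelerators[].Name'
# Run against each of the two ROSA HCP clusters.
kubectl cluster-info
----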

== Procedure
. Enable OpenShift user alert routing
+
.Command:
[source,bash]
----
kubectl apply -f - << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    alertmanager:
      enabled: true
      enableAlertmanagerConfig: true
EOF
kubectl -n openshift-user-workload-monitoring rollout status --watch statefulset.apps/alertmanager-user-workload
----
+
. [[aws-secret]]Decide upon a username/password combination which will be used to authenticate the Lambda webhook and create an AWS Secret storing the password
+
.Command:
[source,bash]
----
aws secretsmanager create-secret \
  --name webhook-password \ # <1>
  --secret-string changeme \ # <2>
  --region eu-west-1 # <3>
----
<1> The name of the secret
<2> The password to be used for authentication
<3> The AWS region that hosts the secret
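+
To confirm the secret was stored as expected, you can read it back with the following read-only command; a minimal sketch using the name and region from above:
+
.Command:
[source,bash]
----
aws secretsmanager get-secret-value \
  --secret-id webhook-password \
  --query SecretString \
  --region eu-west-1 \
  --output text
----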
+
. Create the Role used to execute the Lambda.
+
.Command:
[source,bash]
----
<#noparse>
FUNCTION_NAME= # <1>
ROLE_ARN=$(aws iam create-role \
  --role-name ${FUNCTION_NAME} \
  --assume-role-policy-document \
  '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Service": "lambda.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }' \
  --query 'Role.Arn' \
  --region eu-west-1 \ #<2>
  --output text
)
</#noparse>
----
<1> A name of your choice to associate with the Lambda and related resources
<2> The AWS Region hosting your Kubernetes clusters
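+
Optionally, confirm the role was created before continuing; a minimal read-only sketch reusing the variable defined above:
+
.Command:
[source,bash]
----
<#noparse>
aws iam get-role --role-name ${FUNCTION_NAME} --query 'Role.Arn' --output text
</#noparse>
----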
+
. Create and attach the 'LambdaSecretManager' Policy so that the Lambda can access AWS Secrets
+
.Command:
[source,bash]
----
<#noparse>
POLICY_ARN=$(aws iam create-policy \
  --policy-name LambdaSecretManager \
  --policy-document \
  '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "secretsmanager:GetSecretValue"
        ],
        "Resource": "*"
      }
    ]
  }' \
  --query 'Policy.Arn' \
  --output text
)
aws iam attach-role-policy \
  --role-name ${FUNCTION_NAME} \
  --policy-arn ${POLICY_ARN}
</#noparse>
----
+
. Attach the `ElasticLoadBalancingReadOnly` policy so that the Lambda can query the provisioned Network Load Balancers
+
.Command:
[source,bash]
----
<#noparse>
aws iam attach-role-policy \
  --role-name ${FUNCTION_NAME} \
  --policy-arn arn:aws:iam::aws:policy/ElasticLoadBalancingReadOnly
</#noparse>
----
+
. Attach the `GlobalAcceleratorFullAccess` policy so that the Lambda can update the Global Accelerator EndpointGroup
+
.Command:
[source,bash]
----
<#noparse>
aws iam attach-role-policy \
  --role-name ${FUNCTION_NAME} \
  --policy-arn arn:aws:iam::aws:policy/GlobalAcceleratorFullAccess
</#noparse>
----
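+
The role should now have all three policies attached. You can verify this with a read-only listing; a minimal sketch:
+
.Command:
[source,bash]
----
<#noparse>
aws iam list-attached-role-policies \
  --role-name ${FUNCTION_NAME} \
  --query 'AttachedPolicies[].PolicyName' \
  --output text
</#noparse>
----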
+
. Create a Lambda ZIP file containing the required fencing logic
+
.Command:
[source,bash]
----
<#noparse>
LAMBDA_ZIP=/tmp/lambda.zip
cat << EOF > /tmp/lambda.py

include::examples/generated/fencing_lambda.py[tag=fencing-start]
expected_user = 'keycloak' # <1>
secret_name = 'webhook-password' # <2>
secret_region = 'eu-west-1' # <3>
include::examples/generated/fencing_lambda.py[tag=fencing-end]

EOF
zip -FS --junk-paths ${LAMBDA_ZIP} /tmp/lambda.py
</#noparse>
----
<1> The username required to authenticate Lambda requests
<2> The AWS secret containing the password <<aws-secret,defined earlier>>
<3> The AWS region which stores the password secret
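+
Before creating the function you can optionally check that the generated file parses as Python and that the archive contains it; a minimal sketch:
+
.Command:
[source,bash]
----
<#noparse>
python3 -m py_compile /tmp/lambda.py
unzip -l ${LAMBDA_ZIP}
</#noparse>
----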
+
. Create the Lambda function.
+
.Command:
[source,bash]
----
<#noparse>
aws lambda create-function \
  --function-name ${FUNCTION_NAME} \
  --zip-file fileb://${LAMBDA_ZIP} \
  --handler lambda.handler \
  --runtime python3.12 \
  --role ${ROLE_ARN} \
  --region eu-west-1 #<1>
</#noparse>
----
<1> The AWS Region hosting your Kubernetes clusters
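+
A newly created function can take a short time to become active. Optionally check its state before continuing; a minimal sketch:
+
.Command:
[source,bash]
----
<#noparse>
aws lambda get-function \
  --function-name ${FUNCTION_NAME} \
  --query 'Configuration.State' \
  --region eu-west-1 \
  --output text
</#noparse>
----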
+
. Expose a Function URL so the Lambda can be triggered as a webhook
+
.Command:
[source,bash]
----
<#noparse>
aws lambda create-function-url-config \
  --function-name ${FUNCTION_NAME} \
  --auth-type NONE \
  --region eu-west-1 #<1>
</#noparse>
----
<1> The AWS Region hosting your Kubernetes clusters
+
. Allow public invocations of the Function URL
+
.Command:
[source,bash]
----
<#noparse>
aws lambda add-permission \
  --action "lambda:InvokeFunctionUrl" \
  --function-name ${FUNCTION_NAME} \
  --principal "*" \
  --statement-id FunctionURLAllowPublicAccess \
  --function-url-auth-type NONE \
  --region eu-west-1 # <1>
</#noparse>
----
<1> The AWS Region hosting your Kubernetes clusters
+
. Retrieve the Lambda Function URL
+
.Command:
[source,bash]
----
<#noparse>
aws lambda get-function-url-config \
  --function-name ${FUNCTION_NAME} \
  --query "FunctionUrl" \
  --region eu-west-1 \ #<1>
  --output text
</#noparse>
----
<1> The AWS region where the Lambda was created
+
.Output:
[source,bash]
----
https://tjqr2vgc664b6noj6vugprakoq0oausj.lambda-url.eu-west-1.on.aws
----
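+
Optionally, confirm that the Function URL is reachable from your workstation. The exact response depends on the Lambda implementation, but a request without valid credentials should not be accepted; a minimal sketch using the example URL above:
+
.Command:
[source,bash]
----
curl -s -o /dev/null -w '%{http_code}\n' https://tjqr2vgc664b6noj6vugprakoq0oausj.lambda-url.eu-west-1.on.aws
----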
. In each Kubernetes cluster, configure a Prometheus Alert routing to trigger the Lambda on split-brain
+
.Command:
[source,bash]
----
<#noparse>
ACCELERATOR_NAME= # <1>
NAMESPACE= # <2>
LOCAL_SITE= # <3>
REMOTE_SITE= # <4>

kubectl apply -n ${NAMESPACE} -f - << EOF
include::examples/generated/ispn-site-a.yaml[tag=fencing-secret]
</#noparse>
---
include::examples/generated/ispn-site-a.yaml[tag=fencing-alert-manager-config]
---
include::examples/generated/ispn-site-a.yaml[tag=fencing-prometheus-rule]
EOF
----
<1> The username required to authenticate Lambda requests
<2> The password required to authenticate Lambda requests
<3> The Lambda Function URL
<4> The namespace value should be the namespace hosting the Infinispan CR and the site should be the remote site defined
by `spec.service.sites.locations[0].name` in your Infinispan CR
<5> The name of your local site defined by `spec.service.sites.local.name` in your Infinispan CR
<6> The DNS of your Global Accelerator
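+
To confirm that the alert routing resources were created in each cluster, you can list them; a minimal sketch, assuming the Prometheus Operator resource types used by the applied manifests:
+
.Command:
[source,bash]
----
<#noparse>
kubectl -n ${NAMESPACE} get secrets,alertmanagerconfigs,prometheusrules
</#noparse>
----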

== Verify

To test that the Prometheus alert triggers the webhook as expected, perform the following steps to simulate a split-brain:

. In each of your clusters execute the following:
+
.Command:
[source,bash]
----
<#noparse>
kubectl -n openshift-operators scale --replicas=0 deployment/infinispan-operator-controller-manager #<1>
kubectl -n openshift-operators rollout status -w deployment/infinispan-operator-controller-manager
kubectl -n ${NAMESPACE} scale --replicas=0 deployment/infinispan-router #<2>
kubectl -n ${NAMESPACE} rollout status -w deployment/infinispan-router
</#noparse>
----
<1> Scale down the {jdgserver_name} Operator so that the next step does not result in the deployment being recreated by the operator
<2> Scale down the Gossip Router deployment. Replace `$\{NAMESPACE}` with the namespace containing your {jdgserver_name} server
+
. Verify the `SiteOffline` event has been fired on a cluster by inspecting the *Observe* -> *Alerting* menu in the OpenShift
console
+
. Inspect the Global Accelerator EndpointGroup in the AWS console; only a single endpoint should be present
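+
If you prefer the CLI for this check, the following read-only sketch lists the endpoints that remain in the EndpointGroup; the listener ARN is a placeholder that you need to look up for your accelerator:
+
.Command:
[source,bash]
----
<#noparse>
LISTENER_ARN= # your accelerator's listener ARN, e.g. from 'aws globalaccelerator list-listeners --accelerator-arn <accelerator-arn> --region us-west-2'
aws globalaccelerator list-endpoint-groups \
  --listener-arn ${LISTENER_ARN} \
  --query 'EndpointGroups[].EndpointDescriptions[].EndpointId' \
  --region us-west-2
</#noparse>
----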
+
. Scale up the {jdgserver_name} Operator and Gossip Router to re-establish a connection between sites:
+
.Command:
[source,bash]
----
<#noparse>
kubectl -n openshift-operators scale --replicas=1 deployment/infinispan-operator-controller-manager
kubectl -n openshift-operators rollout status -w deployment/infinispan-operator-controller-manager
kubectl -n ${NAMESPACE} scale --replicas=1 deployment/infinispan-router #<1>
kubectl -n ${NAMESPACE} rollout status -w deployment/infinispan-router
</#noparse>
----
<1> Replace `$\{NAMESPACE}` with the namespace containing your {jdgserver_name} server
+
. Inspect the `vendor_jgroups_site_view_status` metric in each site. A value of `1` indicates that the site is reachable.
+
. Update the Accelerator EndpointGroup to contain both Endpoints. See the <@links.ha id="operate-site-online" /> {section} for details.

== Further reading

* <@links.ha id="operate-site-online" />
* <@links.ha id="operate-site-offline" />

</@tmpl.guide>