Address Keycloak high-availability guide follow-up items

Closes #24975

Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Alexander Schwartz 2023-11-23 15:19:11 +01:00 committed by Alexander Schwartz
parent 855aebabc2
commit 68b33be655
2 changed files with 13 additions and 4 deletions

@ -16,7 +16,9 @@ Adjust the values for your environment as needed based on your load tests.
====
* Performance will be lowered when scaling to more Pods (due to additional overhead) and when using a cross-datacenter setup (due to additional traffic and operations).
* Increased cache sizes can improve the performance when {project_name} instances run for a longer time. Still, those caches need to be filled when an instance is restarted.
* Increased cache sizes can improve the performance when {project_name} instances run for a longer time.
This will decrease response times and reduce IOPS on the database.
Still, those caches need to be filled when an instance is restarted, so do not set the resources too tightly based on the stable state measured once the caches have been filled.
* Use these values as a starting point and perform your own load tests before going into production.
====
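Where larger caches are wanted, one possible path is to ship a customized Infinispan configuration. This is only a sketch under the assumption that you copy the default `conf/cache-ispn.xml`, raise its `<memory max-count="..."/>` values for caches such as `users` and `realms`, and place the copy in the `conf` directory; the file name below is hypothetical.

[source,bash]
----
# Sketch only: start Keycloak with a customized Infinispan configuration in
# which the per-cache entry limits (10,000 by default) have been increased.
# "my-cache-ispn.xml" is a hypothetical, adjusted copy of conf/cache-ispn.xml.
bin/kc.sh start --cache-config-file=my-cache-ispn.xml
----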
@ -85,10 +87,12 @@ The following setup was used to retrieve the settings above to run tests of abou
* OpenShift 4.13.x deployed on AWS via ROSA.
* Machinepool with `m5.4xlarge` instances.
* {project_name} deployed with the Operator and 3 pods.
* Default user password hashing with PBKDF2 27,500 hash iterations.
* User password hashing with PBKDF2 and 27,500 hash iterations (which is the default).
* Client credential grants don't use refresh tokens (which is the default).
* Database seeded with 100,000 users and 100,000 clients.
* Infinispan caches at default of 10,000 entries, so not all clients and users fit into the cache, and some requests will need to fetch the data from the database.
* All sessions in distributed caches as per default, with two owners per entries, allowing one failing pod without losing data.
* All sessions in distributed caches as per default, with two owners per entry, allowing one failing Pod without losing data.
* OpenShift's reverse proxy running in passthrough mode where the TLS connection of the client is terminated at the Pod.
* PostgreSQL deployed inside the same OpenShift with ephemeral storage.
+
A database with persistent storage will have longer database latencies, which might lead to longer response times; still, the throughput should be similar.

@ -21,6 +21,11 @@ The Quarkus executor thread pool is configured in https://quarkus.io/guides/all-
Depending on the available CPU cores, it can grow even larger.
Threads are created as needed and terminated when they are no longer required, so the system will scale up and down automatically.
When running on Kubernetes, adjust the number of worker threads to avoid creating more load than what the CPU limit for the Pod allows, as throttling would lead to congestion.
When running on physical machines, adjust the number of worker threads to avoid creating more load than the node can handle, as this would also lead to congestion.
Congestion would result in longer response times, increased memory usage, and eventually an unstable system.
Ideally, you should start with a low limit of threads and adjust it according to the target throughput and response time.
When the load and the number of threads increase, the bottleneck will usually be the database connections.
Once a request cannot acquire a database connection, it will fail with a message in the log like `Unable to acquire JDBC Connection`.
The caller will receive a response with a 5xx HTTP status code indicating a server-side error.
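As a rough sketch, both the worker threads and the database connection pool can be capped at startup; the option names exist in the {project_name} Quarkus distribution, but the concrete values below are placeholders to be replaced by your own load-test results.

[source,bash]
----
# Sketch only: cap the HTTP worker threads and the JDBC connection pool so a
# single Pod cannot create more load than its CPU limit and the database can absorb.
# The numbers are illustrative placeholders, not recommendations.
bin/kc.sh start \
  --http-pool-max-threads=50 \
  --db-pool-max-size=20
----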
@ -46,7 +51,7 @@ By default, {project_name} will queue all incoming requests infinitely, even if
This will use additional memory in the Pod and can exhaust resources in the load balancers, and the requests will eventually time out on the client side without the client knowing whether the request has been processed.
To limit the number of queued requests in {project_name}, set an additional Quarkus configuration option.
Configure `quarkus.thread-pool.queue-size` to specify a maximum queue length to allow for effective load shedding once this queue size is exceeded: {project_name} will return HTTP Status code 500 (server error).
Configure `quarkus.thread-pool.queue-size` to specify a maximum queue length to allow for effective load shedding once this queue size is exceeded.
Assuming a {project_name} Pod processes around 200 requests per second, a queue of 1000 would lead to maximum waiting times of around 5 seconds.
// KC22.0.6 - this is still 500
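For illustration, the queue limit can be supplied as a raw Quarkus property via `conf/quarkus.properties`; this is only a sketch, and the value of 1000 merely mirrors the example above rather than being a recommendation.

[source,bash]
----
# Sketch only: bound the request queue so excess requests are shed instead of
# queuing indefinitely. The value 1000 mirrors the example above and is illustrative.
echo "quarkus.thread-pool.queue-size=1000" >> conf/quarkus.properties
bin/kc.sh start
----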