Argon2 release notes and sizing guide update

Closes #29033

Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
Signed-off-by: Alexander Schwartz <aschwart@redhat.com>
Co-authored-by: Alexander Schwartz <alexander.schwartz@gmx.net>
Co-authored-by: Václav Muzikář <vaclav@muzikari.cz>
Co-authored-by: Alexander Schwartz <aschwart@redhat.com>
Kamesh Akella 2024-05-14 11:40:51 -04:00 committed by GitHub
parent 5cacf8637c
commit 1d613d9037
3 changed files with 28 additions and 10 deletions


@ -8,7 +8,7 @@ In {project_name} 24, the Welcome page is updated to use https://www.patternfly.
= Argon2 password hashing
Argon2 is now the default password hashing algorithm used by {project_name}
Argon2 is now the default password hashing algorithm used by {project_name} in a non-FIPS environment.
Argon2 was the winner of the https://en.wikipedia.org/wiki/Password_Hashing_Competition[2015 password hashing competition]
and is the recommended hashing algorithm by https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#argon2id[OWASP].
@ -19,6 +19,7 @@ better security, with almost the same CPU time as previous releases of {project_
memory, which is a requirement to be resistant against GPU attacks. The defaults for Argon2 in {project_name} require 7 MB
per hashing request.
To prevent excessive memory and CPU usage, the parallel computation of hashes by Argon2 is by default limited to the number of cores available to the JVM.
To support the memory-intensive nature of Argon2, we have updated the default GC from ParallelGC to G1GC for better heap utilization.
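As a rough illustration of the two statements above (about 7 MB per hashing request, and parallel hashing limited to the number of cores available to the JVM), the following minimal sketch estimates the peak memory used by concurrent hash computations. The function name and the simple multiplication are assumptions made for illustration, not part of {project_name}.

```python
# Approximate memory per Argon2 hashing request, taken from the text above.
ARGON2_MEMORY_PER_HASH_MB = 7


def peak_hashing_memory_mb(available_cores: int) -> int:
    """Estimate peak memory (in MB) used by concurrent Argon2 hashing.

    Assumes at most one hash runs per core available to the JVM,
    which is the default limit described above.
    """
    if available_cores < 1:
        raise ValueError("available_cores must be at least 1")
    return available_cores * ARGON2_MEMORY_PER_HASH_MB


if __name__ == "__main__":
    for cores in (2, 4, 8):
        print(f"{cores} cores: about {peak_hashing_memory_mb(cores)} MB")
```

This covers only the hashing working set. Actual heap usage depends on the rest of the workload.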
= New Hostname options


@ -100,6 +100,23 @@ http_server_requests_seconds_sum{method="GET",outcome="SUCCESS",status="200",uri
Use the new options `http-metrics-histograms-enabled` and `http-metrics-slos` to enable default histogram buckets or specific buckets for service level objectives (SLOs).
Read the https://prometheus.io/docs/concepts/metric_types/#histogram[Prometheus documentation about histograms] to learn how to use the additional metrics series provided in `http_server_requests_seconds_bucket`.
= Argon2 password hashing
In the {project_name} 24 release, a change in the password hashing algorithm resulted in increased CPU usage. To address this, we switched to a different default hashing algorithm, Argon2, for non-FIPS environments, which brings CPU usage back to where it was before the {project_name} 24 release.
== Expected improvement in overall CPU usage and temporary increased database activity
The Concepts for sizing CPU and memory resources in the {project_name} High Availability guide have been updated to reflect the new hashing defaults.
After the upgrade, during a password-based login, the user's password is re-hashed with the new hash algorithm and hash iterations as a one-off activity and updated in the database.
As this clears the user from {project_name}'s internal cache, you will also see increased read activity on the database level.
This increased database activity will decrease over time as more and more users' passwords are re-hashed.
== Updated JVM garbage collection settings
To support the memory-intensive nature of Argon2, we have updated the default GC from ParallelGC to G1GC for better heap utilization.
Please monitor the JVM heap utilization closely after this upgrade. Additional tuning may be necessary depending on your specific workload.
= Limiting memory usage when consuming HTTP responses
In some scenarios, like brokering, Keycloak uses HTTP to talk to external servers.


@ -39,11 +39,11 @@ Memory requirements increase with the number of client sessions per user session
* In containers, Keycloak allocates 70% of the memory limit for heap-based memory. It will also use approximately 300 MB of non-heap-based memory.
To calculate the requested memory, use the calculation above. As memory limit, subtract the non-heap memory from the value above and divide the result by 0.7.
* For each 8 password-based user logins per second, 1 vCPU per Pod in a three-node cluster (tested with up to 300 per second).
* For each 45 password-based user logins per second, 1 vCPU per Pod in a three-node cluster (tested with up to 300 per second).
+
{project_name} spends most of the CPU time hashing the password provided by the user, and it is proportional to the number of hash iterations.
* For each 450 client credential grants per second, 1 vCPU per Pod in a three-node cluster (tested with up to 2000 per second).
* For each 500 client credential grants per second, 1 vCPU per Pod in a three-node cluster (tested with up to 2000 per second).
+
Most CPU time goes into creating new TLS connections, as each client runs only a single request.
@ -58,17 +58,17 @@ Performance of {project_name} dropped significantly when its Pods were throttled
Target size:
* 50,000 active user sessions
* 24 logins per second
* 450 client credential grants per second
* 45 logins per second
* 500 client credential grants per second
* 350 refresh token requests per second
Limits calculated:
* CPU requested: 5 vCPU
* CPU requested: 3 vCPU
+
(24 logins per second = 3 vCPU, 450 client credential grants per second = 1 vCPU, 350 refresh token requests per second = 1 vCPU)
(45 logins per second = 1 vCPU, 500 client credential grants per second = 1 vCPU, 350 refresh token requests per second = 1 vCPU)
* CPU limit: 15 vCPU
* CPU limit: 9 vCPU
+
(Allow for three times the CPU requested to handle peaks, startups and failover tasks)
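The CPU arithmetic above can be sketched as a small helper. This is a minimal illustration, assuming the updated per-vCPU figures from this guide (45 password-based logins and 500 client credential grants per second per vCPU) and a rate of 350 refresh token requests per second per vCPU inferred from the worked example. Rounding each category up to a whole vCPU is also an assumption. The function and constant names are hypothetical.

```python
import math

# Per-vCPU throughput figures from the updated sizing text.
LOGINS_PER_VCPU = 45
CLIENT_CREDENTIAL_GRANTS_PER_VCPU = 500
# Inferred from the worked example (350 refresh token requests = 1 vCPU); an assumption.
REFRESH_TOKEN_REQUESTS_PER_VCPU = 350
# The guide allows three times the requested CPU for peaks, startups and failover.
PEAK_FACTOR = 3


def estimate_cpu(logins_per_s: float, grants_per_s: float, refreshes_per_s: float) -> tuple[int, int]:
    """Return (requested vCPU, vCPU limit) per Pod for the given request rates.

    Each category is rounded up to a whole vCPU (an assumption of this sketch).
    """
    requested = (
        math.ceil(logins_per_s / LOGINS_PER_VCPU)
        + math.ceil(grants_per_s / CLIENT_CREDENTIAL_GRANTS_PER_VCPU)
        + math.ceil(refreshes_per_s / REFRESH_TOKEN_REQUESTS_PER_VCPU)
    )
    return requested, requested * PEAK_FACTOR


if __name__ == "__main__":
    requested, limit = estimate_cpu(logins_per_s=45, grants_per_s=500, refreshes_per_s=350)
    print(f"CPU requested: {requested} vCPU, CPU limit: {limit} vCPU")
```

For the target size listed above, this reproduces the updated figures of 3 vCPU requested and a 9 vCPU limit.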
@ -102,9 +102,9 @@ The following setup was used to retrieve the settings above to run tests of abou
* {project_name} deployed with the Operator and 3 pods in a high-availability setup with two sites in active/passive mode.
* OpenShift's reverse proxy running in passthrough mode where the TLS connection of the client is terminated at the Pod.
* Database Amazon Aurora PostgreSQL in a multi-AZ setup, with the writer instance in the availability zone of the primary site.
* Default user password hashing with PBKDF2(SHA512) 210,000 hash iterations which is the default https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#pbkdf2[as recommended by OWASP].
* Default user password hashing with Argon2 and 5 hash iterations and minimum memory size 7 MiB https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#argon2id[as recommended by OWASP].
* Client credential grants don't use refresh tokens (which is the default).
* Database seeded with 20,000 users and 20,000 clients.
* Database seeded with 100,000 users and 100,000 clients.
* Infinispan local caches at default of 10,000 entries, so not all clients and users fit into the cache, and some requests will need to fetch the data from the database.
* All sessions in distributed caches as per default, with two owners per entry, allowing one failing Pod without losing data.