Add docs for the OpenTelemetry tracing

Closes #31908

Signed-off-by: Martin Bartoš <mabartos@redhat.com>
Co-authored-by: Alexander Schwartz <aschwart@redhat.com>
Co-authored-by: Steven Hawkins <shawkins@redhat.com>
Co-authored-by: Václav Muzikář <vaclav@muzikari.cz>
parent a84f3937b9
commit d17a48f8f8
6 changed files with 130 additions and 2 deletions

@ -111,3 +111,11 @@ Now {project_name} allows configuring ECDH-ES, ECDH-ES+A128KW, ECDH-ES+A192KW or
ifeval::[{project_community}==true]
Many thanks to https://github.com/justin-tay[Justin Tay] for the contribution.
endif::[]

= OpenTelemetry Tracing support _(Preview)_

The underlying Quarkus support for OpenTelemetry Tracing is now exposed in {project_name}, so you can obtain application traces for better observability.
It helps you find performance bottlenecks, determine the cause of application failures, and trace a request through the distributed system.
The support is in preview mode, and feedback is welcome.

For more information, see the link:{tracingguide_link}[{tracingguide_name}] guide.

@ -36,5 +36,6 @@ https://account.live.com/developers/applications/create
https://developer.twitter.com/apps/
https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#rolling-update
https://stackapps.com/apps/oauth/register
# Remove the following line once KC26 is released
https://www.keycloak.org/server/bootstrap-admin-recovery
# Remove following lines once KC26 is released
https://www.keycloak.org/server/bootstrap-admin-recovery
https://www.keycloak.org/server/tracing

@ -71,6 +71,8 @@
:gettingstarted_link_latest: https://www.keycloak.org/guides#getting-started
:highavailabilityguide_name: High Availability Guide
:highavailabilityguide_link: https://www.keycloak.org/guides#high-availability
:tracingguide_name: Enabling Tracing
:tracingguide_link: https://www.keycloak.org/server/tracing
:upgradingguide_name: Upgrading Guide
:upgradingguide_name_short: Upgrading
:upgradingguide_link: {project_doc_base_url}/upgrading/

BIN docs/guides/images/jaeger-tracing.png
Normal file
Binary file not shown. Size: 78 KiB

@ -16,6 +16,7 @@ fips
management-interface
health
configuration-metrics
tracing
importExport
vault
all-config

116 docs/guides/server/tracing.adoc
Normal file

@ -0,0 +1,116 @@
<#import "/templates/guide.adoc" as tmpl>
<#import "/templates/kc.adoc" as kc>
<#import "/templates/options.adoc" as opts>
<#import "/templates/links.adoc" as links>
<#import "/templates/profile.adoc" as profile>

<@tmpl.guide title="Enabling Tracing"
preview="true"
summary="Learn how to enable distributed tracing in {project_name}"
includedOptions="tracing-* log-*-include-trace">

This {section} explains how you can enable and configure distributed tracing in {project_name} by using https://opentelemetry.io/[OpenTelemetry] (OTel).
Tracing allows for detailed monitoring of each request's lifecycle, which helps you quickly identify and diagnose issues and leads to more efficient debugging and maintenance.

It also provides valuable insights into performance bottlenecks and can help optimize the system's overall efficiency.
{project_name} uses a supported https://quarkus.io/guides/opentelemetry-tracing[Quarkus OTel extension] that provides smooth integration and exposure of application traces.

== Enable tracing

You can enable the export of traces with the build-time option `tracing-enabled`:

<@kc.start parameters="--tracing-enabled=true"/>

By default, the trace exporter sends data in batches, using the `gRPC` protocol and the endpoint `+http://localhost:4317+`.
For more tracing settings, see all possible configurations below.
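
The same setting can also be supplied through the environment. The following is a minimal sketch, assuming the usual `KC_` environment-variable mapping of {project_name} options and an explicit `tracing-endpoint` option, neither of which is spelled out in this guide. The endpoint value is simply the default mentioned above.

```shell
# Environment-variable form of --tracing-enabled=true.
export KC_TRACING_ENABLED=true
# Assumed option name (tracing-endpoint); the value is the default endpoint stated above.
export KC_TRACING_ENDPOINT="http://localhost:4317"

# Start the server only when run from the server installation directory,
# so the snippet is a no-op anywhere else.
if [ -x bin/kc.sh ]; then
  bin/kc.sh start
fi
```

Setting the endpoint explicitly is only needed when the collector does not listen on the default address.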

== Development setup

To see the captured {project_name} traces, you can use a basic setup based on the https://www.jaegertracing.io/[Jaeger] tracing platform.
For development purposes, the Jaeger-all-in-one image is the easiest way to see traces.

NOTE: Jaeger-all-in-one includes the Jaeger agent, an OTel collector, and the query service/UI.
You do not need to install a separate collector, as you can directly send the trace data to Jaeger.

[source, bash]
----
# Replace podman with docker if you use Docker; the arguments are the same.
podman run --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one
----

=== Exposed ports

* `:16686` - Jaeger UI
* `:4317` - OpenTelemetry Protocol gRPC receiver (default)
* `:4318` - OpenTelemetry Protocol HTTP receiver

You can visit the Jaeger UI on `+http://localhost:16686/+` to see the tracing information.
The Jaeger UI might look like this with an arbitrary {project_name} trace:

image::jaeger-tracing.png[Jaeger UI]

== Traces in logs

When tracing is enabled, the trace information is included in the log messages of all enabled log handlers (see more in <@links.server id="logging"/>).
This helps you associate log events with the request that produced them, which improves traceability and debugging.
All log lines originating from the same request will have the same `traceId` in the log.

The log message also contains a `sampled` flag, which relates to the sampling described below and indicates whether the span was sampled, that is, sent to the collector.

The format of the log records may start as follows:

[source, bash]
----
2024-08-05 15:27:07,144 traceId=b636ac4c665ceb901f7fdc3fc7e80154, parentId=d59cea113d0c2549, spanId=d59cea113d0c2549, sampled=true WARN [org.keycloak.events] ...
----
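
Because every log line of a request carries the same `traceId`, the lines can be grouped with standard text tools. The sketch below assumes the log format shown above. The helper names `lines_for_trace` and `trace_ids` are hypothetical, and the second and third sample log lines are invented for illustration.

```shell
# Hypothetical log excerpt in the format shown above; real logs will differ.
LOG_FILE="$(mktemp)"
cat > "$LOG_FILE" <<'EOF'
2024-08-05 15:27:07,144 traceId=b636ac4c665ceb901f7fdc3fc7e80154, parentId=d59cea113d0c2549, spanId=d59cea113d0c2549, sampled=true WARN [org.keycloak.events] ...
2024-08-05 15:27:07,150 traceId=0af7651916cd43dd8448eb211c80319c, parentId=b7ad6b7169203331, spanId=b7ad6b7169203331, sampled=true INFO [org.keycloak.events] ...
2024-08-05 15:27:07,201 traceId=b636ac4c665ceb901f7fdc3fc7e80154, parentId=d59cea113d0c2549, spanId=00f067aa0ba902b7, sampled=true INFO [org.keycloak.services] ...
EOF

# Print every log line that belongs to the given trace ID.
# Usage: lines_for_trace <trace-id> <log-file>
lines_for_trace() {
  grep -F "traceId=$1," "$2"
}

# Print the distinct trace IDs found in a log file.
# Usage: trace_ids <log-file>
trace_ids() {
  sed -n 's/.*traceId=\([0-9a-f]\{32\}\),.*/\1/p' "$1" | sort -u
}

# Show all lines of the request whose trace ID appears in the sample above.
lines_for_trace b636ac4c665ceb901f7fdc3fc7e80154 "$LOG_FILE"
```

The same trace ID can then be pasted into the Jaeger UI search to open the matching trace, provided the span was sampled.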

=== Hide trace info in logs

You can hide tracing information in specific log handlers by setting the associated {project_name} option `log-<handler-name>-include-trace` to `false`, where `<handler-name>` is the name of the log handler.
For instance, to disable trace info in the `console` log, you can turn it off as follows:

<@kc.start parameters="--tracing-enabled=true --log=console --log-console-include-trace=false"/>

NOTE: When you explicitly override the log format for a particular log handler, the `*-include-trace` options do not have any effect, and no tracing is included.

== Sampling

A sampler decides whether a trace should be discarded or forwarded, which reduces overhead by limiting the number of traces sent to the collector.
Sampling helps manage resource consumption: it avoids the storage cost and the potential performance penalty of tracing every single request.

WARNING: For a production-ready environment, sampling should be properly set to minimize infrastructure costs.

{project_name} supports several built-in OpenTelemetry samplers, such as:

<@opts.expectedValues option="tracing-sampler-type"/>

You can change the sampler via the `tracing-sampler-type` property.

=== Default sampler
The default sampler for {project_name} is `traceidratio`, which controls the rate of trace sampling based on a specified ratio configurable via the `tracing-sampler-ratio` property.

==== Trace ratio
The default trace ratio is `1.0`, which means all traces are sampled, that is, sent to the collector.
The ratio is a floating-point number in the range `(0,1]`.
For instance, when the ratio is `0.1`, only 10% of the traces are sampled.

WARNING: For a production-ready environment, the trace ratio should be a smaller number to prevent the massive cost of trace store infrastructure and avoid performance overhead.
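
To get a feel for what a ratio means for the trace store, the expected volume can be estimated with simple arithmetic. This is a rough sketch only: the helper name `expected_sampled` and the request count are hypothetical, and the actual number of sampled traces varies around the estimate.

```shell
# Estimate how many traces are sent to the collector.
# Usage: expected_sampled <number-of-requests> <trace-ratio>
expected_sampled() {
  awk -v n="$1" -v r="$2" 'BEGIN { printf "%.0f\n", n * r }'
}

# Hypothetical load of 5000 requests with the ratio 0.1 from the example above.
expected_sampled 5000 0.1   # prints 500

# With the default ratio 1.0, every trace is sent.
expected_sampled 5000 1.0   # prints 5000
```

Multiplying the result by the average size of a stored trace gives a first approximation of the storage needed per time window.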

==== Rationale

The `traceidratio` sampler makes its own sampling decision based on the configured ratio, regardless of the decision made on the parent span, unlike the `parentbased_traceidratio` sampler.

The `parentbased_traceidratio` sampler could be the preferred default type as it ensures sampling consistency between parent and child spans.
Specifically, if a parent span is sampled, all its child spans are sampled as well, so the same sampling decision applies to all of them.
It helps to keep all spans together and prevents storing incomplete traces.

However, it might introduce certain security risks leading to DoS attacks.
External callers can manipulate trace headers, parent spans can be injected, and the trace store can be overwhelmed.
Proper filtering of HTTP headers (especially `tracestate`) and adequate measures of caller trust would need to be assessed.
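
The risk is easier to see by looking at what a caller controls. In the W3C Trace Context format, the `traceparent` header has the form `version-traceid-parentid-flags`, and the lowest bit of the flags marks the parent as sampled. A parent-based sampler follows that bit, so a caller that always sets it can force every request to be exported. The sketch below only decodes the flag; the helper name `is_sampled` is hypothetical, and the header value is the example from the W3C specification.

```shell
# Example header value taken from the W3C Trace Context specification.
TRACEPARENT="00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"

# Print "true" when the sampled bit (lowest bit of the flags field) is set.
# Usage: is_sampled <traceparent-value>
is_sampled() {
  flags="${1##*-}"   # last dash-separated field, for example "01"
  if [ $(( 0x$flags & 1 )) -eq 1 ]; then
    echo true
  else
    echo false
  fi
}

is_sampled "$TRACEPARENT"   # prints true
```

With `traceidratio`, this caller-supplied bit does not influence the decision, which is why it is the safer default for endpoints reachable by untrusted clients.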

For more information, see the https://www.w3.org/TR/trace-context/#security-considerations[W3C Trace context] document.
</@tmpl.guide>