diff --git a/docs/reference/edot-collector/config/custom-data-streams.md b/docs/reference/edot-collector/config/custom-data-streams.md
new file mode 100644
index 00000000000..71bda035c16
--- /dev/null
+++ b/docs/reference/edot-collector/config/custom-data-streams.md
@@ -0,0 +1,114 @@
+---
+navigation_title: Custom data stream routing
+description: Customize data stream routing in EDOT. Learn scenarios, patterns, and risks when modifying data_stream.namespace or data_stream.dataset.
+applies_to:
+  stack:
+  serverless:
+    observability:
+  product:
+    edot_collector: ga
+products:
+  - id: cloud-serverless
+  - id: observability
+  - id: edot-collector
+---
+
+# Custom data stream routing with EDOT
+
+{{edot}} (EDOT) uses opinionated defaults for data stream naming to ensure compatibility with Elastic dashboards, {{product.apm}} visualizations, and curated UIs. While most use cases rely on these defaults, EDOT also supports advanced dynamic routing.
+
+:::{warning}
+We strongly recommend not changing the default data stream names. Customizing data stream routing diverges from the standard ingestion model, and there's no guarantee that a custom configuration will remain valid in future versions.
+:::
+
+## When to customize data streams
+
+The only recommended use case for customizing data stream routing is to separate data by environment (for example: dev, staging, and prod).
+
+A data stream name follows this structure:
+
+```
+<type>-<dataset>-<namespace>
+```
+
+We recommend changing only `data_stream.namespace`, not `data_stream.dataset`.
+
+### The `namespace` field
+
+The `namespace` is intended as the configurable part of the name. Elastic dashboards, detectors, and UIs support multiple namespaces automatically.
+
+### The `dataset` field
+
+Only modify `dataset` if it's absolutely necessary and you're aware of the tradeoffs. Changing the `dataset` value can cause:
+
+- Dashboards and {{product.apm}} views to fail to load
+- Other content packs that you install to fail
+- Loss of compatibility with built-in correlations and cross-linking
+- Inconsistent field mappings
+- Proliferation of data streams and increased shard counts
+- Incompatibility with OpenTelemetry content packs, which are required to visualize OpenTelemetry data stored natively using OpenTelemetry semantic conventions
+
+## Configuration example
+
+To enable dynamic data stream routing:
+
+1. Use a `resource` processor to set `data_stream.namespace` or `data_stream.dataset` from resource attributes.
+2. Add the processor to your pipeline.
+
+When using the default `otel` mapping mode, the exporter appends `.otel` to the `data_stream.dataset` value automatically.
+
+:::{note}
+The following example is purely illustrative and is not guaranteed to be production ready.
+:::
+
+```yaml
+exporters:
+  elasticsearch/otel:
+    api_key: ${env:ELASTIC_API_KEY}
+    endpoints: ["${env:ELASTIC_ENDPOINT}"]
+
+processors:
+  resource/env-namespace:
+    attributes:
+      - key: data_stream.namespace
+        from_attribute: k8s.namespace.name  # copy the Kubernetes namespace into data_stream.namespace
+        action: upsert
+
+service:
+  pipelines:
+    metrics/otel:  # receivers omitted for brevity
+      processors:
+        - batch  # must be defined under processors in the full configuration
+        - resource/env-namespace
+      exporters:
+        - elasticsearch/otel
+```
+
+### Valid data stream names
+
+Any dynamic value used in `data_stream.namespace` or `data_stream.dataset` must comply with {{es}} index naming rules:
+
+- Lowercase only
+- No spaces
+- Must not start with `_`
+- Must not contain: `"`, `\`, `*`, `,`, `<`, `>`, `|`, `?`, `/`
+- Avoid hyphens in environment names (use `produs` instead of `prod-us`)
+
+Invalid names prevent data stream creation.
+
+### Risks and limitations
+
+This configuration diverges from the standard ingestion model. Be aware of the following:
+
+- Future EDOT versions may not support this configuration or may introduce breaking changes.
+- Changes might lead to an increase in data streams and shard counts.
+- Dashboards and UIs may not recognize non-standard datasets.
+- OpenTelemetry content packs may not work with custom datasets. These content packs are required to visualize OpenTelemetry data stored natively using OpenTelemetry semantic conventions. Install content packs from the {{kib}} Integrations UI by searching for `otel`.
+- Some data streams might fail to be created if the values set for `data_stream.namespace` or `data_stream.dataset` contain disallowed characters.
+
+Use this feature only when necessary, and validate it in a non-production environment first.
+
+## Additional resources
+
+- [Data stream routing reference](docs-content://solutions/observability/apm/opentelemetry/data-stream-routing.md)
+- [EDOT Collector configuration examples](/reference/edot-collector/config/index.md)
diff --git a/docs/reference/edot-collector/config/index.md b/docs/reference/edot-collector/config/index.md
index fe584d7629c..2912ee683dc 100644
--- a/docs/reference/edot-collector/config/index.md
+++ b/docs/reference/edot-collector/config/index.md
@@ -24,5 +24,6 @@ The following pages provide insights into the default configurations of the EDOT
 * [Configure profiles collection](/reference/edot-collector/config/configure-profiles-collection.md)
 * [Authentication methods](/reference/edot-collector/config/authentication-methods.md)
 * [Tail-based sampling](/reference/edot-collector/config/tail-based-sampling.md)
+* [Custom data stream routing](/reference/edot-collector/config/custom-data-streams.md)
+
diff --git a/docs/reference/edot-collector/toc.yml b/docs/reference/edot-collector/toc.yml
index bc9ff435eaf..16d8a57cdcf 100644
--- a/docs/reference/edot-collector/toc.yml
+++ b/docs/reference/edot-collector/toc.yml
@@ -11,6 +11,7 @@ toc:
   - file: config/tail-based-sampling.md
   - file: config/authentication-methods.md
   - file: config/configure-profiles-collection.md
+  - file: config/custom-data-streams.md
   - file: components/migrate-components.md
   - file: config/proxy.md
   - file: components.md