Specify whether resource envelopes containing no telemetry points are valid OTLP #598

Open
isaaczinda opened this issue Nov 13, 2024 · 12 comments
Labels
help wanted Extra attention is needed

Comments

@isaaczinda

I would like the OTLP spec to specify whether “empty telemetry envelopes” are valid OTLP. Some examples would be a ResourceMetrics with no ScopeMetrics inside it, or a ResourceMetrics with no Metric inside it. This question is not limited to metrics; it also applies to logs and spans.

I'm interested in this question because I'm building a telemetry filtering and transformation pipeline. If empty envelopes aren't allowed, I'll drop them and log an error. If they are allowed, I'll allow them to pass through.
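A minimal sketch of the check such a pipeline might perform. The dataclasses below are simplified, hypothetical stand-ins for the OTLP metrics messages (they are not the generated protobuf classes), and `is_empty_envelope` is an illustrative name. The sketch stops at the Metric level and does not look inside a Metric for data points.

```python
from dataclasses import dataclass, field

# Simplified stand-ins for the OTLP metrics messages (assumption, not the real classes).
@dataclass
class ScopeMetrics:
    metrics: list = field(default_factory=list)

@dataclass
class ResourceMetrics:
    resource_attributes: dict = field(default_factory=dict)
    scope_metrics: list = field(default_factory=list)

def is_empty_envelope(rm: ResourceMetrics) -> bool:
    """True when the resource carries no metrics in any of its scopes.

    Covers both cases from the issue: no ScopeMetrics at all, and
    ScopeMetrics that contain no Metric.
    """
    return all(len(sm.metrics) == 0 for sm in rm.scope_metrics)

no_scopes = ResourceMetrics({"service.name": "checkout"})
empty_scope = ResourceMetrics({"service.name": "checkout"}, [ScopeMetrics()])
with_data = ResourceMetrics({"service.name": "checkout"}, [ScopeMetrics(["cpu.usage"])])
```

Whether the pipeline then drops or forwards an envelope for which this returns true is exactly the policy question raised here.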

Why an Empty Envelope May be Useful

The attributes service.name, k8s.pod.name, and k8s.cluster.name are all stored on the Resource by convention. This information on its own, independent of any telemetry signal, can be quite useful. For example, you could use it to understand how many pods are in each service.

I can imagine a customer filtering out all or most of their telemetry signal for cost reasons but wanting to keep the resource information. With some de-duplication in the collector, this could provide a high-level picture of one’s cloud estate at a minimal egress cost. It's possible that something similar could be achieved with Entities, but Resources are still widely used for this purpose, and it's not always straightforward (or possible) to adjust existing instrumentation.
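As a rough illustration of the idea, the sketch below counts distinct pods per service from resource attributes alone. Representing each Resource as a plain dict and the name `pods_per_service` are assumptions made for the example.

```python
from collections import defaultdict

def pods_per_service(resources):
    """Count distinct k8s.pod.name values per service.name."""
    pods = defaultdict(set)
    for attrs in resources:
        service = attrs.get("service.name")
        pod = attrs.get("k8s.pod.name")
        if service is not None and pod is not None:
            pods[service].add(pod)  # the set de-duplicates repeated resources
    return {service: len(names) for service, names in pods.items()}

resources = [
    {"service.name": "checkout", "k8s.pod.name": "checkout-1"},
    {"service.name": "checkout", "k8s.pod.name": "checkout-1"},  # duplicate resource
    {"service.name": "checkout", "k8s.pod.name": "checkout-2"},
    {"service.name": "cart", "k8s.pod.name": "cart-1"},
]
counts = pods_per_service(resources)
```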

Prior Art

The filterprocessor appears to take the perspective that if there are no telemetry points inside an envelope, the envelope should be deleted (read the code here).

@isaaczinda isaaczinda changed the title Are resource envelopes with no telemetry points inside them valid OTLP? Specify whether resource envelopes containing no telemetry points are valid OTLP Nov 13, 2024
@tigrannajaryan
Member

The proto spec unfortunately doesn't say what's expected. I think it should. Our best bet at the moment is likely to examine implementations; provided that they behave mostly similarly, we should specify that behavior here in this repo.

If all/most other components behave like filterprocessor then we should take that as the de facto spec. Our goal here should be to break as little existing code and as few existing observers of OTLP as possible.

> The attributes service.name, k8s.pod.name, and k8s.cluster.name are all stored on the Resource by convention. This information on its own, independent of any telemetry signal, can be quite useful. For example, you could use it to understand how many pods are in each service.

Use cases like this are still likely better served by Entities. Entity events are precisely that: independently of metrics/traces/logs, they indicate the presence of things like k8s pods, nodes, clusters, etc.

We should not modify the spec to contradict existing implementations just to serve this use case.

@jmacd
Contributor

jmacd commented Nov 14, 2024

I thought I would add some examples from the code base.

The core OTLP exporter will export such data.

The core batch processor will NOT export such data.

The exporter batcher will export such data.

I would say that "envelopes" with no telemetry points are valid OTLP; the real question is whether they can be dropped by processors that filter and aggregate. My personal attitude would say that these envelopes can be dropped. @tigrannajaryan would you point us to the specification work about how entities will be encoded in OTLP?

@isaaczinda
Author

isaaczinda commented Nov 14, 2024

> I would say that "envelopes" with no telemetry points are valid OTLP; the real question is whether they can be dropped by processors that filter and aggregate.

I'm a bit confused by this. If message X is "valid OTLP" but all processors that filter and aggregate are allowed to drop it by default, how valid is it really? In other words, shouldn't valid OTLP be respected by all components? Of course, we could configure certain components to drop empty envelopes (e.g. by way of an OTTL function is_empty). But I think that's different than saying that components can freely drop this sort of data by default.

@isaaczinda
Author

isaaczinda commented Nov 14, 2024

@tigrannajaryan @jmacd how can I help out here? Would it be helpful for me to audit all core components and see how they treat empty envelopes?

@tigrannajaryan
Member

> @tigrannajaryan would you point us to the specification work about how entities will be encoded in OTLP?

Here is the data model, and the corresponding prototype in OTLP. This is preliminary, subject to change, we are actively iterating on it.

@tigrannajaryan
Member

> I'm a bit confused by this. If message X is "valid OTLP" but all processors that filter and aggregate are allowed to drop it by default, how valid is it really?

A possible approach is this: an empty envelope may be considered valid AND be required to be interpreted as NOOP. In that case it is perfectly fine to drop it, since delivering or dropping a NOOP payload is functionally equivalent.

We would not design it this way in the first place, but if this is what happens in reality then we can describe this behavior in the spec and I don't think it would be totally weird.

I suggest that we don't rush this and take stock of implementations.

> how can I help out here? Would it be helpful for me to audit all core components and see how they treat empty envelopes?

Absolutely. It would be great to do some spelunking in the Collector and in language SDKs to understand how we interpret empty envelopes when we receive them, and also whether we have any senders that send empty envelopes.

@jmacd
Contributor

jmacd commented Nov 15, 2024

> I'm a bit confused by this. If message X is "valid OTLP" but all processors that filter and aggregate are allowed to drop it by default, how valid is it really?

In my thinking, the empty request is valid because it is well formed, but it contains no spans/logs/metric points, so it is immediately a success and there is no reason to send an empty request except to test a connection. I shouldn't have said "dropped". I could have said "successfully received, declared success". My interpretation comes from agreeing with the batch processor's approach, which can incorporate the empty request into a batch with no change of data; if the batch processor is correct, then returning immediate success for an empty envelope is also correct.
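The batching argument can be sketched with a toy model. This is not the batch processor's actual code: requests are modeled as nested dicts of the form `{resource: {scope: [spans]}}`, and `merge_into_batch` is a hypothetical helper that merges span by span, so a request without spans leaves the batch unchanged.

```python
def item_count(request):
    """Number of spans in a request shaped as {resource: {scope: [spans]}}."""
    return sum(len(spans) for scopes in request.values() for spans in scopes.values())

def merge_into_batch(batch, request):
    """Merge a request into the batch span by span; return how many spans were added."""
    added = 0
    for resource, scopes in request.items():
        for scope, spans in scopes.items():
            for span in spans:
                # Resource and scope entries are only created when a span exists.
                batch.setdefault(resource, {}).setdefault(scope, []).append(span)
                added += 1
    return added

batch = {"svc-a": {"lib": ["span-1"]}}
before = item_count(batch)

# An empty envelope: a resource with no scopes, and one with an empty scope.
added = merge_into_batch(batch, {"svc-b": {}, "svc-c": {"lib": []}})
after = item_count(batch)
```

In this model, merging the empty request is indistinguishable from acknowledging it and doing nothing, which is the sense in which immediate success is equivalent.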

@jmacd
Contributor

jmacd commented Nov 15, 2024

Additional notes:

The OTel-Arrow exporter will eliminate the empty envelopes as part of its optimization process.

The groupbyattrs processor appears to eliminate empty envelopes.

@isaaczinda
Author

isaaczinda commented Nov 22, 2024

I investigated whether various OpenTelemetry implementations drop telemetry containing a resource but no scopes or data points. For brevity, I’m referring to this situation as “empty envelopes.”

@jmacd @tigrannajaryan let me know if there are additional areas that I should do an audit of. For example, I could look into all of the contrib components if that would be useful.

SDKs

I don’t think it’s possible to easily produce empty envelopes using the OTel SDKs. I’m sure this could be done with a custom processor, but I don’t see a way to do it with out-of-the-box parts. (Note that I only checked the Go SDK, but I assume the SDKs are similar enough that these findings generalize).

Core Collector Components

I did an audit of every core receiver, processor, and exporter.

| component name | are empty envelopes dropped or not? | code link |
| --- | --- | --- |
| batchprocessor | dropped | code link |
| memorylimiterprocessor | not dropped | |
| otlphttpexporter | dropped because exporterbatcher drops empty envelopes, and is enabled by default | |
| otlpexporter | dropped because exporterbatcher drops empty envelopes, and is enabled by default | |
| debugexporter | dropped because exporterbatcher drops empty envelopes, and is enabled by default | |
| otlpreceiver | dropped because the Export method drops empty envelopes | both gRPC and HTTP receivers drop empty envelopes |

Across the board, meta monitoring of the processing of empty envelopes is misleading. This is because most telemetry tracks the item count, which is the number of spans / metric points / log records in the payload. Empty envelopes always have an item count of 0, so their processing is often not recorded.
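A small sketch of why item-based counters hide empty envelopes. `Stats` and `record` are hypothetical names, and the Collector's real self-telemetry is not reproduced here; the point is only that a counter keyed on item count cannot distinguish "received an empty envelope" from "received nothing".

```python
from dataclasses import dataclass

@dataclass
class Stats:
    requests_seen: int = 0
    items_seen: int = 0

def record(stats, log_records_per_resource):
    """Record one request; the argument holds one list of log records per resource."""
    stats.requests_seen += 1
    stats.items_seen += sum(len(records) for records in log_records_per_resource)

stats = Stats()
record(stats, [[], []])               # a request with two empty envelopes
items_after_empty = stats.items_seen  # still zero: the request left no trace here
record(stats, [["log-1", "log-2"]])   # a request with two log records
```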

Any exporter created with the exporterhelper.New[Metrics|Logs|Traces] helper will drop empty envelopes by default, because the exporterbatcher does this.

@tigrannajaryan
Member

Thanks for the research @isaaczinda

Given the current state of implementations I think a possible way forward is to modify the spec to say:

  • Senders SHOULD NOT create empty envelopes (OTLP payloads that contain zero spans, zero metric points or zero log records).
  • Receivers MAY ignore empty envelopes.
  • Implementations that receive and send (forward) OTLP payloads MAY drop empty envelopes.

I deliberately avoided MUSTs since we don't want to make any existing implementations illegal. The purpose of the change is to make sure recipients don't rely on the existence of empty envelopes.
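For a forwarding component, the proposed wording could be exercised roughly as follows. This is a sketch under assumptions: the payload is modeled as a list of `(resource_attrs, [(scope_name, [items])])` tuples rather than real OTLP messages, and `prune_empty` and `forward` are hypothetical names.

```python
def prune_empty(payload):
    """Return a copy of the payload without empty scopes and empty resources."""
    pruned = []
    for resource_attrs, scopes in payload:
        kept_scopes = [(name, items) for name, items in scopes if items]
        if kept_scopes:
            pruned.append((resource_attrs, kept_scopes))
    return pruned

def forward(payload, send, drop_empty=True):
    """Forward a payload; with drop_empty=True the forwarder uses its MAY-drop option."""
    out = prune_empty(payload) if drop_empty else payload
    if out:  # nothing left to send is treated as a successful no-op
        send(out)
    return out

sent = []
payload = [
    ({"service.name": "a"}, []),                        # empty envelope
    ({"service.name": "b"}, [("lib", ["span-1"])]),     # carries one span
]
forward(payload, sent.append)
```

Because dropping is only permitted, not required, `drop_empty=False` is equally conformant; a downstream recipient simply cannot rely on either behavior.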

The use case that you have (k8s attributes, etc) should be served by entities in the future.

Thoughts?

@isaaczinda
Author

From my perspective as an OTel user, this clarification is exactly what I need! This wording makes it clear that empty envelopes can exist, but there's no guarantee whether they'll be propagated or not. Consequently, building any telemetry pipeline that depends on empty envelopes would be foolish.

One thought: I've been using the term "empty envelopes" for brevity, but I imagine in the actual spec you'll want to say something like "OTLP with no log record, span, or metric data point inside it." It's a bit wordy, but I think this type of specificity is definitely helpful :)

@tigrannajaryan tigrannajaryan added the help wanted Extra attention is needed label Dec 12, 2024
@tigrannajaryan
Member

This can move forward with a PR that makes the relevant changes.

3 participants