Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Q: how to propagate ecto context from/to custom worker metrics? #425

Open
elvanja opened this issue Dec 5, 2024 · 3 comments
Open

Q: how to propagate ecto context from/to custom worker metrics? #425

elvanja opened this issue Dec 5, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@elvanja
Copy link
Contributor

elvanja commented Dec 5, 2024

So first up, apologies for making this a feature request kind of issue.

I have workers (GenServer) which emit custom telemetry events and use Ecto.
Spans from those workers telemetry are not connected to related Ecto spans.

Currently it looks like this (genserver setup and actual measurements omitted for brevity).
Basically we emit telemetry events from worker and attach to them to be able to emit open telemetry spans.
While this works, it does not connect ecto spans with the worker spans.
And ideally, those would be connected / parented correctly.

defmodule YAAPP.Telemetry do
  def telemetry_start(namespace, event, metadata) do
    start_time = System.monotonic_time()

    execute_telemetry_event(namespace, [event | [:start]], metadata)

    {namespace, event, start_time, metadata}
  end

  def telemetry_stop({namespace, event, start_time, metadata}, new_metadata \\ %{}) do
    metadata = Map.merge(metadata, new_metadata)
    
    execute_telemetry_event(namespace, [event | [:stop]], metadata)
  end

  defp execute_telemetry_event(namespace, event, metadata) do
    :telemetry.execute([:yaapp, namespace | event], %{}, metadata)
  end
end

defmodule YAAPP.Worker do
  use GenServer

  def handle_info(:do_work, state) do
    telemetry = YAAPP.Telemetry.telemetry_start(:worker, :do_work, %{state: state})
    result = do_ecto_stuff()
    YAAPP.Telemetry.telemetry_stop(telemetry, %{state: state, result: result})
    {:noreply, state}
  end

  defp telemetry_start(state, event) do
    YAAPP.Telemetry.telemetry_start(:indexer, event, telemetry_metadata(state))
  end
end

defmodule YAAPP.Otel do
  require OpenTelemetry.Tracer

  def setup do
    :telemetry.attach_many(
      {__MODULE__, :worker_events},
      [[:worker, :do_work, :stop]],
      &__MODULE__.handle_service_event/4,
      %{}
    )
  end

  def handle_service_event(name, metrics, metadata, other) do
    [_ | [function | _] = span_name] = Enum.reverse(name)
    span_id = span_name |> Enum.reverse() |> Enum.join(".")

    OpenTelemetry.Tracer.start_span(span_id, %{})
    |> OpenTelemetry.Span.end_span()

    OpenTelemetry.Ctx.clear()
  end
end

I tried fetching current context in YAAPP.Telemetry, e.g. via OpenTelemetry.Ctx.get_current() and OpentelemetryProcessPropagator.fetch_parent_ctx(1, :"$callers") and propagating via telemetry event metadata. But, those are empty, as expected. Tried then to wrap telemetry event in an otel span, like this:

defmodule YAAPP.Telemetry do
  # ....
  defp execute_telemetry_event(namespace, event, metadata, measurements \\ %{}) do
    OpenTelemetry.Tracer.with_span(__MODULE__) do
      metadata = metadata
                 |> Map.put(:ctx, OpenTelemetry.Ctx.get_current())
                 |> Map.put(:parent_ctx, OpentelemetryProcessPropagator.fetch_parent_ctx(1, :"$callers"))

      :telemetry.execute([:yaapp, namespace | event], measurements, metadata)
    end
  end
end

defmodule YAAPP.Otel do
  # ....

  def handle_service_event(name, metrics, metadata, other) do
    # ....

    # context fetched via metadata
    # parent_ctx is :undefined
    OpenTelemetry.Ctx.attach(metadata.ctx)    

    OpenTelemetry.Tracer.start_span(span_id, %{})
    # ....
  end
end

But, that also did not work. E.g. I still don't see correct parenting of those traces (in Grafana/Tempo for my case).

Any suggestions? 🙇🏼

@elvanja elvanja added the enhancement New feature or request label Dec 5, 2024
@bryannaegele
Copy link
Collaborator

bryannaegele commented Dec 6, 2024

def handle_service_event(name, metrics, metadata, other) do
    [_ | [function | _] = span_name] = Enum.reverse(name)
    span_id = span_name |> Enum.reverse() |> Enum.join(".")

    OpenTelemetry.Tracer.start_span(span_id, %{})
    |> OpenTelemetry.Span.end_span()

    OpenTelemetry.Ctx.clear()
  end

This is just creating a trace after the fact, so you aren't actually running a trace when your ecto queries are running. You'll have to handle the start event to start the span, attach, etc and a stop event to end the span.

If this is app code and not a library, we suggest just using otel directly. It would remove a lot of the complexity.

@elvanja
Copy link
Contributor Author

elvanja commented Dec 6, 2024

Oh, OK, that makes sense, basically need to wrap Ecto calls with span for correct parent context to kick in. Kind of like:

domain logic
start custom span
     ||
     ||
     ecto query
     ecto emits telemetry
            ||
            ||
            opentelemetry ecto captures ecto telemetry
            submits new otel span, inheriting custom span context
     ||     
     ||
end custom span

And if we want to keep using separate OTEL logic that just tracks ecto telemetry, it would need to look something like:

domain logic
     ecto query
     ecto emits telemetry
            ||
            ||
            opentelemetry ecto captures ecto telemetry
            submits new otel span, inheriting custom span context

otel tracker logic, setup:
     attach to ecto telemetry, both `:start` and `:stop`

otel tracker logic, handle start:
       ||
       ||
       start custom span

otel tracker logic, handle stop:
       ||
       ||
       submits span

If that is correct, then I'd still like to see if how this would be done without changing domain logic.
For handling ecto :start telemetry event I imagine something like:

OpenTelemetry.Tracer.start_span(span_id, %{
  attributes: %{some: data},
  start_time: :opentelemetry.timestamp()
})

And for handling ecto :stop:

OpenTelemetry.Tracer.end_span(:opentelemetry.timestamp())
OpenTelemetry.Ctx.clear()

But something is not working correctly there, likely context is not correctly set.
Is there an example somewhere on how this is done?

And one thing that came to mind: how can we be sure that handling ecto telemetry :start occurs before opentelemetry ecto's handler for the same event? Because if we can't guarantee that, then it is basically impossible to do anything but use OTEL spans directly in business logic, right?

@elvanja
Copy link
Contributor Author

elvanja commented Dec 6, 2024

we suggest just using otel directly

Trying this approach with something like:

defmodule YAAPP.Telemetry do
  def telemetry_start(namespace, event, metadata) do
    OpenTelemetry.Tracer.start_span([namespace | [event]], %{
      start_time: :opentelemetry.timestamp(),
      kind: :server
    })

    start_time = System.monotonic_time()

    execute_telemetry_event(namespace, [event | [:start]], metadata)

    {namespace, event, start_time, metadata}
  end

  def telemetry_stop({namespace, event, start_time, metadata}, new_metadata \\ %{}) do
    metadata = Map.merge(metadata, new_metadata)
    
    result = execute_telemetry_event(namespace, [event | [:stop]], metadata)

    OpenTelemetry.Tracer.end_span(:opentelemetry.timestamp())

    result
  end

  defp execute_telemetry_event(namespace, event, metadata) do
    :telemetry.execute([:yaapp, namespace | event], %{}, metadata)
  end
end

Idea is to use telemetry start/stop functions to start/stop OTEL span too. But that is also not working, at least as far as I can tell. Any pointers there?

If it helps, OpenTelemetry.Ctx.get_current() in telemetry_stop yields:

%{
  {:otel_tracer, :span_ctx} => :undefined
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants