Telemetry design #11175

JanProvaznik · 2024-12-19T17:19:28Z

Context

Writeup of proposed telemetry implementation based on experimentation in #11084

JanKrivanek

Looks good! Thanks!!

JanKrivanek · 2024-12-23T14:05:11Z

documentation/specs/proposed/VS-OpenTelemetry.md

+
+### Security
+
+- Providing a method for creating a hook in Framework MSBuild


Suggested change

- Providing a method for creating a hook in Framework MSBuild

- Providing and/or documenting a method for creating a hook in Framework MSBuild

JanKrivanek · 2024-12-23T14:11:21Z

documentation/specs/proposed/VS-OpenTelemetry.md

+### Security
+
+- Providing a method for creating a hook in Framework MSBuild
+- document the security implications of hooking custom telemetry Exporters/Collectors in Framework


Suggested change

- document the security implications of hooking custom telemetry Exporters/Collectors in Framework

- If a custom hooking solution is used, document the security implications of hooking custom telemetry Exporters/Collectors in Framework

Since we plan to use AppDomainManager, we are using an existing solution that is outside of our trust boundaries.

JanKrivanek · 2024-12-23T14:15:09Z

documentation/specs/proposed/VS-OpenTelemetry.md

+### Data handling
+
+- Implement head [Sampling](https://opentelemetry.io/docs/concepts/sampling/) with the granularity of a MSBuild.exe invocation/VS instance.
+- VS Data handle tail sampling in their infrastructure not to overwhelm storage with a lot of build events.


As discussed, we should not prevent ourselves from being able to add (in future versions):

- different sampling rates for different namespaces/activities
- the ability to configure the overall and per-namespace sampling rates from the server side (e.g. storing them in the .msbuild folder in the user profile when the values set from the server side differ from the defaults; this would obviously have a delay of about the default sample rate's number of executions)
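
To make the idea concrete, here is a minimal sketch of per-namespace head sampling with a locally stored override. Everything in it is an assumption for illustration: the JSON layout, the `load_rates`/`rate_for`/`should_sample` names, and the default rate are hypothetical and not part of the proposed spec.

```python
import json
import random
from pathlib import Path

# Hypothetical default; the actual rates are still an open question in this PR.
DEFAULT_RATE = 1 / 26_000


def load_rates(override_file, default_rate=DEFAULT_RATE, namespace_rates=None):
    """Merge built-in defaults with an optional JSON override file.

    Assumed file layout: {"default": 0.001, "namespaces": {"Some.Namespace": 1.0}}
    """
    rates = {"default": default_rate, "namespaces": dict(namespace_rates or {})}
    path = Path(override_file)
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        rates["default"] = float(data.get("default", rates["default"]))
        rates["namespaces"].update(
            {name: float(rate) for name, rate in data.get("namespaces", {}).items()}
        )
    return rates


def rate_for(rates, namespace):
    """Return the rate of the longest matching namespace prefix, else the default."""
    best = None
    for name in rates["namespaces"]:
        if namespace == name or namespace.startswith(name + "."):
            if best is None or len(name) > len(best):
                best = name
    return rates["default"] if best is None else rates["namespaces"][best]


def should_sample(rates, namespace, rng=random.random):
    """Head-sampling decision, intended to be made once per invocation."""
    # rng() is in [0, 1), so a rate of 1.0 always samples and 0.0 never does.
    return rng() < rate_for(rates, namespace)
```

A rate of `1.0` for a namespace corresponds to "always" and `0.0` to "never", which also covers the per-Activity override discussed for the "Looking ahead" section.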

JanKrivanek · 2024-12-23T14:17:17Z

documentation/specs/proposed/VS-OpenTelemetry.md

+
+## Looking ahead
+
+- Create a way of using a "HighPrioActivitySource" which would override sampling and initialize Collector in MSBuild.exe scenario/tracerprovider in VS.


More generally: a sample rate per Activity/namespace (higher, even always; or lower, even never)

JanKrivanek · 2024-12-23T16:21:10Z

documentation/specs/proposed/VS-OpenTelemetry.md

+## Uncertainties
+
+- Configuring tail sampling in VS telemetry server side infrastructure to not overflow them with data.
+- How much head sampling.


We can just ballpark some rates, or we could apply a bit of the statistics behind sample size determination: https://en.wikipedia.org/wiki/Sample_size_determination

E.g. for proportion estimation (of a fairly common occurrence in the builds), with a not very strict confidence level (let's say 95% is awesome for us now), margin of error (5% is very acceptable for us), and quite a high population size (let's estimate the number of total daily build events to be between 10M and 100M [while in fact much closer to the upper bound]), we would be very fine with a sampling rate of 1 in 26,000.

Sample table of sample sizes for proportion hypotheses: https://www.research-advisors.com/images/subpage/SSTable.jpg
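
For reference, the estimate above can be reproduced with a short sketch. It assumes Cochran's formula for a proportion with a finite population correction and the worst-case proportion of 0.5; the function name and defaults are illustrative only.

```python
import math


def sample_size(z=1.96, margin=0.05, proportion=0.5, population=None):
    """Sample size for estimating a proportion (Cochran's formula).

    z=1.96 corresponds to a 95% confidence level; proportion=0.5 is the
    worst case. If population is given, apply the finite population correction.
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


for population in (10_000_000, 100_000_000):
    n = sample_size(population=population)
    print(f"population={population:,}: sample size {n}, about 1 in {population // n:,}")
```

Under these assumptions the required sample is about 385 builds per day, which for 10M daily build events works out to roughly 1 in 26,000, and for 100M to roughly ten times sparser sampling.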

For rarer events (runaway builds, custom tasks, etc.) we'd need to adjust appropriately to capture at least a couple hundred datapoints daily. That should still allow for considerably small sampling rates and hence low impact on the observed builds.
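
A rough way to size the rate for rare events, as a sketch: divide the desired daily datapoint count by the expected number of matching builds. The `required_rate` helper and the 1% event fraction in the usage line are hypothetical; the real frequency of such events is not known from this thread.

```python
def required_rate(target_datapoints, daily_events, event_fraction):
    """Smallest sampling rate expected to yield the target number of datapoints."""
    expected_matching = daily_events * event_fraction
    if expected_matching <= 0:
        raise ValueError("no matching events expected")
    # A rate cannot exceed 1.0 (sample everything).
    return min(1.0, target_datapoints / expected_matching)


# Hypothetical example: 200 datapoints/day from an event seen in 1% of 100M builds.
rate = required_rate(200, 100_000_000, 0.01)
print(f"required rate: about 1 in {round(1 / rate):,}")
```

This only gives the expected count; the actual daily number would vary around it, so some headroom above the bare minimum is probably sensible.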

Btw. this might also be a partial answer to some of the open questions below around perf: if we are not able to get the perf to be sufficient for regular executions, but it is still around the 'human noticeable threshold' (~100ms per various UX research), we might just choose to pay the cost in a very low number of cases.

JanProvaznik added 2 commits December 19, 2024 18:14

write up

dc06f8d

rename, add some details

4c0605b

JanKrivanek approved these changes Dec 23, 2024

JanKrivanek reviewed Dec 23, 2024
