How to Fine Tune Application Insights Sampling

Photo by Tima Miroshnichenko: https://www.pexels.com/photo/graph-displayed-on-laptop-screen-7567486/

Sampling is a feature in Application Insights that reduces the amount of data sent to App Insights. This means that not all logs, dependencies, requests, and so on are sent to Application Insights. Adaptive sampling is enabled by default in all the latest versions of the Application Insights ASP.NET and ASP.NET Core Software Development Kits (SDKs), and in Azure Functions. We will look at what adaptive sampling does later, but basically it keeps the telemetry volume under control while trying to retain a representative picture of the app, even when the app is under heavy load.

Telemetry Throttling

Why do we need sampling? If your application sends a lot of telemetry data (dependencies, logs, traces…), Azure might have problems receiving all of it and App Insights starts throttling. This can of course be avoided by limiting the amount of data we send. There are a few ways in which App Insights might throttle.

The first is the daily cap. You can configure the amount of data App Insights can receive per day. If you exceed this amount, App Insights stops receiving data (and eventually it will start throwing exceptions about it). Another reason for throttling is the number of events. App Insights can handle 32,000 events/second, and if you exceed this, some of the data will be lost. You can read more about App Insights service limits on the Azure Monitor service limits page.

SDK Sampling

Application Insights supports three types of sampling. The first two run in the SDK, and the third runs on the service side:

  1. Adaptive Sampling
  2. Fixed-Rate Sampling
  3. Ingestion Sampling

Let’s dig into what these mean to get a better understanding of how data sampling in App Insights works.

Adaptive sampling

Adaptive sampling is the hardest one to understand. It is a dynamic sampling technique used to manage the volume of telemetry data sent to Application Insights. Unlike fixed-rate sampling, where a constant percentage of data is sampled, adaptive sampling adjusts the sampling percentage in real time based on how much telemetry the app is producing. The SDK tries to keep the volume within a configured maximum rate (the MaxTelemetryItemsPerSecond setting). During periods of low activity, little or nothing is dropped. As the volume grows, the sampling percentage is lowered so that fewer items are sent. Note that the adjustment is driven by telemetry volume, not by things like errors or CPU usage. Related items (for example a request together with its dependencies and exceptions) are kept or dropped together, which helps when you investigate performance bottlenecks, errors, or other critical events.
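To make the idea more concrete, here is a minimal Python sketch of a volume-driven controller. It is not the SDK's actual algorithm (the real processor works with moving averages and evaluation intervals). The function name is hypothetical, and the defaults of 5 items per second, 0.1% and 100% are assumptions modelled on the SDK settings MaxTelemetryItemsPerSecond, MinSamplingPercentage and MaxSamplingPercentage.

```python
def target_sampling_percentage(generated_per_sec,
                               max_items_per_sec=5.0,
                               min_pct=0.1,
                               max_pct=100.0):
    """Return the percentage of items to keep so that roughly
    max_items_per_sec items are sent (simplified, hypothetical model)."""
    # Below the target volume nothing needs to be dropped.
    if generated_per_sec <= max_items_per_sec:
        return max_pct
    # Above the target volume, keep only the share that fits the budget.
    pct = 100.0 * max_items_per_sec / generated_per_sec
    # Never go outside the configured bounds.
    return max(min_pct, min(max_pct, pct))


if __name__ == "__main__":
    for load in (2, 50, 500):
        print(load, "items/s ->", target_sampling_percentage(load), "% kept")
```

The point of the sketch is the direction of the adjustment: the more telemetry the app produces, the smaller the share that is kept.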

The problem with adaptive sampling is that it doesn’t necessarily give you a clear picture of what is happening inside your app. For example, if you want to compare the number of dependencies during “silent hours” with busier hours, the raw row counts can mislead you, because the share of retained data changes over time. Still, I think adaptive sampling is a good choice in most cases. It is also the default in all the latest App Insights SDKs.
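One way to get around misleading row counts is to weight each stored row by its itemCount value, which tells how many original items the row represents. The sketch below uses made-up data and a hypothetical helper name; it only illustrates the arithmetic.

```python
from collections import defaultdict


def raw_and_estimated_counts(rows):
    """rows: iterable of (hour, item_count) tuples for stored telemetry.

    Returns two dicts: stored rows per hour, and the estimated number of
    original items per hour (sum of itemCount)."""
    raw = defaultdict(int)
    estimated = defaultdict(int)
    for hour, item_count in rows:
        raw[hour] += 1                 # what a plain count() would show
        estimated[hour] += item_count  # what sum(itemCount) would show
    return dict(raw), dict(estimated)


if __name__ == "__main__":
    # Made-up data: a quiet hour kept in full, a busy hour sampled at 5%.
    rows = [("03:00", 1)] * 40 + [("12:00", 20)] * 50
    raw, estimated = raw_and_estimated_counts(rows)
    print("stored rows:", raw)
    print("estimated originals:", estimated)
```

With this data the stored rows look almost the same for both hours (40 and 50), while the estimated original counts are 40 and 1000.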

Fixed-Rate Sampling

In fixed-rate sampling, a constant or fixed percentage of telemetry data is selected and sent to the service, regardless of the overall volume of data generated by your application. This is in contrast to dynamic or adaptive sampling, where the sampling rate may vary based on factors such as traffic or resource usage. For example, you might configure a fixed-rate sampling of 10%, meaning that 10% of the telemetry data generated by your application will be sent to Application Insights.

It’s important to note that with fixed-rate sampling, you are trading off some level of data detail for cost savings. Since only a fixed percentage of data is ingested, you may not capture every telemetry event or log message.
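A fixed-rate decision can be sketched as a deterministic hash check. This is an illustration only: the SDK uses its own hashing, and SHA-256, the function name and the choice of the operation ID as the key are assumptions made for this example. Hashing a shared key instead of rolling a random number means that all items of the same operation get the same decision.

```python
import hashlib


def is_sampled_in(operation_id: str, sampling_percentage: float) -> bool:
    """Decide deterministically whether items with this operation ID are kept."""
    digest = hashlib.sha256(operation_id.encode("utf-8")).digest()
    # Map the hash to a score in [0.00, 99.99] with two decimals.
    score = (int.from_bytes(digest[:8], "big") % 10000) / 100.0
    return score < sampling_percentage


if __name__ == "__main__":
    ids = [f"op-{i}" for i in range(10000)]
    kept = sum(is_sampled_in(op_id, 10.0) for op_id in ids)
    print(f"kept {kept} of {len(ids)} operations at 10%")
```

At 100% every item passes the check, and at 0% none does.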

Ingestion Sampling

Ingestion sampling works at the Application Insights service endpoint. It only applies when no other sampling is in effect. If the SDK samples your telemetry, ingestion sampling is disabled. When you enable ingestion sampling, your app still sends all of its telemetry, and the service randomly selects a subset of it to retain. This means that ingestion sampling does not reduce the traffic sent from your app; it reduces the amount of data that is processed and stored.

Cost Control: By reducing the amount of data ingested and stored, ingestion sampling helps control the cost associated with using Application Insights. You are billed based on the volume of data ingested and the storage duration, so reducing the volume of data can lead to cost savings.
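The rule that ingestion sampling only applies when the SDK is not sampling can be written down as a tiny function. This is a sketch of the rule as described above, with a hypothetical function name, and not code from any SDK.

```python
def effective_retained_percentage(sdk_pct: float, ingestion_pct: float) -> float:
    """Return the share of telemetry that ends up stored.

    sdk_pct: sampling percentage applied in the SDK (100 means no SDK sampling).
    ingestion_pct: ingestion sampling percentage configured on the service side."""
    if sdk_pct < 100.0:
        # The SDK already sampled the data, so ingestion sampling is ignored.
        return sdk_pct
    # No SDK sampling in effect: the service-side setting decides.
    return ingestion_pct


if __name__ == "__main__":
    print(effective_retained_percentage(100.0, 50.0))  # ingestion sampling applies
    print(effective_retained_percentage(10.0, 50.0))   # SDK sampling wins
```

Note that the two percentages are not multiplied together; only one of them is in effect at a time.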

How to Prevent Sampling

One of the most common questions I get about Application Insights is: how can I prevent App Insights from sampling my data? As shown above, sampling can occur for multiple reasons, and if you, for example, hit throttling, data is automatically “sampled”. If you want to completely disable sampling in the App Insights SDK (config file), you should configure fixed-rate sampling with a constant sampling rate of 100%.

Disable adaptive sampling in ApplicationInsights.config by removing or commenting out the AdaptiveSamplingTelemetryProcessor node. Without that processor the SDK sends every telemetry item, which has the same effect as fixed-rate sampling at 100%.

Comment out or remove the AdaptiveSamplingTelemetryProcessor to avoid data sampling

Am I Being Sampled?

How can I know if my data has been sampled? There is an easy way to query the sampling percentage from the Log Analytics workspace (or the Application Insights Logs tab). Run the following KQL query; if you see any values other than 100, your data is being sampled.

union requests,dependencies,pageViews,browserTimings,exceptions,traces
| where timestamp > ago(24h)
| summarize RetainedPercentage = 100/avg(itemCount) by bin(timestamp, 1h), itemType
| order by itemType, timestamp asc

100% of the events are retained when sampling is disabled.
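To see why 100/avg(itemCount) gives the retained percentage, here is the same arithmetic as a small Python sketch with made-up rows. The helper name is hypothetical, and the time binning from the query is left out to keep it short.

```python
from collections import defaultdict


def retained_percentage(rows):
    """rows: iterable of (item_type, item_count) tuples.

    Returns {item_type: 100 / avg(itemCount)} like the KQL summarize."""
    totals = defaultdict(lambda: [0, 0])  # item_type -> [sum of itemCount, rows]
    for item_type, item_count in rows:
        totals[item_type][0] += item_count
        totals[item_type][1] += 1
    # 100 / (sum / rows) is the same as 100 * rows / sum.
    return {t: 100.0 * n / s for t, (s, n) in totals.items()}


if __name__ == "__main__":
    rows = [
        ("request", 1), ("request", 1), ("request", 1),  # not sampled
        ("dependency", 4), ("dependency", 4),            # each row stands for 4 items
        ("trace", 10),                                   # one row stands for 10 items
    ]
    print(retained_percentage(rows))
```

Each stored row with itemCount 4 represents four original items, so only a quarter of them were retained, which is the 25 the calculation returns for the dependency rows.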

My Error Messages are Sampled!

You might have a feeling, or even proof, that App Insights is sampling your errors/exceptions, even though you expect it to preserve these important messages. One thing to note here is that if you write error-level log messages without an exception, they are actually traces. If you don’t pass an exception to Log.Error, by default it is just a trace. This can lead to a situation where heavy sampling tosses away your precious Log.Errors. Use the transaction search feature in App Insights to determine whether your Log.Error is categorized as a trace or an exception.

An error-level log write can be a trace or an exception, depending on whether an exception is passed to the log write
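The rule can be summed up in a few lines of Python. This is only a model of the behaviour described above, not code from any logging library; the function name and the table names it returns are used here for illustration.

```python
from typing import Optional


def telemetry_table(level: str, exception: Optional[BaseException] = None) -> str:
    """Return the table an error log write would land in (simplified model).

    The log level does not decide the telemetry type; the presence of an
    exception object does."""
    if exception is not None:
        return "exceptions"
    return "traces"


if __name__ == "__main__":
    print(telemetry_table("Error"))                      # no exception attached
    print(telemetry_table("Error", ValueError("boom")))  # exception attached
```

So an error-level message alone stays in traces, and it is sampled like any other trace.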

Summary

Sampling is an important feature in Application Insights. It’s a good way to reduce telemetry traffic, data costs, and storage costs while preserving a statistically correct analysis of application data. Sampling helps prevent Application Insights from throttling your telemetry. Sometimes sampling can be too aggressive, or the default settings are not suitable for your use case. In that case, you can use the ApplicationInsights.config file to configure the sampling rates and types. If you want to avoid sampling completely, disable adaptive sampling and use fixed-rate sampling with 100% of data retained. Just make sure that your app is not sending so much data that it gets throttled.

Remember to check my previous blog post for more tips and tricks about Application Insights.