percentilew(), percentilesw() (aggregation function)
Applies to: ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel
The percentilew()
function calculates a weighted estimate for the specified nearest-rank percentile of the population defined by expr. percentilesw()
works similarly to percentilew()
. However, percentilesw()
can calculate multiple weighted percentile values at once, which is more efficient than calculating each weighted percentile value separately.
Weighted percentiles calculate percentiles in a dataset by giving each value in the input dataset a weight. In this method, each value is considered to be repeated a number of times equal to its weight, which is then used to calculate the percentile. By giving more importance to certain values, weighted percentiles provide a way to calculate percentiles in a "weighted" manner.
To calculate unweighted percentiles, see percentiles().
Note
This function is used in conjunction with the summarize operator.
Syntax
percentilew(
expr,
weightExpr,
percentile)
percentilesw(
expr,
weightExpr,
percentiles)
Learn more about syntax conventions.
Parameters
Name | Type | Required | Description |
---|---|---|---|
expr | string |
✔️ | The expression to use for aggregation calculation. |
weightExpr | long |
✔️ | The weight to give each value. |
percentile | int or long | ✔️ | A constant that specifies the percentile. |
percentiles | int or long | ✔️ | One or more comma-separated percentiles. |
Returns
Returns a table with the estimates for expr of the specified percentiles in the group, each in a separate column.
Note
To return the percentiles in a single column, see Return percentiles as an array.
Examples
Calculate weighted percentiles
Assume you repetitively measure the time (Duration) it takes an action to complete. Instead of recording every value of the measurement, you record each value of Duration, rounded to 100 msec, and how many times the rounded value appeared (BucketSize).
Use summarize percentilesw(Duration, BucketSize, ...)
to calculate the given
percentiles in a "weighted" way. Treat each value of Duration as if it was repeated BucketSize times in the input, without actually needing to materialize those records.
The following example shows weighted percentiles.
Using the following set of latency values in milliseconds:
{ 1, 1, 2, 2, 2, 5, 7, 7, 12, 12, 15, 15, 15, 18, 21, 22, 26, 35 }
.
To reduce bandwidth and storage, do pre-aggregation to the
following buckets: { 10, 20, 30, 40, 50, 100 }
. Count the number of events in each bucket to produce the following table:
let latencyTable = datatable (ReqCount:long, LatencyBucket:long)
[
8, 10,
6, 20,
3, 30,
1, 40
];
latencyTable
The table displays:
- Eight events in the 10-ms bucket (corresponding to subset
{ 1, 1, 2, 2, 2, 5, 7, 7 }
) - Six events in the 20-ms bucket (corresponding to subset
{ 12, 12, 15, 15, 15, 18 }
) - Three events in the 30-ms bucket (corresponding to subset
{ 21, 22, 26 }
) - One event in the 40-ms bucket (corresponding to subset
{ 35 }
)
At this point, the original data is no longer available. Only the number of events in each bucket. To compute percentiles from this data, use the percentilesw()
function.
For the 50, 75, and 99.9 percentiles, use the following query:
let latencyTable = datatable (ReqCount:long, LatencyBucket:long)
[
8, 10,
6, 20,
3, 30,
1, 40
];
latencyTable
| summarize percentilesw(LatencyBucket, ReqCount, 50, 75, 99.9)
Output
percentile_LatencyBucket_50 | percentile_LatencyBucket_75 | percentile_LatencyBucket_99_9 |
---|---|---|
20 | 20 | 40 |
Return percentiles as an array
Instead of returning the values in individual columns, use the percentilesw_array()
function to return the percentiles in a single column of dynamic array type.
Syntax
percentilesw_array(
expr,
weightExpr,
percentiles)
Parameters
Name | Type | Required | Description |
---|---|---|---|
expr | string |
✔️ | The expression to use for aggregation calculation. |
percentiles | int, long, or dynamic | ✔️ | One or more comma-separated percentiles or a dynamic array of percentiles. Each percentile can be an integer or long value. |
weightExpr | long |
✔️ | The weight to give each value. |
Returns
Returns an estimate for expr of the specified percentiles in the group as a single column of dynamic array type.
Examples
Comma-separated percentiles
let latencyTable = datatable (ReqCount:long, LatencyBucket:long)
[
8, 10,
6, 20,
3, 30,
1, 40
];
latencyTable
| summarize percentilesw_array(LatencyBucket, ReqCount, 50, 75, 99.9)
Output
percentile_LatencyBucket |
---|
[20, 20, 40] |
Dynamic array of percentiles
let latencyTable = datatable (ReqCount:long, LatencyBucket:long)
[
8, 10,
6, 20,
3, 30,
1, 40
];
latencyTable
| summarize percentilesw_array(LatencyBucket, ReqCount, dynamic([50, 75, 99.9]))
Output
percentile_LatencyBucket |
---|
[20, 20, 40] |