Monitor performance with query store

APPLIES TO: Azure Database for PostgreSQL - Flexible Server

Query store is a feature in Azure Database for PostgreSQL flexible server that provides a way to track query performance over time. Query store simplifies the troubleshooting of performance issues by helping you quickly find the longest running and most resource-intensive queries. Query store automatically captures a history of queries and runtime statistics, and retains them for your review. It slices the data by time so that you can see temporal usage patterns. Data for all users, databases, and queries is stored in a database named azure_sys in the Azure Database for PostgreSQL flexible server instance.

Enable query store

Query store is available to use with no extra charges. It's an opt-in feature, so it isn't enabled by default on a server. Query store can be enabled or disabled globally for all databases on a given server and can't be turned on or off per database.

Important

Do not enable query store on Burstable pricing tier as it would cause performance impact.

Enable query store in Azure portal

  1. Sign in to the Azure portal and select your Azure Database for PostgreSQL flexible server instance.
  2. Select Server parameters in the Settings section of the menu.
  3. Search for the pg_qs.query_capture_mode parameter.
  4. Set the value to top or all, depending on whether you want to track top-level queries or also nested queries (the ones which execute inside a function or procedure), and select Save. Allow up to 20 minutes for the first batch of data to persist in the azure_sys database.

Enable query store wait sampling

  1. Search for the pgms_wait_sampling.query_capture_mode parameter.
  2. Set the value to all and Save.

Information in query store

Query store consists of two stores:

  1. A runtime stats store for persisting the query execution statistics information.
  2. A wait stats store for persisting wait statistics information.

Common scenarios for using query store include:

  • Determining the number of times a query was executed in a given time window.
  • Comparing the average execution time of a query across time windows to see large variations.
  • Identifying longest running queries in the past few hours.
  • Identifying top N queries that are waiting on resources.
  • Understanding nature of the waits for a particular query.

To minimize space usage, the runtime execution statistics in the runtime stats store are aggregated over a fixed, configurable time window. The information in these stores can be queried using views.

Access query store information

Query store data is stored in the azure_sys database on your Azure Database for PostgreSQL flexible server instance. The following query returns information about queries which were recorded in query store:

SELECT * FROM  query_store.qs_view;

And this query returns information about waiting statistics:

SELECT * FROM  query_store.pgms_wait_sampling_view;

Find wait queries

Wait event types combine different wait events into buckets by similarity. Query store provides the wait event type, specific wait event name, and the query in question. Being able to correlate this wait information with the query runtime statistics means you can gain a deeper understanding of what contributes to query performance characteristics.

Here are some examples of how you can gain more insights into your workload using the wait statistics in query store:

Observation Action
High lock waits Check the query texts for the affected queries and identify the target entities. Look in query store for other queries which are executed frequently and/or have high duration and are modifying the same entity. After identifying these queries, consider changing the application logic to improve concurrency, or use a less restrictive isolation level.
High buffer IO waits Find the queries with a high number of physical reads in query store.
High memory waits Find the top memory consuming queries in query store. These queries are probably delaying further progress of the affected queries.

Configuration options

When query store is enabled, it saves data in aggregation windows of length determined by the pg_qs.interval_length_minutes server parameter (defaults to 15 minutes). For each window, it stores up to 500 distinct queries per window. Attributes that distinguish the uniqueness of each query are user_id (identifier of the user who executes the query), db_id (identifier of the database in whose context the query executes), and query_id (an integer value uniquely identifying the query executed). If the number of distinct queries reaches 500 during the configured interval, 5% of the ones that are recorded are deallocated to make room for more. The ones deallocated first are the ones which were executed the least number of times.

The following options are available for configuring Query Store parameters:

Parameter Description Default Range
pg_qs.interval_length_minutes (*) Capture interval in minutes for query store. Defines the frequency of data persistence. 15 1 - 30
pg_qs.is_enabled_fs Internal Use Only: This parameter is used as a feature override switch. If it shows as off, query store is disabled, despite the value set for pg_qs.query_capture_mode. on on, off
pg_qs.max_plan_size Maximum number of bytes saved from query plan text by query store; longer plans are truncated. 7500 100 - 10000
pg_qs.max_query_text_length Maximum query length that can be saved; longer queries are truncated. 6000 100 - 10000
pg_qs.parameters_capture_mode Whether and when to capture query positional parameters. capture_parameterless_only capture_parameterless_only, capture_first_sample
pg_qs.query_capture_mode Statements to track. none none, top, all
pg_qs.retention_period_in_days Retention period window in days for query store. Older data is automatically deleted. 7 1 - 30
pg_qs.store_query_plans Whether query plans should be saved in query store. off on, off
pg_qs.track_utility Whether query store must track utility commands. on on, off

(*) Static server parameter which requires a server restart for a change in its value to take effect.

The following options apply specifically to wait statistics:

Parameter Description Default Range
pgms_wait_sampling.history_period Frequency, in milliseconds, at which wait events are sampled. 100 1 - 600000
pgms_wait_sampling.is_enabled_fs Internal Use Only: This parameter is used as a feature override switch. If it shows as off, wait sampling is disabled despite the value set for pgms_wait_sampling.query_capture_mode. on on, off
pgms_wait_sampling.query_capture_mode Which statements the pgms_wait_sampling extension must track. none none, all

Note

pg_qs.query_capture_mode supersedes pgms_wait_sampling.query_capture_mode. If pg_qs.query_capture_mode is none, the pgms_wait_sampling.query_capture_mode setting has no effect.

Use the Azure portal to get or set a different value for a parameter.

Views and functions

You can query the information recorded by query store and or delete it using some views and functions available in the query_store schema of the azure_sys database. Anyone in the PostgreSQL public role can use these views to see the data in query store. These views are only available in the azure_sys database.

Queries are normalized by looking at their structure and ignoring anything not semantically significant, like literals, constants, aliases, or differences in casing.

If two queries are semantically identical, even if they use different aliases for the same referenced columns and tables, they're identified with the same query_id. If two queries only differ in the literal values used in them, they're also identified with the same query_id. For queries identified with the same query_id, their sql_query_text is that of the query that executed first since query store started recording activity, or since the last time the persisted data was discarded because the function query_store.qs_reset was executed.

How query normalization works

Following are some examples to try to illustrate how this normalization works:

Say that you create a table with the following statement:

create table tableOne (columnOne int, columnTwo int);

You enable Query Store data collection, and a single or multiple users execute the following queries, in this exact order:

select * from tableOne;
select columnOne, columnTwo from tableOne;
select columnOne as c1, columnTwo as c2 from tableOne as t1;
select columnOne as "column one", columnTwo as "column two" from tableOne as "table one";

All the previous queries share the same query_id. And the text that Query Store keeps is that of the first query executed after enabling data collection. Therefore, it would be select * from tableOne;.

The following set of queries, once normalized, don't match the previous set of queries because the WHERE clause makes them semantically different:

select columnOne as c1, columnTwo as c2 from tableOne as t1 where columnOne = 1 and columnTwo = 1;
select * from tableOne where columnOne = -3 and columnTwo = -3;
select columnOne, columnTwo from tableOne where columnOne = '5' and columnTwo = '5';
select columnOne as "column one", columnTwo as "column two" from tableOne as "table one" where columnOne = 7 and columnTwo = 7;

However, all queries in this last set share the same query_id and the text used to identify them all is that of the first query in the batch select columnOne as c1, columnTwo as c2 from tableOne as t1 where columnOne = 1 and columnTwo = 1;.

Finally, find below some queries not matching the query_id of the ones in the previous batch, and the reason why they don't:

Query:

select columnTwo as c2, columnOne as c1 from tableOne as t1 where columnOne = 1 and columnTwo = 1;

Reason for not matching: List of columns refers to the same two columns (columnOne and ColumnTwo), but the order in which they're referred is reversed, from columnOne, ColumnTwo in the previous batch to ColumnTwo, columnOne in this query.

Query:

select * from tableOne where columnTwo = 25 and columnOne = 25;

Reason for not matching: Order in which the expressions evaluated in the WHERE clause are referred is reversed from columnOne = ? and ColumnTwo = ? in the previous batch to ColumnTwo = ? and columnOne = ? in this query.

Query:

select abs(columnOne), columnTwo from tableOne where columnOne = 12 and columnTwo = 21;

Reason for not matching: The first expression in the column list isn't columnOne anymore, but function abs evaluated over columnOne (abs(columnOne)), which isn't semantically equivalent.

Query:

select columnOne as "column one", columnTwo as "column two" from tableOne as "table one" where columnOne = ceiling(16) and columnTwo = 16;

Reason for not matching: The first expression in the WHERE clause doesn't evaluate the equality of columnOne with a literal anymore, but with the result of function ceiling evaluated over a literal, which isn't semantically equivalent.

Views

query_store.qs_view

This view returns all the data that is persisted in the supporting tables of query store. Data that is still recording in-memory for the currently active time window, isn't visible until the time window comes to an end, and its in-memory volatile data is collected and persisted to tables stored on disk. This view returns a different row for each distinct database (db_id), user (user_id), and query (query_id).

Name Type References Description
runtime_stats_entry_id bigint ID from the runtime_stats_entries table.
user_id oid pg_authid.oid OID of user who executed the statement.
db_id oid pg_database.oid OID of database in which the statement was executed.
query_id bigint Internal hash code, computed from the statement's parse tree.
query_sql_text varchar(10000) Text of a representative statement. Different queries with the same structure are clustered together; this text is the text for the first of the queries in the cluster. The default value for the maximum query text length is 6000, and can be modified using query store parameter pg_qs.max_query_text_length. If the text of the query exceeds this maximum value, it's truncated to the first pg_qs.max_query_text_length bytes.
plan_id bigint ID of the plan corresponding to this query.
start_time timestamp Queries are aggregated by time windows. Server parameter pg_qs.interval_length_minutes defines the time span of those windows (default is 15 minutes). This column corresponds to the start time of the window in which this entry was recorded.
end_time timestamp End time corresponding to the time window for this entry.
calls bigint Number of times the query executed in this time window. Notice that for parallel queries, the number of calls for each execution corresponds to 1 for the backend process that drives the execution of the query, plus as many other units for each backend worker process which launches to collaborate executing the parallel branches of the execution tree.
total_time double precision Total query execution time, in milliseconds.
min_time double precision Minimum query execution time, in milliseconds.
max_time double precision Maximum query execution time, in milliseconds.
mean_time double precision Mean query execution time, in milliseconds.
stddev_time double precision Standard deviation of the query execution time, in milliseconds.
rows bigint Total number of rows retrieved or affected by the statement. Notice that for parallel queries, the number of rows for each execution corresponds to the number of rows returned to the client by the backend process that drives the execution of the query, plus the sum of all rows that each backend worker process, launched to collaborate executing the parallel branches of the execution tree, returns to the backend process that drives the execution of the query.
shared_blks_hit bigint Total number of shared block cache hits by the statement.
shared_blks_read bigint Total number of shared blocks read by the statement.
shared_blks_dirtied bigint Total number of shared blocks dirtied by the statement.
shared_blks_written bigint Total number of shared blocks written by the statement.
local_blks_hit bigint Total number of local block cache hits by the statement.
local_blks_read bigint Total number of local blocks read by the statement.
local_blks_dirtied bigint Total number of local blocks dirtied by the statement.
local_blks_written bigint Total number of local blocks written by the statement.
temp_blks_read bigint Total number of temp blocks read by the statement.
temp_blks_written bigint Total number of temp blocks written by the statement.
blk_read_time double precision Total time the statement spent reading blocks, in milliseconds (if track_io_timing is enabled, otherwise zero).
blk_write_time double precision Total time the statement spent writing blocks, in milliseconds (if track_io_timing is enabled, otherwise zero).
is_system_query boolean Determines whether role with user_id = 10 (azuresu) executed the query. That user has superuser privileges and is used to perform control plane operations. Since this service is a managed PaaS service, only Microsoft is part of that superuser role.
query_type text Type of operation represented by the query. Possible values are unknown, select, update, insert, delete, merge, utility, nothing, undefined.
search_path text Value of search_path set at the time the query was captured.
query_parameters text Text representation of a JSON object with the values passed to the positional parameters of a parameterized query. This column only populates its value in two cases: 1) for nonparameterized queries. 2) For parameterized queries, when pg_qs.parameters_capture_mode is set to capture_first_sample, and if query store can fetch the values for the parameters of the query at execution time.
parameters_capture_status text Type of operation represented by the query. Possible values are succeeded (either the query wasn't parameterized or it was a parameterized query and values were successfuly captured), disabled (query was parameterized but, parameters weren't captured because pg_qs.parameters_capture_mode was set to capture_parameterless_only), too_long_to_capture (query was parameterized, but parameters weren't captured because the length of the resulting JSON that would be surfaced in the query_parameters column of this view, was considered excessively long for query store to persist), too_many_to_capture (query was parameterized, but parameters weren't captured because the total number of parameters, were considered excessive for query store to persist), serialization_failed (query was parameterized, but at least one of the values passed as a parameter couldn't be serialized to text).

query_store.query_texts_view

This view returns query text data in Query Store. There's one row for each distinct query_sql_text.

Name Type Description
query_text_id bigint ID for the query_texts table
query_sql_text varchar(10000) Text of a representative statement. Different queries with the same structure are clustered together; this text is the text for the first of the queries in the cluster.
query_type smallint Type of operation represented by the query. In version of PostgreSQL <= 14, possible values are 0 (unknown), 1 (select), 2 (update), 3 (insert), 4 (delete), 5 (utility), 6 (nothing). In version of PostgreSQL >= 15, possible values are 0 (unknown), 1 (select), 2 (update), 3 (insert), 4 (delete), 5 (merge), 6 (utility), 7 (nothing).

query_store.pgms_wait_sampling_view

This view returns wait events data in Query Store. This view returns a different row for each distinct database (db_id), user (user_id), query (query_id), and event (event).

Name Type References Description
start_time timestamp Queries are aggregated by time windows. Server parameter pg_qs.interval_length_minutes defines the time span of those windows (default is 15 minutes). This column corresponds to the start time of the window in which this entry was recorded.
end_time timestamp End time corresponding to the time window for this entry.
user_id oid pg_authid.oid Object identifier of user who executed the statement.
db_id oid pg_database.oid Object identifier of database in which the statement was executed.
query_id bigint Internal hash code, computed from the statement's parse tree.
event_type text The type of event for which the backend is waiting.
event text The wait event name if backend is currently waiting.
calls integer Number of times the same event was captured.

Note

For a list of possible values in the event_type and event columns of the query_store.pgms_wait_sampling_view view, refer to the official documentation of pg_stat_activity and look for the information referring to columns with the same names.

query_store.query_plans_view

This view returns the query plan that was used to execute a query. There's one row per each distinct database ID, and query ID. Query store only records query plans for nonutility queries.

Name Type References Description
plan_id bigint The hash value from the normalized query plan produced by EXPLAIN. It's in normalized form because it excludes the estimated costs of plan nodes and usage of buffers.
db_id oid pg_database.oid OID of database in which the statement was executed.
query_id bigint Internal hash code, computed from the statement's parse tree.
plan_text varchar(10000) Execution plan of the statement given costs=false, buffers=false, and format=text. Identical output as the one produced by EXPLAIN.

Functions

query_store.qs_reset

This function discards all statistics gathered so far by query store. It discards the statistics for already closed time windows, which are already persisted to on-disk tables. It also discards the statistics for the current time window, which only exist in-memory. Only members of the server admin role (azure_pg_admin) can execute this function.

query_store.staging_data_reset

This function discards all statistics gathered in-memory by query store (that is, the data in memory that isn't flushed yet to the on disk tables supporting persistence of collected data for query store). Only members of the server admin role (azure_pg_admin) can execute this function.

Read-only mode

When an instance of Azure Database for PostgreSQL Flexible Server is in read-only mode, such as when the default_transaction_read_only parameter is set to on, or if read-only mode is automatically enabled due to reaching storage capacity, query store doesn't capture any data.

Enabling query store on a server that has read replicas, doesn't automatically enable query store on any of the read replicas. Even if you enable it on any of the read replicas, query store doesn't record the queries executed on any read replicas, because they operate in read-only mode until you promote them to primary.