Streaming ingestion and schema changes
Applies to: ✅ Azure Data Explorer
Cluster nodes cache the schema of databases that receive data through streaming ingestion, which improves performance and reduces resource use. However, when the schema changes, the cached copies can take time to update.
If schema changes and streaming ingestion aren't synchronized, you can encounter failures like schema-related errors or incomplete and distorted data in the table.
This article outlines typical schema changes and provides guidance on avoiding problems with streaming ingestion during these changes.
Schema changes
The following list covers key examples of schema changes:
- Creation of tables
- Deletion of tables
- Adding a column to a table
- Removing a column from a table
- Retyping the columns of a table
- Renaming the columns of a table
- Adding precreated ingestion mappings
- Removing precreated ingestion mappings
- Adding, removing, or altering policies
Coordinate schema changes with streaming ingestion
The schema cache is kept while the database is online. If there are schema changes, the system automatically refreshes the cache, but this refresh can take several minutes. If you rely only on the automatic refresh, streaming ingestion requests that run against a stale cached schema can fail.
You can reduce the effects of propagation delay by explicitly clearing the schema cache on the nodes. If the streaming ingestion flow and schema changes are coordinated, you can completely eliminate failures and their associated data distortion.
To coordinate the streaming ingestion flow with schema changes:
1. Suspend streaming ingestion.
2. Wait until all outstanding streaming ingestion requests are complete.
3. Make the schema changes.
4. Issue one or several .clear cache streaming ingestion schema commands.
   - Repeat until the command succeeds and all rows in the command output indicate success.
5. Resume streaming ingestion.
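The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the callables `suspend`, `wait_for_drain`, `change_schema`, `run_command`, and `resume` are hypothetical hooks that your own application would supply, the command text is assumed to be the database-scoped form of the clear-cache command, and the output is assumed to be a list of dict-like rows with a `Status` field whose value is `Succeeded` on success. Verify the command syntax and output columns against your cluster before relying on any of these.

```python
import time

# Assumed command text; a table-scoped variant may suit your case better.
CLEAR_CACHE_COMMAND = ".clear database cache streamingingestion schema"


def all_nodes_succeeded(rows):
    """Return True only if there is at least one row and every row reports success."""
    rows = list(rows)
    # Assumption: each row is a dict-like object with a "Status" field.
    return bool(rows) and all(
        str(row.get("Status", "")).lower() == "succeeded" for row in rows
    )


def clear_schema_cache(run_command, max_attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Issue the clear-cache command until every node reports success.

    Returns the number of attempts used; raises RuntimeError if none succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if all_nodes_succeeded(run_command(CLEAR_CACHE_COMMAND)):
            return attempt
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"schema cache not cleared after {max_attempts} attempts")


def change_schema_safely(suspend, wait_for_drain, change_schema, run_command,
                         resume, sleep=time.sleep):
    """Run the coordination steps in order (all callables are hypothetical hooks)."""
    suspend()          # Step 1: stop sending streaming ingestion requests.
    wait_for_drain()   # Step 2: wait for outstanding requests to complete.
    change_schema()    # Step 3: apply the schema changes.
    attempts = clear_schema_cache(run_command, sleep=sleep)  # Step 4, with retries.
    resume()           # Step 5: reached only if the cache was cleared on all nodes.
    return attempts
```

If the cache can't be cleared within the allowed attempts, the sketch raises an error and leaves streaming ingestion suspended, so that no requests run against a stale schema while you investigate.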
Note
If you've built an application for custom ingestion, we recommend managing schema-related failures by either retrying for a set duration or redirecting data from failed requests using queued ingestion methods.
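As a rough sketch of the recommendation in this note, the following Python function retries a streaming request for a set duration and then redirects the data to queued ingestion. Combining the two options in one function is a design choice made for illustration. `SchemaMismatchError`, `stream_ingest`, and `queue_ingest` are hypothetical names, not part of any client library; map them to the error type and ingestion calls your application actually uses.

```python
import time


class SchemaMismatchError(Exception):
    """Hypothetical marker for a schema-related streaming ingestion failure."""


def ingest_with_fallback(payload, stream_ingest, queue_ingest,
                         retry_window_seconds=30.0, delay_seconds=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Try streaming ingestion for a limited time, then fall back to queued ingestion.

    Returns "streamed" or "queued" to show which path accepted the payload.
    """
    deadline = clock() + retry_window_seconds
    while True:
        try:
            stream_ingest(payload)
            return "streamed"
        except SchemaMismatchError:
            # Stop retrying if the next attempt would start after the deadline.
            if clock() + delay_seconds > deadline:
                break
            sleep(delay_seconds)
    # Retry window exhausted: redirect the data through queued ingestion.
    queue_ingest(payload)
    return "queued"
```

Only schema-related failures are retried here; any other exception propagates to the caller unchanged.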