Ingest data with Fluent Bit into Azure Data Explorer

Fluent Bit is an open-source agent that collects logs, metrics, and traces from various sources. It allows you to filter, modify, and aggregate event data before sending it to storage. This article guides you through the process of using Fluent Bit to send data to your KQL database.


For a complete list of data connectors, see Data connectors overview.

Prerequisites

Create a Microsoft Entra service principal

The Microsoft Entra service principal can be created through the Azure portal or programmatically, as in the following example.

This service principal is the identity used by the connector to write data to your table in Kusto. You grant permissions for this service principal to access Kusto resources.

  1. Sign in to your Azure subscription via Azure CLI. Then authenticate in the browser.

    az login
    
  2. Choose the subscription to host the principal. This step is needed when you have multiple subscriptions.

    az account set --subscription YOUR_SUBSCRIPTION_GUID
    
  3. Create the service principal. In this example, the service principal is called my-service-principal. Replace {SubID} with your subscription ID.

    az ad sp create-for-rbac -n "my-service-principal" --role Contributor --scopes /subscriptions/{SubID}
    
  4. From the returned JSON data, copy the appId, password, and tenant for future use.

    {
      "appId": "00001111-aaaa-2222-bbbb-3333cccc4444",
      "displayName": "my-service-principal",
      "name": "my-service-principal",
      "password": "00001111-aaaa-2222-bbbb-3333cccc4444",
      "tenant": "00001111-aaaa-2222-bbbb-3333cccc4444"
    }
    

You've created your Microsoft Entra application and service principal.
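If you saved the JSON output from step 4 to a file, a small shell helper can pull out the three values for later use. The following is a minimal sketch, not part of the official procedure: the function name sp_field, the file name sp.json, and the KUSTO_* variable names are hypothetical, and the sed pattern assumes the pretty-printed, one-property-per-line output shown above. A dedicated JSON parser is more robust if you have one available.

```shell
#!/bin/sh
# sp_field NAME FILE
# Prints the string value of the top-level property NAME from FILE.
# Assumes one "name": "value" pair per line, as in the Azure CLI output above.
sp_field() {
  sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2"
}

# Example usage (hypothetical file and variable names):
#   export KUSTO_CLIENT_ID="$(sp_field appId sp.json)"
#   export KUSTO_CLIENT_SECRET="$(sp_field password sp.json)"
#   export KUSTO_TENANT_ID="$(sp_field tenant sp.json)"
```

Treat the password like any other secret: avoid committing the saved JSON file to source control, and delete it once the values are stored somewhere safe.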

Create a target table

Fluent Bit forwards logs in JSON format with three properties: log (dynamic), tag (string), and timestamp (datetime).

You can create a table with columns for each of these properties. Alternatively, if you have structured logs, you can create a table with log properties mapped to custom columns.

To create a table for incoming logs from Fluent Bit:

  1. Browse to your query environment.

  2. Select the database where you'd like to create the table.

  3. Run the following .create table command:

    .create table FluentBitLogs (log:dynamic, tag:string, timestamp:datetime)
    

    The incoming JSON properties are automatically mapped into the correct column.

Grant permissions to the service principal

Grant the database ingestor role to the service principal that you created in Create a Microsoft Entra service principal, so that it can work with the database. For more information, see Examples. Replace the placeholder DatabaseName with the name of the target database, ApplicationID with the appId value, and TenantID with the tenant value that you saved when you created the Microsoft Entra service principal.

.add database <DatabaseName> ingestors ('aadapp=<ApplicationID>;<TenantID>')
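To see how the placeholders fit together, the following sketch prints the filled-in command from three arguments so you can paste it into your query environment. The function name ingestor_command and the sample values are hypothetical; the sketch only formats text and doesn't run anything against the database.

```shell
#!/bin/sh
# ingestor_command DATABASE APP_ID TENANT_ID
# Prints the management command that grants the ingestor role to the
# service principal. It does not execute the command.
ingestor_command() {
  printf ".add database %s ingestors ('aadapp=%s;%s')\n" "$1" "$2" "$3"
}

# Example usage (sample values only):
#   ingestor_command MyDatabase 00001111-aaaa-2222-bbbb-3333cccc4444 55556666-cccc-7777-dddd-8888eeee9999
# prints:
#   .add database MyDatabase ingestors ('aadapp=00001111-aaaa-2222-bbbb-3333cccc4444;55556666-cccc-7777-dddd-8888eeee9999')
```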

Configure Fluent Bit to send logs to your table

To configure Fluent Bit to send logs to your table in Kusto, create a classic mode or YAML mode configuration file with the following output properties:

| Field | Description | Required | Default |
|---|---|---|---|
| Name | The pipeline name. | | azure_kusto |
| tenant_id | The tenant ID from Create a Microsoft Entra service principal. | ✔️ | |
| client_id | The application ID from Create a Microsoft Entra service principal. | ✔️ | |
| client_secret | The client secret key value (password) from Create a Microsoft Entra service principal. | ✔️ | |
| ingestion_endpoint | Enter the value as described for Ingestion_Endpoint. | ✔️ | |
| database_name | The name of the database that contains your logs table. | ✔️ | |
| table_name | The name of the table from Create a target table. | ✔️ | |
| ingestion_mapping_reference | The name of the ingestion mapping from Create a target table. If you didn't create an ingestion mapping, remove the property from the configuration file. | | |
| log_key | The key name of the log content. For instance, log. | | log |
| tag_key | The key name of the tag. Ignored if include_tag_key is false. | | tag |
| include_time_key | If enabled, a timestamp is appended to the output. Uses the time_key property. | | true |
| time_key | The key name for the timestamp in the log records. Ignored if include_time_key is false. | | timestamp |
| ingestion_endpoint_connect_timeout | The connection timeout of various Kusto endpoints, in seconds. | | 60s |
| compression_enabled | If enabled, sends a compressed HTTP payload (gzip) to Kusto. | | true |
| ingestion_resources_refresh_interval | The ingestion resources refresh interval of the Kusto endpoint, in seconds. | | 3600 |
| workers | The number of workers to perform flush operations for this output. | | 0 |

The following example shows a classic mode configuration file:

[SERVICE]
    Daemon Off
    Flush 1
    Log_Level trace
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020
    Health_Check On

[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Tag kube.*
    Mem_Buf_Limit 1MB
    Skip_Long_Lines On
    Refresh_Interval 10

[OUTPUT]
    match *
    name azure_kusto
    tenant_id <TenantId>
    client_id <ClientId>
    client_secret <AppSecret>
    ingestion_endpoint <IngestionEndpoint>
    database_name <DatabaseName>
    table_name <TableName>
    ingestion_mapping_reference <MappingName>
    ingestion_endpoint_connect_timeout <IngestionEndpointConnectTimeout>
    compression_enabled <CompressionEnabled>
    ingestion_resources_refresh_interval <IngestionResourcesRefreshInterval>
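One way to keep the client secret out of the configuration file is to rely on Fluent Bit's support for ${VARIABLE} references, which it resolves from the environment at startup. The following sketch writes a trimmed-down classic mode file that uses such references. The function name write_fluent_bit_conf and the KUSTO_* variable names are hypothetical, and the table name assumes the FluentBitLogs table created earlier. The optional output properties are omitted, so their defaults apply.

```shell
#!/bin/sh
# write_fluent_bit_conf PATH
# Writes a classic mode configuration to PATH. The heredoc delimiter is
# quoted, so the shell leaves ${...} untouched and Fluent Bit resolves the
# references from its environment when it starts.
write_fluent_bit_conf() {
  cat > "$1" <<'EOF'
[SERVICE]
    Daemon Off
    Flush 1

[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Tag kube.*

[OUTPUT]
    match *
    name azure_kusto
    tenant_id ${KUSTO_TENANT_ID}
    client_id ${KUSTO_CLIENT_ID}
    client_secret ${KUSTO_CLIENT_SECRET}
    ingestion_endpoint ${KUSTO_INGESTION_ENDPOINT}
    database_name ${KUSTO_DATABASE}
    table_name FluentBitLogs
EOF
}
```

To try it, export the five variables, call write_fluent_bit_conf fluent-bit.conf, and then start the agent with fluent-bit -c fluent-bit.conf.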

Confirm data ingestion

  1. After data arrives in the table, confirm the transfer by checking the row count:

    FluentBitLogs
    | count
    
  2. To view a sample of log data, run the following query:

    FluentBitLogs
    | take 100