Data Insights

Data Insights provides an interactive topology view of your Cribl Stream data flows. It shows how data moves from Sources through Pre-Processing Pipelines, Routes/QuickConnect, and Post-Processing Pipelines to Destinations, with metrics for volume, freshness, and shape. Eligible Source and Destination cards also display a persistent queue (PQ) sidecar. Data Insights is designed for quick troubleshooting and validation of end-to-end flows across Worker Groups and time ranges.

Why Use Data Insights

  • Locate issues fast: Spot partial failures, bottlenecks, drops, and misconfigurations by visualizing how Sources, Pipelines, and Destinations connect. Use the map to see when a Source is not attached to the expected Route, identify stale or inactive Sources, and trace events/bytes and freshness along the flow path.
  • Validate changes: Compare metrics across time windows to confirm the impact of configuration or deployment changes.
  • Understand flow shape: See where volume changes (reduction/enrichment) occur and where the Shape metric (field count per event) grows or shrinks across map cards. Use this to spot Sources with excessive fields, Pipelines/Packs that expand or reduce fields, and unexpected field-count variance that may indicate data-quality issues.

How It Works

Topology map: Map cards (the boxes) represent Sources, Pre-Processing Pipelines, Routes/QuickConnect, Post-Processing Pipelines, and Destinations. The arrows represent flow. The following columns are displayed (left-to-right):

  • Source: Ingest points that report Events, Bytes In, Freshness In, and (when enabled) Shape, which reflects the number of fields per event.
  • Pre-Processing Pipeline: The first processing stage, where you typically parse, normalize, and filter data before routing. Expect changes in volume and Shape here as fields are extracted or dropped.
  • Routes / QuickConnect: Data routing/branching layer that directs events to Pipelines and Destinations. Use this column to verify that Sources connect to the expected paths.
  • Post-Processing Pipeline: Where transformations, enrichment, and reduction occur. You can expect divergence between Events and Bytes In vs. Out, and Shape changes as Pipelines or Packs add, modify, or remove fields.
  • Destination: Egress targets showing Events, Bytes Out, and related Freshness at the boundary.

Drilldown: Select a map card to open the details pane with time-series views for events, bytes, and freshness, aligned to your time range and filters. When a card has a persistent queue sidecar, you can select it to open a dedicated PQ view.

When Shape metrics are enabled, map cards can also show the minimum and maximum number of fields per event over the selected time range. This helps you see how schema complexity changes through the flow, identify Sources with excessive fields, and find Pipelines/Packs that expand or reduce fields unexpectedly.
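As a rough illustration of what the Shape metric summarizes, the following Python sketch counts fields per event and reports the minimum and maximum. The function name `shape_range` and the sample events are hypothetical, and the sketch assumes only top-level fields are counted. How Data Insights treats nested fields is not stated here.

```python
def shape_range(events):
    """Return (min_fields, max_fields) across events, or None if there are no events.

    Assumption: an event is a dict and only its top-level keys count as fields.
    """
    counts = [len(event) for event in events]
    if not counts:
        return None
    return min(counts), max(counts)


# Hypothetical sample events with 3, 2, and 5 top-level fields.
sample = [
    {"_time": 1, "host": "a", "msg": "ok"},
    {"_time": 2, "host": "b"},
    {"_time": 3, "host": "c", "msg": "ok", "status": 200, "user": "x"},
]

print(shape_range(sample))  # (2, 5)
```

A wide gap between the two values on a single card is the kind of field-count variance that may be worth investigating.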

Select the action menu (upper-right of the map card) to configure the object the map card represents, filter the view to it, or copy its IDs.

Persistent Queues

A PQ sidecar appears on Source and Destination cards when:

  • Sources: Persistent queue is enabled and the Source has a configured Queue size limit.
  • Destinations: Backpressure behavior is set to Persistent Queue and the Destination has a configured Queue size limit.

PQ utilization is calculated from:

  • Used (bytes): The persistent queue size (pq.queue_size), summed across all Workers in the Worker Group at each reporting interval, then aggregated as the peak queue usage per time bucket in charts.
  • Limit (bytes): The Queue size limit configured for the Source or Destination PQ. For Destinations, reported totals multiply this per-Worker limit by Worker count so the limit line matches aggregated usage.
  • Utilization: Persistent queue usage (%), the ratio of peak queue usage to Queue size limit, from 0% to 100%.
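The calculation above can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's implementation: the function name `pq_utilization`, the sample tuple layout, and the 60-second bucket size are all hypothetical, and the limit is scaled by Worker count as described for Destinations.

```python
from collections import defaultdict


def pq_utilization(samples, per_worker_limit, worker_count, bucket_seconds=60):
    """Return {bucket_start: utilization_percent}.

    samples: iterable of (timestamp_seconds, worker_id, queue_size_bytes).
    Assumption: every Worker reports at the same timestamps.
    """
    limit = per_worker_limit * worker_count  # aggregated limit (Destination case)
    if limit <= 0:
        raise ValueError("Queue size limit must be positive")

    # Step 1: sum queue size across all Workers at each reporting interval.
    per_interval = defaultdict(int)
    for ts, _worker_id, size in samples:
        per_interval[ts] += size

    # Step 2: keep the peak summed usage within each time bucket.
    peak = {}
    for ts, total in per_interval.items():
        bucket = ts - ts % bucket_seconds
        peak[bucket] = max(peak.get(bucket, 0), total)

    # Step 3: ratio of peak usage to limit, capped at 100% (the cap is an assumption).
    return {b: min(100.0, 100.0 * used / limit) for b, used in sorted(peak.items())}


# Hypothetical samples from two Workers, each with a 1000-byte limit.
samples = [
    (0, "w1", 100), (0, "w2", 300),
    (30, "w1", 200), (30, "w2", 400),
    (60, "w1", 50), (60, "w2", 50),
]
print(pq_utilization(samples, per_worker_limit=1000, worker_count=2))
# {0: 30.0, 60: 5.0}
```

In the first bucket, the summed usage peaks at 600 bytes against an aggregated limit of 2000 bytes, which gives 30%.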

For background on enabling and tuning queues, see About Persistent Queues.

Caveats

  • Source and Destination volumes differ: Bytes In (Source) and Bytes Out (Destination) don’t match because Data Insights accounts for compression, formatting, and protocol overhead at the Destination. Both values are correct relative to their specific measurement points.
  • Aggregation Functions affect Source filtering: When you filter the map by Source, the Destination still displays the full aggregated volume if an Aggregation Function is used in the Pipeline. If a Function combines events and removes attribution fields, it prevents Data Insights from isolating that specific Source’s contribution.
  • Shared Pipelines and QuickConnect: If a Pipeline or QuickConnect is used by multiple Sources or Destinations, it appears as a distinct map card for each connection. This ensures that metrics remain specific to the traffic of that particular path, rather than being aggregated across all connections sharing that name.
  • PQ sidecar vs. flow metrics: The map’s main card metrics (volume, freshness, shape) follow the map time range and settings. The PQ drawer can use its own time window for the utilization chart. Refresh PQ data independently when you need a narrower or wider window than the map.

Map Settings

Use these controls (top bar and sidebar) to refine what you see.

Refresh: Refreshes the map and the details pane metrics using the current time range and filters.

Worker Group: Limits the map to a specific Worker Group so you can focus on a subset of your infrastructure. Changing the Worker Group updates the map cards and metrics displayed.

Time period: Select a time range (for example, last 15 minutes, 1 hour, 1 day). The map and details pane re-render using aggregated resolution appropriate to the range.

  • Use Compare to see how the current time range’s metrics differ from the same-length window at an earlier point in time. This is helpful to validate recent changes, confirm regressions, or distinguish new issues from normal historical patterns. Select a Comparison period to see the current metric value on each map card, with a percentage change versus the comparison period. Both periods are plotted on the same charts so you can quickly see increases, decreases, or pattern changes. The Comparison period controls how far back the earlier window starts.
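To make the Compare arithmetic concrete, here is a minimal Python sketch of how a same-length earlier window and a percentage change could be derived. The function names and sample values are hypothetical, and the handling of a zero baseline is an assumption, since the behavior in that case is not described here.

```python
from datetime import datetime, timedelta


def comparison_window(start, end, offset):
    """Shift the current window back by `offset`, keeping its length unchanged."""
    return start - offset, end - offset


def percent_change(current, previous):
    """Percentage change of `current` versus `previous`.

    Returns None when the earlier value is 0 (assumption: the change is undefined).
    """
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


# Current window: one hour; comparison period: one day earlier.
start, end = datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 13, 0)
print(comparison_window(start, end, timedelta(days=1)))

print(percent_change(1200, 1000))  # 20.0
print(percent_change(750, 1000))   # -25.0
```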

Filter: Focus the map on specific components. You can filter by Source, Routes/QuickConnect, Destination, and Metric Value (when a Metric type is selected).

  • When you select a Source, Route/QuickConnect, or Destination, the map highlights that card and renders its directly connected cards to preserve context, rather than hiding all other cards that do not strictly match the filter.
  • If you select multiple cards, the map shows the union of their focused subgraphs (everything connected to any selected card), not the intersection. This avoids empty or misleading graphs when selected cards are not connected to each other, while still narrowing the view to the most relevant parts of your topology.
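The union behavior can be illustrated with a small Python sketch. It models the topology as a list of (upstream, downstream) edges and treats a card's focused subgraph as the card plus its directly connected cards. The card IDs and function names are hypothetical, and the one-hop depth is an assumption, so the actual map may include more of the path.

```python
def focused_subgraph(edges, card):
    """Return the card plus every card directly connected to it."""
    visible = {card}
    for upstream, downstream in edges:
        if card in (upstream, downstream):
            visible.update((upstream, downstream))
    return visible


def filtered_view(edges, selected):
    """Union of the focused subgraphs of all selected cards."""
    view = set()
    for card in selected:
        view |= focused_subgraph(edges, card)
    return view


# Two hypothetical flows that do not touch each other.
edges = [
    ("src_a", "route_1"), ("route_1", "dest_x"),
    ("src_b", "route_2"), ("route_2", "dest_y"),
]

print(sorted(filtered_view(edges, ["src_a", "dest_y"])))
# ['dest_y', 'route_1', 'route_2', 'src_a']
```

An intersection of the same two subgraphs would be empty here, which is the situation the union approach avoids.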

Metric Controls

Below the filters, use the metric controls to choose what each map card and sparkline shows.

  • Metric: Select the type of metric to display on map cards:
    • None: Hide per-card metric values. Show only topology.
    • Volume: Show volume metrics. For example, events and bytes in and out.
    • Freshness: Show data freshness metrics (age of events) instead of volume.
    • Shape: Show schema metrics based on field count. For example, minimum and maximum number of fields per event. Use this to spot Sources with excessive fields and Pipelines or Packs that unexpectedly expand or reduce the number of fields.
  • Metric Display: Choose which values to show for metrics that have In/Out pairs (Volume and Freshness):
    • Max In/Max Out: Display the maximum value seen at input and output over the selected time range.
      • For Volume, this highlights peak events/bytes.
      • For Freshness, this highlights the stalest data.
    • Min In/Min Out: Display the minimum value seen at input and output.
      • For Volume, this shows the lowest observed rates.
      • For Freshness, this shows the freshest (lowest-latency) data.
  • Sparkline: Choose which metric the small per-card sparkline represents. For example, Max Freshness In/Out or Min Freshness In/Out. This helps you see freshness trends at a glance without opening the details pane.
  • Display active data only: When toggled on, hides cards and connections with no activity in the selected time range. Use this to declutter the map and focus on components that actually processed data.

Details Pane (card drilldown)

Select any map card to open a right-hand details pane:

Events: Time-series of events metrics for the selected component. For example, Events In/Out, totals, and maximum values appropriate to the overlay and component. Use this to verify symmetry and detect gaps or spikes.

Bytes: Time-series of bytes metrics (Bytes In/Out) for bandwidth and reduction checks. Use this to confirm compression/reduction effects and identify downstream throttling symptoms.

Freshness: Time-series of minimum and maximum freshness (age) of events across the time range. Use this to detect stale or delayed data and align freshness anomalies with component behavior.

Shape: Time-series of minimum and maximum field counts per event. Use this to see where schemas expand or contract, identify Sources with excessive fields, and spot unexpected field-count changes that can indicate data-quality or parsing issues.

What to expect:

  • The details pane respects your Worker Group, time period, and filter selections.
  • Freshness charts expose min/max variants (for example, min in, max out) that correspond to where the metric is measured in the flow.

PQ sidecar pill: Select the sidecar pill on eligible Sources and Destinations to open a persistent queue view in place of the Events/Bytes/Freshness/Shape tabs. This view includes a capacity summary (utilization % and used/max bytes, as described under Persistent Queues), a PQ utilization time series (0-100% over the window selected for that chart), and an independent time window setting.

Use the top-right controls to filter the map to the selected object (filter icon) or copy its identifier (copy icon). In the bottom bar, select Configure to open the object’s configuration.

Workflows

Detect partial failures: Set Metric to Volume, filter to a specific Route or Destination, and scan for divergence between In and Out across map cards. Drill into cards with anomalies and review the details pane to localize the interruption.
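If you record In and Out values from map cards, the scan for divergence can be approximated offline with a sketch like the one below. The function name, the card names, and the 5% tolerance are hypothetical. Cards that reduce volume by design would also be flagged, so any real threshold needs to account for expected reduction.

```python
def flag_divergence(cards, tolerance=0.05):
    """Return names of cards whose Events Out differs from Events In by more than `tolerance`.

    cards: {card_name: (events_in, events_out)}.
    Cards with zero Events In are skipped because the ratio is undefined.
    """
    flagged = []
    for name, (events_in, events_out) in cards.items():
        if events_in == 0:
            continue
        divergence = abs(events_in - events_out) / events_in
        if divergence > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical values read from three map cards.
cards = {
    "route_main": (1000, 1000),
    "pipeline_parse": (1000, 990),
    "dest_s3": (1000, 400),
}
print(flag_divergence(cards))  # ['dest_s3']
```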

Validate transformations: With Metric set to Volume, compare In vs. Out across Post-Processing Pipelines to confirm expected reduction or enrichment. Then pivot to the Events, Bytes, or Freshness tabs in the details pane to make sure behavior stays stable under load and over time.

Investigate staleness: Set Metric to Freshness, use Metric Display and Sparkline to focus on Max or Min In/Out, and drill into upstream components with high maximum freshness to identify where delays enter the flow.

Review persistent queue headroom: Scan Source and Destination cards for the PQ sidecar. Select the pill to open PQ mode in the details pane, set the PQ chart time window as needed, and compare utilization peaks to your configured limits.