Visualization & Monitoring
Real-time visualization and monitoring system for debugging and production observability.
Overview
The TreeVisualizer provides comprehensive observability into behavior tree execution:
- Real-time tree state export - Complete tree structure with current status
- Execution tracing - Tick-by-tick history of node executions
- Metrics collection - Replan count, tick rate, LLM latency, failure rates
- Replan tracking - Before/after subtree diffs for LLM replanning events
- JSON export - Dashboard-ready format for WebSocket/REST integration
- Performance optimized - <5ms export overhead, non-blocking
Quick Start
Basic Setup
use igris_btree::prelude::*;
use igris_btree::visualizer::{TreeVisualizer, VisualizerConfig};
use std::sync::Arc;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Create visualizer
let visualizer = Arc::new(TreeVisualizer::new());
// Create executor with visualizer
let executor = BTreeExecutor::new()
.with_visualizer(visualizer.clone());
// Build and execute tree
let mut tree = Sequence::new("mission");
let mut context = BTreeContext::new();
let result = executor.execute(&mut tree, &mut context).await?;
// Export final snapshot
let snapshot = visualizer
.export_snapshot(&tree, &context, result.tick_count)
.await?;
// Save to file
let json = serde_json::to_string_pretty(&snapshot)?;
std::fs::write("tree_snapshot.json", json)?;
Ok(())
}
Configuration
VisualizerConfig
pub struct VisualizerConfig {
/// Enable/disable visualization
pub enabled: bool,
/// Maximum trace entries to keep (rolling window)
pub max_trace_entries: usize,
/// Maximum replan events to keep
pub max_replan_events: usize,
/// Optional filter for blackboard keys
pub blackboard_filter: Option<Vec<String>>,
/// Export frequency (0 = every tick, N = every N ticks)
pub export_frequency: u64,
/// Compute diffs between snapshots
pub compute_diffs: bool,
}
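The `export_frequency` rule (0 = every tick, N = every N ticks) can be sketched as a small predicate. The function name `should_export` is hypothetical, and the choice to export on ticks that are exact multiples of N is an assumption; the crate may count from a different offset.

```rust
/// Hypothetical helper (not part of igris_btree): decides whether a
/// snapshot should be exported on `tick`.
/// Mirrors the documented meaning of `export_frequency`:
/// 0 = every tick, N = every N ticks.
fn should_export(tick: u64, export_frequency: u64) -> bool {
    match export_frequency {
        // 0 disables throttling: export on every tick.
        0 => true,
        // Assumption: export on ticks that are multiples of N.
        n => tick % n == 0,
    }
}
```

With `export_frequency: 10`, ticks 10, 20, 30 and so on would trigger an export under this reading, while tick 11 would not.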
Configuration Examples
High-Frequency Monitoring
For development and debugging:
let config = VisualizerConfig {
enabled: true,
max_trace_entries: 200,
max_replan_events: 20,
export_frequency: 0, // Every tick
compute_diffs: true,
blackboard_filter: None, // All keys
};
let visualizer = Arc::new(TreeVisualizer::with_config(config));
Production Optimized
For production monitoring:
let config = VisualizerConfig {
enabled: true,
max_trace_entries: 50,
max_replan_events: 5,
export_frequency: 10, // Every 10 ticks
compute_diffs: true,
blackboard_filter: Some(vec![
"mission_task".to_string(),
"status".to_string(),
]),
};
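To illustrate what `blackboard_filter` is meant to do, here is a minimal sketch of key filtering. The helper `filter_blackboard` is hypothetical, and `String` values stand in for the crate's `Value` type so the example only needs the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of igris_btree): keeps only the
/// blackboard entries named in `filter`.
/// `None` means "export all keys", matching `blackboard_filter: None`.
fn filter_blackboard(
    blackboard: &HashMap<String, String>,
    filter: Option<&[String]>,
) -> HashMap<String, String> {
    match filter {
        None => blackboard.clone(),
        Some(keys) => blackboard
            .iter()
            // Keep an entry only if its key is on the allow-list.
            .filter(|(k, _)| keys.contains(*k))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect(),
    }
}
```

Keys listed in the filter but absent from the blackboard are simply skipped, so a filter can safely name keys that are only set later in the mission.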
Debug Mode
For intensive debugging:
let config = VisualizerConfig {
enabled: true,
max_trace_entries: 1000,
max_replan_events: 50,
export_frequency: 0,
compute_diffs: false, // Save CPU
blackboard_filter: None,
};
TreeSnapshot
Complete tree state at a point in time.
Structure
pub struct TreeSnapshot {
/// Timestamp in milliseconds since epoch
pub timestamp_ms: u64,
/// Current tick count
pub tick_count: u64,
/// Root node with recursive children
pub root: NodeSnapshot,
/// Current blackboard state
pub blackboard: HashMap<String, Value>,
/// Execution trace (recent entries)
pub execution_trace: Vec<ExecutionTraceEntry>,
/// Replan events
pub replan_events: Vec<ReplanEvent>,
/// Aggregated metrics
pub metrics: MetricsSummary,
}
Example JSON
{
"timestamp_ms": 1770033271626,
"tick_count": 5,
"root": {
"id": "root",
"name": "Mission",
"node_type": "Sequence",
"status": "Success",
"children": [...],
"stats": {
"tick_count": 5,
"success_count": 5,
"failure_count": 0,
"avg_execution_ms": 0.8,
"last_execution_ms": 0.7
}
},
"blackboard": {
"status": "completed",
"mission_task": "Navigate to warehouse"
},
"execution_trace": [...],
"replan_events": [],
"metrics": {
"total_replans": 0,
"avg_tick_rate": 1250.0,
"avg_llm_latency_ms": 0.0,
"watchdog_triggers": 0,
"total_ticks": 5,
"failure_rate": 0.0,
"total_execution_ms": 4.0
}
}
Node Statistics
Each node tracks execution statistics.
NodeStats
pub struct NodeStats {
/// Times this node was ticked
pub tick_count: u64,
/// Times returned Success
pub success_count: u64,
/// Times returned Failure
pub failure_count: u64,
/// Average execution time in ms
pub avg_execution_ms: f64,
/// Last execution time in ms
pub last_execution_ms: f64,
}
Use Cases
- Performance profiling: Identify slow nodes
- Reliability tracking: Monitor failure rates
- Execution patterns: Understand node behavior
- Bottleneck detection: Find optimization opportunities
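One way such statistics could be maintained is with an incremental mean, which avoids storing per-tick history. The sketch below uses a local mirror of the `NodeStats` fields and a simplified `Status` enum in place of the crate's `NodeStatus`; the `record` and `failure_ratio` methods are hypothetical and not taken from the crate.

```rust
/// Local mirror of the documented `NodeStats` fields, so the sketch
/// compiles on its own.
#[derive(Debug, Default)]
struct NodeStats {
    tick_count: u64,
    success_count: u64,
    failure_count: u64,
    avg_execution_ms: f64,
    last_execution_ms: f64,
}

/// Simplified stand-in for the crate's `NodeStatus`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Success,
    Failure,
    Running,
}

impl NodeStats {
    /// Folds one execution into the stats (hypothetical method).
    fn record(&mut self, status: Status, duration_ms: f64) {
        self.tick_count += 1;
        match status {
            Status::Success => self.success_count += 1,
            Status::Failure => self.failure_count += 1,
            Status::Running => {}
        }
        // Incremental mean: avg += (x - avg) / n.
        self.avg_execution_ms +=
            (duration_ms - self.avg_execution_ms) / self.tick_count as f64;
        self.last_execution_ms = duration_ms;
    }

    /// Share of ticks that returned Failure (0.0 if never ticked).
    fn failure_ratio(&self) -> f64 {
        if self.tick_count == 0 {
            0.0
        } else {
            self.failure_count as f64 / self.tick_count as f64
        }
    }
}
```

Comparing `avg_execution_ms` across nodes points at slow nodes, and `failure_ratio` gives a per-node reliability figure.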
Execution Tracing
Track tick-by-tick execution history.
ExecutionTraceEntry
pub struct ExecutionTraceEntry {
/// Tick number when executed
pub tick: u64,
/// Timestamp in milliseconds
pub timestamp_ms: u64,
/// Node identifier
pub node_id: NodeId,
/// Node name
pub node_name: String,
/// Result status
pub status: NodeStatus,
/// Execution duration in milliseconds
pub duration_ms: f64,
}
Recording Executions
The visualizer records executions automatically once it is attached to an executor:
// Automatic recording (clone the Arc so the visualizer stays usable below)
let executor = BTreeExecutor::new()
.with_visualizer(visualizer.clone());
// Manual recording (if needed)
visualizer.record_execution(
node_id,
node_name,
status,
duration,
tick
).await;
Example Trace
[
{
"tick": 1,
"timestamp_ms": 1770033271626,
"node_id": "root/0",
"node_name": "init",
"status": "Success",
"duration_ms": 0.1
},
{
"tick": 2,
"timestamp_ms": 1770033271628,
"node_id": "root/1",
"node_name": "navigate",
"status": "Running",
"duration_ms": 1.5
}
]
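The trace is kept as a rolling window bounded by `max_trace_entries`. A minimal sketch of that behavior, assuming oldest-first eviction, could look like the following; `RollingTrace` is a hypothetical type and the generic `T` stands in for `ExecutionTraceEntry`.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded trace buffer (not part of igris_btree);
/// `T` stands in for `ExecutionTraceEntry`.
struct RollingTrace<T> {
    max_entries: usize,
    entries: VecDeque<T>,
}

impl<T> RollingTrace<T> {
    fn new(max_entries: usize) -> Self {
        Self {
            max_entries,
            entries: VecDeque::with_capacity(max_entries),
        }
    }

    /// Appends an entry, evicting the oldest one once the window is full.
    fn push(&mut self, entry: T) {
        if self.max_entries == 0 {
            return; // A zero-sized window keeps nothing.
        }
        if self.entries.len() == self.max_entries {
            self.entries.pop_front();
        }
        self.entries.push_back(entry);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }

    /// Oldest entry still inside the window.
    fn oldest(&self) -> Option<&T> {
        self.entries.front()
    }
}
```

A `VecDeque` makes both the append and the eviction constant-time, which fits the goal of keeping trace recording cheap.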
Replan Tracking
Monitor LLM replanning events.
ReplanEvent
pub struct ReplanEvent {
/// When replan occurred
pub timestamp_ms: u64,
/// Tick number
pub tick: u64,
/// Node that triggered replan
pub trigger_node_id: NodeId,
/// Reason for replanning
pub reason: String,
/// Replan attempt number
pub attempt: u32,
/// Subtree before replanning (optional)
pub before_subtree: Option<Value>,
/// Subtree after replanning (optional)
pub after_subtree: Option<Value>,
}
Recording Replans
visualizer.record_replan(
trigger_node_id,
"Primary action failed",
attempt,
before_subtree,
after_subtree,
tick
).await;
Example Event
{
"timestamp_ms": 1770033275000,
"tick": 15,
"trigger_node_id": "root/adaptive/0",
"reason": "Navigation failed - obstacle detected",
"attempt": 1,
"before_subtree": {
"type": "Action",
"name": "navigate_direct",
"tool": "navigate",
"args": {"path": "direct"}
},
"after_subtree": {
"type": "Sequence",
"name": "navigate_around",
"children": [
{"type": "Action", "name": "avoid_obstacle"},
{"type": "Action", "name": "resume_navigation"}
]
}
}
Metrics Collection
Aggregated metrics for performance monitoring.
MetricsSummary
pub struct MetricsSummary {
/// Total replan events
pub total_replans: u64,
/// Average ticks per second
pub avg_tick_rate: f64,
/// Average LLM call latency
pub avg_llm_latency_ms: f64,
/// Times watchdog was triggered
pub watchdog_triggers: u64,
/// Total ticks executed
pub total_ticks: u64,
/// Ratio of failed ticks
pub failure_rate: f64,
/// Total execution time
pub total_execution_ms: f64,
}
Updating Metrics
visualizer.update_metrics(
tick_duration,
llm_latency,
watchdog_triggered
).await;
Example Metrics
{
"total_replans": 3,
"avg_tick_rate": 850.0,
"avg_llm_latency_ms": 250.5,
"watchdog_triggers": 0,
"total_ticks": 42,
"failure_rate": 0.05,
"total_execution_ms": 49.4
}
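The example values suggest how the fields relate to each other. The sketch below assumes `avg_tick_rate` is total ticks divided by total execution time in seconds; this is an inference from the sample numbers, not a documented formula, and the function name is hypothetical.

```rust
/// Hypothetical derivation of `avg_tick_rate` (ticks per second) from
/// the totals in `MetricsSummary`.
/// Assumption: rate = total_ticks / total execution time in seconds.
/// Returns 0.0 when no time has elapsed, to avoid dividing by zero.
fn avg_tick_rate(total_ticks: u64, total_execution_ms: f64) -> f64 {
    if total_execution_ms <= 0.0 {
        return 0.0;
    }
    total_ticks as f64 / (total_execution_ms / 1000.0)
}
```

With the totals from the example above (42 ticks in 49.4 ms) this gives roughly 850 ticks per second, which agrees with the `avg_tick_rate` shown.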
Tree Diffs
Lightweight diffs to reduce bandwidth.
TreeDiff
pub struct TreeDiff {
/// Nodes that changed status
pub status_changes: Vec<StatusChange>,
/// Newly added nodes
pub added_nodes: Vec<NodeId>,
/// Removed nodes
pub removed_nodes: Vec<NodeId>,
/// Blackboard changes
pub blackboard_changes: HashMap<String, BlackboardChange>,
}
pub struct StatusChange {
pub node_id: NodeId,
pub old_status: NodeStatus,
pub new_status: NodeStatus,
}
pub enum BlackboardChange {
Added(Value),
Modified { old: Value, new: Value },
Removed(Value),
}
Computing Diffs
// Enable diffs in config
let config = VisualizerConfig {
compute_diffs: true,
..Default::default()
};
// Get diff since last snapshot
let diff = visualizer.compute_diff(&current_snapshot).await;
Example Diff
{
"status_changes": [
{
"node_id": "root/2",
"old_status": "Running",
"new_status": "Success"
}
],
"added_nodes": [],
"removed_nodes": [],
"blackboard_changes": {
"status": {
"Modified": {
"old": "running",
"new": "completed"
}
}
}
}
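The `blackboard_changes` part of a diff can be sketched as a comparison of two key-value maps. This is an illustration only: `diff_blackboard` is a hypothetical function, the enum is a local mirror of `BlackboardChange`, and `String` stands in for `Value`.

```rust
use std::collections::HashMap;

/// Local mirror of the documented `BlackboardChange`, with `String`
/// standing in for `Value`.
#[derive(Debug, PartialEq)]
enum BlackboardChange {
    Added(String),
    Modified { old: String, new: String },
    Removed(String),
}

/// Hypothetical helper: lists every key whose value differs between
/// the previous and the current blackboard.
fn diff_blackboard(
    old: &HashMap<String, String>,
    new: &HashMap<String, String>,
) -> HashMap<String, BlackboardChange> {
    let mut changes = HashMap::new();
    // Keys present now: either newly added or possibly modified.
    for (key, new_val) in new {
        match old.get(key) {
            None => {
                changes.insert(key.clone(), BlackboardChange::Added(new_val.clone()));
            }
            Some(old_val) if old_val != new_val => {
                changes.insert(
                    key.clone(),
                    BlackboardChange::Modified {
                        old: old_val.clone(),
                        new: new_val.clone(),
                    },
                );
            }
            Some(_) => {} // Unchanged keys are left out of the diff.
        }
    }
    // Keys that existed before but are gone now.
    for (key, old_val) in old {
        if !new.contains_key(key) {
            changes.insert(key.clone(), BlackboardChange::Removed(old_val.clone()));
        }
    }
    changes
}
```

Unchanged keys never appear in the result, which is what keeps a diff smaller than a full snapshot.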
Performance Characteristics
Export Performance
Measured timings on typical hardware:
- Simple tree (5 nodes): ~0.2ms
- Medium tree (20 nodes): ~1.5ms
- Large tree (100 nodes): ~4.8ms
- Target: <5ms per export (measured on trees of up to 100 nodes)
Memory Usage
- Trace entries: ~200 bytes each (default max 100 = 20KB)
- Replan events: ~500 bytes each (default max 10 = 5KB)
- Node stats: ~50 bytes per node
- Total overhead: <100KB for typical trees
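The figures above combine into a simple budget calculation. The sketch below uses the approximate per-item sizes quoted in this list; they are estimates from this document, not measured constants, and the function name is hypothetical.

```rust
/// Rough memory estimate in bytes, using the approximate per-item
/// sizes quoted above: ~200 B per trace entry, ~500 B per replan
/// event, ~50 B per node. Hypothetical helper for capacity planning.
fn estimated_overhead_bytes(
    max_trace_entries: usize,
    max_replan_events: usize,
    node_count: usize,
) -> usize {
    max_trace_entries * 200 + max_replan_events * 500 + node_count * 50
}
```

With the defaults (100 trace entries, 10 replan events) and a 100-node tree this comes to about 30 KB. With the Debug Mode settings (1000 trace entries, 50 replan events) the same arithmetic gives about 230 KB, so the <100KB figure applies to the default limits.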
CPU Overhead
- Export: ~0.1% of tick time
- Metrics update: <0.01ms
- Trace recording: <0.01ms
- Impact: Negligible on execution
Optimization Techniques
- Rolling windows: Limit history to prevent unbounded growth
- Diff computation: Send only changes
- Blackboard filtering: Export only relevant keys
- Lazy serialization: JSON created on demand
- Arc<RwLock<>>: Concurrent read access
Dashboard Integration
WebSocket Streaming
Real-time updates via WebSocket:
// Client-side example
const ws = new WebSocket('ws://localhost:8080/btree/visualize');
ws.onmessage = (event) => {
const snapshot: TreeSnapshot = JSON.parse(event.data);
// Update UI
renderTree(snapshot.root);
updateMetrics(snapshot.metrics);
appendTrace(snapshot.execution_trace);
highlightReplans(snapshot.replan_events);
};
REST API Polling
Periodic polling for snapshots:
async function pollSnapshot() {
const response = await fetch('/api/btree/snapshot');
const snapshot: TreeSnapshot = await response.json();
return snapshot;
}
// Poll every second
setInterval(async () => {
const snapshot = await pollSnapshot();
updateVisualization(snapshot);
}, 1000);
Diff Endpoint
Efficient updates via diffs:
let lastTimestamp = 0;
async function getDiff() {
const response = await fetch(
`/api/btree/diff?since=${lastTimestamp}`
);
const diff: TreeDiff = await response.json();
// Apply diff to current state
applyDiff(diff);
lastTimestamp = Date.now();
}
Visualization Components
Tree Graph
Recommended: D3.js hierarchical layout
import * as d3 from 'd3';
function renderTree(root: NodeSnapshot) {
const hierarchy = d3.hierarchy(root);
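const treeLayout = d3.tree<NodeSnapshot>().size([800, 600]);
const nodes = treeLayout(hierarchy);
// Target SVG element; the '#tree' selector is an assumption, adjust it to your page
const svg = d3.select('#tree');
// Color by status
const colorMap: Record<string, string> = {
'Running': '#FFD700', // Yellow
'Success': '#00FF00', // Green
'Failure': '#FF0000', // Red
'Skipped': '#CCCCCC', // Gray
};
// Render nodes at the coordinates computed by the layout
svg.selectAll('circle')
.data(nodes.descendants())
.join('circle')
.attr('cx', d => d.x)
.attr('cy', d => d.y)
.attr('r', 10)
.attr('fill', d => colorMap[d.data.status]);
// Render labels next to each node
svg.selectAll('text')
.data(nodes.descendants())
.join('text')
.attr('x', d => d.x + 14)
.attr('y', d => d.y + 4)
.text(d => d.data.name);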
}
Execution Timeline
Horizontal timeline showing tick progression:
function renderTimeline(trace: ExecutionTraceEntry[]) {
const svg = d3.select('#timeline');
svg.selectAll('rect')
.data(trace)
.join('rect')
.attr('x', d => d.tick * 20)
.attr('y', d => nodeIndex(d.node_id) * 30)
.attr('width', 18)
.attr('height', 25)
.attr('fill', d => statusColor(d.status))
.on('mouseover', showDetails);
}
Metrics Dashboard
Real-time metrics display:
function updateMetrics(metrics: MetricsSummary) {
document.getElementById('tick-rate').textContent =
`${metrics.avg_tick_rate.toFixed(1)} ticks/sec`;
document.getElementById('replans').textContent =
`${metrics.total_replans}`;
document.getElementById('failure-rate').textContent =
`${(metrics.failure_rate * 100).toFixed(2)}%`;
document.getElementById('llm-latency').textContent =
`${metrics.avg_llm_latency_ms.toFixed(1)}ms`;
}
Production Monitoring
Alerting
Set up alerts for critical metrics:
use tracing::warn;
// `warn!` stands in for your alerting hook (pager, webhook, etc.)
if snapshot.metrics.failure_rate > 0.1 {
warn!("High failure rate: {:.1}%", snapshot.metrics.failure_rate * 100.0);
}
if snapshot.metrics.total_replans > 10 {
warn!("Excessive replans: {}", snapshot.metrics.total_replans);
}
if snapshot.metrics.watchdog_triggers > 0 {
warn!("Watchdog triggered {} times", snapshot.metrics.watchdog_triggers);
}
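If the thresholds need to be configurable or unit-tested, the checks above could be collected into a pure function that returns the alert messages instead of emitting them directly. This is a sketch under assumptions: `AlertMetrics`, `AlertThresholds`, and `check_alerts` are hypothetical names, and the struct mirrors only the `MetricsSummary` fields used here.

```rust
/// Minimal mirror of the `MetricsSummary` fields used for alerting.
struct AlertMetrics {
    failure_rate: f64,
    total_replans: u64,
    watchdog_triggers: u64,
}

/// Hypothetical thresholds; 0.1 and 10 would mirror the checks above.
struct AlertThresholds {
    max_failure_rate: f64,
    max_replans: u64,
}

/// Returns one message per violated threshold, in a fixed order.
fn check_alerts(m: &AlertMetrics, t: &AlertThresholds) -> Vec<String> {
    let mut alerts = Vec::new();
    if m.failure_rate > t.max_failure_rate {
        alerts.push(format!("High failure rate: {:.1}%", m.failure_rate * 100.0));
    }
    if m.total_replans > t.max_replans {
        alerts.push(format!("Excessive replans: {}", m.total_replans));
    }
    // Any watchdog trigger is treated as alert-worthy.
    if m.watchdog_triggers > 0 {
        alerts.push(format!("Watchdog triggered {} times", m.watchdog_triggers));
    }
    alerts
}
```

Returning the messages keeps the decision logic separate from delivery, so the same function can feed logs, a pager, or a test assertion.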
Logging
Export snapshots to logs:
use tracing::info;
info!(
"Tree execution snapshot: ticks={}, status={:?}, replans={}",
snapshot.tick_count,
snapshot.root.status,
snapshot.metrics.total_replans
);
Time-Series Export
Export to Prometheus, InfluxDB, etc.:
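// Prometheus example, via the `metrics` facade crate (0.22+ macro syntax).
// A Prometheus recorder (for example from `metrics-exporter-prometheus`) must be installed first.
metrics::gauge!("btree_tick_rate").set(snapshot.metrics.avg_tick_rate);
// total_replans is already cumulative, so set the counter absolutely instead of incrementing it
metrics::counter!("btree_replans").absolute(snapshot.metrics.total_replans);
metrics::histogram!("btree_llm_latency_ms").record(snapshot.metrics.avg_llm_latency_ms);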
Example: Complete Monitoring Setup
use igris_btree::prelude::*;
use igris_btree::visualizer::{TreeVisualizer, VisualizerConfig};
use std::sync::Arc;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Configure visualizer
let viz_config = VisualizerConfig {
enabled: true,
max_trace_entries: 100,
max_replan_events: 10,
export_frequency: 5, // Every 5 ticks
compute_diffs: true,
blackboard_filter: Some(vec![
"status".to_string(),
"mission_task".to_string(),
]),
};
let visualizer = Arc::new(TreeVisualizer::with_config(viz_config));
// Create executor
let executor = BTreeExecutor::new()
.with_max_ticks(100)
.with_tracing(true)
.with_visualizer(visualizer.clone());
// Build tree
let llm = Arc::new(MockLlmProvider::with_navigation_plan());
let mut context = BTreeContext::new().with_llm(llm);
let mut tree = Sequence::new("monitored_mission")
.add_child(Box::new(SetBlackboard::new("init", "status", "starting")))
.add_child(Box::new(LLMPlannerNode::new("planner", "task", "plan")))
.add_child(Box::new(SubtreeLoader::new("executor", "plan")))
.add_child(Box::new(SetBlackboard::new("done", "status", "completed")));
// Execute with monitoring
let result = executor.execute(&mut tree, &mut context).await?;
// Export final snapshot
let snapshot = visualizer
.export_snapshot(&tree, &context, result.tick_count)
.await?;
// Display metrics
println!("\n=== Execution Metrics ===");
println!("Total Ticks: {}", snapshot.metrics.total_ticks);
println!("Tick Rate: {:.2} ticks/sec", snapshot.metrics.avg_tick_rate);
println!("Replans: {}", snapshot.metrics.total_replans);
println!("Failure Rate: {:.2}%", snapshot.metrics.failure_rate * 100.0);
println!("LLM Latency: {:.2}ms", snapshot.metrics.avg_llm_latency_ms);
// Save snapshot
let json = serde_json::to_string_pretty(&snapshot)?;
std::fs::write("visualization_snapshot.json", json)?;
println!("\nSnapshot saved to visualization_snapshot.json");
Ok(())
}
Next Steps
- Dashboard Integration Guide - Build visualization dashboards
- Production Deployment - Production monitoring setup
- API Reference - Full visualizer API
Summary
Key takeaways:
- Real-time observability: Complete tree state export
- Execution tracing: Tick-by-tick history
- Metrics collection: Performance and reliability tracking
- Replan tracking: Monitor LLM replanning events
- Performance optimized: <5ms overhead
- Dashboard ready: JSON export for WebSocket/REST
- Production features: Alerting, logging, time-series export