Incident Propagation
Overview
When a service fails, the impact rarely stops at that service alone. Other services that depend on it may be affected too — even if their own sensors are perfectly healthy. The incident propagation feature makes this cascade visible on the Service Dependency Map.
How it works
Onagre traces the dependency graph starting from every service whose own status is Down or Degraded. From each failing service, it walks the dependency chain in reverse to find every service that depends on it, directly or transitively.
A service is marked as Impacted when:
- Its own sensors are healthy (or unknown).
- But one or more services it depends on are failing.
The impacted service's effective status becomes Degraded to reflect the inherited risk, while its own status is preserved separately. This distinction lets you immediately tell whether a service is failing on its own or because of a dependency.
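The rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not Onagre's actual implementation: the function name `effective_statuses`, the lowercase status strings, and the dictionary shapes are all assumptions made for the example.

```python
from collections import deque

# Assumed status vocabulary for this sketch: "up", "unknown", "degraded", "down".
FAILING = {"down", "degraded"}

def effective_statuses(depends_on, own_status):
    """Return {service: (own_status, effective_status, impacted)}.

    depends_on maps a service to the list of services it depends on.
    own_status maps a service to the status of its own sensors.
    """
    # Invert the graph: for each service, which services depend on it?
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Walk outward from every failing service and collect everything that
    # depends on it, directly or transitively.
    reached = set()
    queue = deque(s for s, status in own_status.items() if status in FAILING)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in reached:
                reached.add(dependent)
                queue.append(dependent)

    result = {}
    for service, own in own_status.items():
        # Impacted: own sensors are healthy or unknown, but a dependency fails.
        impacted = service in reached and own not in FAILING
        effective = "degraded" if impacted else own
        result[service] = (own, effective, impacted)
    return result
```

Keeping the own status and the effective status side by side in the result mirrors the distinction described above: a service that is failing on its own is never relabelled as impacted.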
Propagation is automatic
The propagation computation runs server-side every time the dependency map is loaded. You do not need to configure rules or thresholds — Onagre computes the impact graph from your existing dependencies and sensor states.
Using propagation on the map
- Open the Service Dependency Map at Monitoring → Service Map.
- Click the Show propagation button in the toolbar.
- The map updates with the visual indicators listed in the table below.
Visual indicators
| Indicator | Meaning |
|---|---|
| Red glowing border | Root-cause service — the origin of the failure. Its own sensors are down or degraded. |
| Orange dashed border | Impacted service — healthy on its own, but a dependency is failing. |
| Orange solid edge | Propagation path — the edge through which the failure cascades. |
Enriched node details
When propagation is active, clicking a node shows additional information in the popover:
Impacted nodes display:
- An Impacted warning label.
- The impact source — the name of the root-cause service.
- The active incident count on the service.
Root-cause nodes display:
- The downstream impact count — how many services are transitively affected by this failure.
Click the Show propagation button again to disable the overlay and return to the standard view.
Example scenario
Consider three services with the following dependency chain:
Web App → API → Database
Web App depends on API, which depends on Database.
If the Database service goes down:
- Database is shown with a red glowing border (root cause). Its popover shows "Downstream impact: 2 services".
- API is shown with an orange dashed border (impacted). Its popover shows "Impact source: Database".
- Web App is shown with an orange dashed border (impacted). Its popover shows "Impact source: Database".
- Both edges in the chain are highlighted in orange, tracing the full propagation path.
This lets you immediately identify the Database as the root cause, rather than investigating each service individually.
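The popover values in this scenario can be reproduced with a small sketch. The helper `impact_report` is hypothetical, and counting only non-failing dependents in the downstream impact count is an assumption of this example rather than documented Onagre behaviour.

```python
def impact_report(depends_on, failing):
    """Return (impact_source, downstream_count) for the given failing services."""
    # Invert the graph so we can walk from a dependency to its dependents.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    impact_source = {}     # impacted service -> list of root-cause names
    downstream_count = {}  # root cause -> number of impacted services
    for root in failing:
        seen = {root}
        stack = [root]
        while stack:
            current = stack.pop()
            for dependent in dependents.get(current, []):
                if dependent not in seen:
                    seen.add(dependent)
                    stack.append(dependent)
        # Assumption: services that are failing themselves are not counted.
        impacted = seen - set(failing)
        downstream_count[root] = len(impacted)
        for service in impacted:
            impact_source.setdefault(service, []).append(root)
    return impact_source, downstream_count

# "Web App depends on API, which depends on Database."
chain = {"Web App": ["API"], "API": ["Database"], "Database": []}
sources, counts = impact_report(chain, failing=["Database"])
print(counts)              # {'Database': 2}
print(sources["API"])      # ['Database']
print(sources["Web App"])  # ['Database']
```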
Diamond dependencies
Propagation also handles complex topologies. Consider:
          ┌──→ Service B ──┐
    App ──┤                ├──→ Service D
          └──→ Service C ──┘
If Service D goes down, all three services that depend on it (B and C directly, App transitively) are marked as impacted, and all four propagation edges are highlighted.
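As a rough sketch of the diamond case, the function below collects both the impacted services and the edges a failure travels along. The name `propagation_edges` and the `(dependent, dependency)` edge tuples are assumptions made for illustration only.

```python
def propagation_edges(depends_on, root):
    """Return (impacted_services, edges) for a failure at root.

    Each edge is a (dependent, dependency) pair, matching the arrow
    direction "dependent -> dependency" used in the diagram.
    """
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    edges = set()
    visited = {root}
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for dependent in dependents.get(current, []):
            # Record every edge, even when the dependent was already reached
            # through another branch of the diamond.
            edges.add((dependent, current))
            if dependent not in visited:
                visited.add(dependent)
                frontier.append(dependent)
    return visited - {root}, edges

diamond = {
    "App": ["Service B", "Service C"],
    "Service B": ["Service D"],
    "Service C": ["Service D"],
    "Service D": [],
}
impacted, edges = propagation_edges(diamond, "Service D")
print(sorted(impacted))  # ['App', 'Service B', 'Service C']
print(len(edges))        # 4
```

App is reached twice (once through Service B, once through Service C) but is only marked once, while both of its outgoing edges end up in the highlighted set.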
Cycle handling
If your dependency graph contains cycles (e.g. A → B → A), propagation still terminates: each node is visited at most once per root cause, so the traversal never loops.
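One common way to get this guarantee is a visited set, shown in the sketch below. Whether Onagre uses exactly this technique is an assumption; the function `impacted_by` and the sample graph are hypothetical.

```python
from collections import deque

def impacted_by(depends_on, root):
    """List the services impacted by a failure at root, in breadth-first order."""
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    visited = {root}
    order = []
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent in visited:
                continue  # already handled for this root cause: this breaks the cycle
            visited.add(dependent)
            order.append(dependent)
            queue.append(dependent)
    return order

# A and B depend on each other (A → B → A); C depends on A.
cyclic = {"A": ["B"], "B": ["A"], "C": ["A"]}
print(impacted_by(cyclic, "A"))  # ['B', 'C']
```

Because a node is added to `visited` before it is queued, the walk from B back to A is skipped and the loop ends once the queue is empty.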
Summary
| Aspect | Details |
|---|---|
| Access | Monitoring → Service Map → Show propagation button |
| Root cause | Red glowing border, shows downstream impact count |
| Impacted | Orange dashed border, shows impact source name |
| Propagation path | Orange solid edges tracing the failure chain |
| Computation | Automatic, server-side, based on dependencies and sensor states |
| Topology support | Linear chains, diamonds, cycles — all handled |