Migrating from Prometheus Alertmanager¶
Acteon now covers every major Alertmanager primitive, so you can lift an existing alertmanager.yml into Acteon with a single CLI command, review the diff, and apply. This guide walks through that process, explains how each Alertmanager concept maps to Acteon, and calls out the gotchas worth knowing before you cut over.
Why migrate¶
If your Alertmanager deployment is doing exactly one job — fan-out notifications with grouping, silences, and inhibition — Acteon will feel familiar but heavier. Acteon is most useful when you want more than notification routing:
- A real audit trail for every dispatched action with structured query support and optional hash-chained tamper-evidence.
- LLM guardrails that can deny, modify, or require human approval for actions whose payloads match a freeform policy.
- Task chains — multi-step workflows that branch on previous-step output, run sub-steps in parallel, and cancel cleanly.
- Multi-provider routing with circuit breakers, fallback chains, and per-tenant quotas.
- Action signing so dispatchers can prove an action came from a trusted source and replay protection rejects duplicates.
The tradeoff is operational complexity: Acteon is a stateful gateway with its own state store and audit backend. If you only need silent fan-out, Alertmanager is simpler. If you need any of the items above, read on.
Concept mapping¶
| Alertmanager | Acteon | Notes |
|---|---|---|
route (root) | Default rule fall-through | Root receiver becomes the default reroute for actions that don't match any explicit rule. |
route.routes (children) | Individual rules: entries | Each child becomes one rule with a condition and an action. Priority follows depth-first order. |
match: { k: v } | field: action.metadata.k\neq: v | Equality match on action metadata labels. |
match_re: { k: pattern } | field: action.metadata.k\nmatches: pattern | Regex match. Anchored on Acteon side. |
group_by, group_wait, group_interval, repeat_interval | action: { type: group, ... } | Translated 1:1 by the importer. See Event Grouping. |
receivers | Entries in providers.toml | Each receiver becomes one provider. Slack receivers without an API URL fall back to a log provider with a TODO comment. |
inhibit_rules | action: { type: suppress } | Each inhibit rule becomes a suppress rule against the target matchers. Source-side cross-checking is approximated via priority ordering. |
silences (runtime) | Silences — POST /v1/silences | Identical model: label matchers, time window, hierarchical tenant inheritance. |
time_intervals / mute_time_intervals (top-level + per-route) | Time intervals — POST /v1/time-intervals + per-rule mute_time_intervals / active_time_intervals | Importer translates Alertmanager's textual weekday and month forms (monday:friday, jan:dec) into Acteon's numeric ranges. |
templates | Payload templates | Acteon uses MiniJinja, not Go's text/template. Templates are not auto-translated — you'll need to port any non-trivial templates by hand. |
| Cluster gossip (HA) | State-store sync | Acteon HA uses a shared state backend (Redis, Postgres, DynamoDB) with periodic sync intervals. No gossip protocol. |
Step-by-step migration¶
1. Run the importer¶
acteon import alertmanager \
--config /etc/alertmanager/alertmanager.yml \
--output-dir ./acteon-config \
--default-namespace prod \
--default-tenant acme
The importer prints a one-line summary to stderr and writes:
acteon-config/providers.toml— one entry per Alertmanager receiver.acteon-config/rules.yaml— one rule per route child + one rule per inhibit rule.acteon-config/time-intervals.yaml— one entry per top-leveltime_intervals:definition (only when present).
Pass --dry-run to print everything to stdout instead of writing files. That's the recommended first invocation so you can eyeball the diff before anything touches disk.
2. Review the generated providers¶
Open providers.toml and replace every TODO placeholder. The importer copies what it can from your Alertmanager global: block (SMTP host/from, Slack API URL, etc.), but anything secret-shaped (routing keys, API tokens, webhook URLs) is left as-is from your config — review and rotate them through your secret manager rather than committing them.
A few receiver families have caveats:
- VictorOps + Pushover are feature-gated in
acteon-server. The generatedproviders.tomlincludes# Requires --features ...comments. Build with the matching cargo features or the registry will refuse to start. - Slack receivers without a
slack_api_urlare translated to alogprovider so you don't end up with a broken provider after import. Either setslack_api_urlin the Alertmanagerglobal:block before re-running the import, or change the type toslackand supply a webhook URL by hand.
3. Review the generated rules¶
Open rules.yaml. Each child route becomes one rule named imported-<label-hint>-<priority>. The importer preserves:
- Priority ordering (depth-first, root first).
match→ equality conditions onaction.metadata.<label>.match_re→ anchored regex conditions.group_by+group_wait+group_interval+repeat_interval→action: { type: group, ... }.- Inherited
mute_time_intervals/active_time_intervalsfrom the enclosing route.
inhibit_rules get one suppress rule each at the bottom of the file. This is an approximation — Acteon doesn't have Alertmanager's "inhibit when source matches" cross-correlation. The generated rule will suppress all actions matching the target labels regardless of whether a source-matching alert is currently firing. Add an explicit condition (e.g., action.metadata.severity == "warning" plus a check for an existing critical via the audit trail) if you need true inhibition.
4. Review the generated time intervals¶
If your Alertmanager config has a top-level time_intervals: (or the legacy mute_time_intervals:) list, the importer also writes time-intervals.yaml. Each entry is namespaced with --default-namespace and --default-tenant. Apply them via:
for entry in $(yq '.time_intervals[] | @json' time-intervals.yaml); do
echo "$entry" | curl -X POST $ACTEON_URL/v1/time-intervals \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
--data @-
done
(A bulk-load command is on the roadmap. For now, one POST per entry.)
The importer translates Alertmanager's textual day and month forms to Acteon's numeric ranges:
| Alertmanager | Acteon |
|---|---|
monday:friday | { start: 1, end: 5 } |
saturday | { start: 6, end: 6 } |
jan:dec | { start: 1, end: 12 } |
1:31 | { start: 1, end: 31 } |
-1 | { start: -1, end: -1 } (last day of month) |
5. Apply the providers and rules¶
Move the files into the locations Acteon reads at startup:
cp acteon-config/providers.toml /etc/acteon/providers.toml
cp acteon-config/rules.yaml /etc/acteon/rules/imported.yaml
Restart the gateway. The startup logs will report the loaded provider count and rule count. Use acteon rules list (or GET /v1/rules) to confirm the rules came through.
6. Cut over traffic¶
Acteon doesn't have an Alertmanager-protocol receiver out of the box. Two cutover patterns work in practice:
- Run side-by-side with a webhook receiver. Add a
webhookreceiver to your existing Alertmanager that POSTs to$ACTEON_URL/v1/dispatch. Acteon will dispatch through your new rules; Alertmanager continues to dispatch through its old routes in parallel. Compare audit trails for a few days, then delete the old receivers from Alertmanager. - Drop Alertmanager entirely. Update your Prometheus
alerting:section to point at Acteon's/v1/dispatch/alertmanagerendpoint. (This endpoint is on the roadmap; until then, the side-by-side pattern is the safer cutover.)
What's still different¶
A handful of Alertmanager features don't map directly. None of these are showstoppers, but they're worth knowing before you cut over:
- Templates. Alertmanager's Go-template
templates:directory is not auto-translated. Acteon uses MiniJinja for payload templates. Functionally equivalent for most use cases — variable substitution, conditionals, loops — but the syntax is different and you'll need to port each template by hand. - Cluster gossip. Alertmanager uses a custom HashiCorp memberlist gossip protocol to coordinate silences and notification dedup across replicas. Acteon uses a shared state backend (Redis, Postgres, DynamoDB) with periodic sync. The default sync interval is 10 seconds for silences and 30 seconds for time intervals; silences and intervals created on one replica become visible on peers within that window.
- Inhibition correlation. Alertmanager's inhibit_rules suppress a target alert only when a source alert is currently firing. The importer approximates this with a static suppress rule on the target labels — you'll lose the source-cross-check unless you wire up an explicit lookup against the audit trail in a custom rule condition.
continue: trueon routes. Alertmanager allows a child route to fall through to its siblings whencontinue: trueis set. Acteon rules are first-match-wins. The importer ignorescontinuetoday; if your routing tree depends on it, you'll need to expand the affected branches into multiple rules by hand.- Per-receiver group_by overrides. Alertmanager lets you set
group_by: ['...']on the receiver itself. Acteon'sgrouprule action carriesgroup_by, so per-route overrides translate cleanly, but per-receiver-only overrides do not — fold them into the matching route before importing.
Worked example¶
The repo ships a complete sample at examples/alertmanager-with-time-intervals.yml that exercises every primitive the importer supports today. Run:
cargo run -p acteon-cli -- import alertmanager \
--config examples/alertmanager-with-time-intervals.yml \
--default-namespace prod \
--default-tenant acme \
--dry-run
Input¶
route:
receiver: default-slack
group_by: [alertname, cluster]
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
routes:
- match: { severity: critical }
receiver: pagerduty-oncall
mute_time_intervals:
- planned-maintenance
- match: { team: database }
receiver: opsgenie-database
active_time_intervals:
- business-hours
receivers:
- name: pagerduty-oncall
pagerduty_configs:
- routing_key: "REPLACE_WITH_PD_ROUTING_KEY"
- name: opsgenie-database
opsgenie_configs:
- api_key: "REPLACE_WITH_OPSGENIE_API_KEY"
time_intervals:
- name: business-hours
time_intervals:
- times: [{ start_time: "09:00", end_time: "17:00" }]
weekdays: ["monday:friday"]
location: "America/New_York"
Output¶
[[providers]]
name = "pagerduty-oncall"
type = "pagerduty"
routing_key = "REPLACE_WITH_PD_ROUTING_KEY"
[[providers]]
name = "opsgenie-database"
type = "opsgenie"
opsgenie.api_key = "REPLACE_WITH_OPSGENIE_API_KEY"
rules:
- name: imported-critical-1
priority: 1
condition:
field: action.metadata.severity
eq: "critical"
action:
type: reroute
target_provider: pagerduty-oncall
mute_time_intervals:
- planned-maintenance
- name: imported-database-2
priority: 2
condition:
field: action.metadata.team
eq: "database"
action:
type: reroute
target_provider: opsgenie-database
active_time_intervals:
- business-hours
time_intervals:
- name: business-hours
namespace: prod
tenant: acme
location: "America/New_York"
time_ranges:
- times:
- start: "09:00"
end: "17:00"
weekdays:
- start: 1
end: 5
The full output (5 providers, 6 rules, 2 time intervals) is what cargo run -- import alertmanager --dry-run prints when you point it at the sample.
Related reading¶
- Time intervals — the recurring schedule primitive that mirrors Alertmanager's
time_intervals. - Silences — runtime, time-bounded label mutes (the same mental model as Alertmanager silences).
- Event grouping —
group_wait,group_interval,repeat_intervalsemantics. - Suppression — the rule action
inhibit_rulestranslates into. - Audit trail — the structured log Acteon adds on top of dispatch.