Semantic Routing
Semantic routing uses vector embeddings and cosine similarity to match actions by meaning rather than exact field values. Instead of writing rigid string comparisons, you describe the topic you care about in natural language and Acteon determines whether each action's content is semantically close enough.
How It Works
flowchart LR
A[Action] --> B{semantic_match rule?}
B -->|Match| C[Embed text + topic]
C --> D{similarity >= threshold?}
D -->|Yes| E[Apply rule action]
D -->|No| F[Skip rule]
B -->|No match| F

- An action enters the rule engine and matches a semantic_match condition
- The action text (a specific field or the entire payload) is sent to the configured embedding API
- The topic description is embedded (or retrieved from cache)
- Cosine similarity is computed between the two vectors
- If similarity meets the threshold, the rule fires
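
The last two steps can be sketched in a few lines. This is a minimal illustration, not Acteon's actual code: the function names `cosine_similarity` and `rule_fires` are hypothetical, and the choice to return 0.0 for mismatched or zero-length vectors is an assumption made for the sketch.

```rust
/// Cosine similarity between two embedding vectors.
/// Assumption: mismatched lengths or zero-magnitude vectors yield 0.0.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let mut dot = 0.0_f64;
    let mut norm_a = 0.0_f64;
    let mut norm_b = 0.0_f64;
    for (x, y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(*x), f64::from(*y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a.sqrt() * norm_b.sqrt())
}

/// The rule fires only when the similarity meets the threshold.
fn rule_fires(text_embedding: &[f32], topic_embedding: &[f32], threshold: f64) -> bool {
    cosine_similarity(text_embedding, topic_embedding) >= threshold
}
```

Identical directions score 1.0 and orthogonal vectors score 0.0, so a threshold such as 0.75 admits only text whose embedding points in nearly the same direction as the topic's.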
Configuration
Server Configuration
[embedding]
enabled = true
endpoint = "https://api.openai.com/v1/embeddings"
model = "text-embedding-3-small"
api_key = "ENC[AES256-GCM,...]" # Use acteon-server encrypt
timeout_seconds = 10
fail_open = true # Return 0.0 on API failure
# Cache tuning
topic_cache_capacity = 10000 # Max cached topic embeddings
topic_cache_ttl_seconds = 3600 # Topic cache TTL (1 hour)
text_cache_capacity = 1000 # Max cached text embeddings
text_cache_ttl_seconds = 60 # Text cache TTL (1 minute)
Secret Management
Never store your embedding API key in plain text. Use acteon-server encrypt:
export ACTEON_AUTH_KEY="<hex-encoded 256-bit key>"
echo -n "sk-..." | acteon-server encrypt
# Output: ENC[AES256-GCM,...]
Paste the ENC[...] value into api_key in your acteon.toml.
Rule Configuration
rules:
  - name: route-infra-alerts
    priority: 5
    description: "Route infrastructure alerts to the DevOps team"
    condition:
      semantic_match: "Infrastructure issues, server problems, outages"
      threshold: 0.75
      text_field: action.payload.message
    action:
      type: reroute
      target_provider: devops-pagerduty
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| semantic_match | string | Yes | - | Topic description to match against |
| threshold | float | No | 0.8 | Minimum cosine similarity (0.0 to 1.0) |
| text_field | string | No | - | Dot-separated field path for the text to match. When omitted, the entire action payload is stringified |
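
The text_field lookup can be sketched as follows. This is only an illustration under stated assumptions: `Value`, `obj`, `lookup`, `stringify`, and `text_to_match` are hypothetical names, the evaluation root is assumed to hold the action under an `action` key, and the JSON-like form of a stringified payload is a guess rather than Acteon's documented format.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON value; a real server would likely use a JSON library.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Object(BTreeMap<String, Value>),
}

/// Convenience constructor for object values (illustrative helper).
fn obj(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Walk a dot-separated path such as "action.payload.message" from the root.
fn lookup<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    let mut current = root;
    for segment in path.split('.') {
        current = match current {
            Value::Object(map) => map.get(segment)?,
            Value::Text(_) => return None,
        };
    }
    Some(current)
}

/// Render a value as JSON-like text (assumed format for the stringified payload).
fn stringify(value: &Value) -> String {
    match value {
        Value::Text(s) => format!("{:?}", s),
        Value::Object(map) => {
            let fields: Vec<String> = map
                .iter()
                .map(|(k, v)| format!("{:?}:{}", k, stringify(v)))
                .collect();
            format!("{{{}}}", fields.join(","))
        }
    }
}

/// Pick the text to embed: the named field if `text_field` is set, else the whole payload.
fn text_to_match(root: &Value, text_field: Option<&str>) -> Option<String> {
    match text_field {
        Some(path) => match lookup(root, path)? {
            Value::Text(s) => Some(s.clone()),
            other => Some(stringify(other)),
        },
        None => lookup(root, "action.payload").map(stringify),
    }
}
```

Pointing text_field at a single descriptive field keeps unrelated payload keys out of the embedded text, which usually gives a cleaner similarity signal than embedding the whole payload.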
Embedding Provider Interface
#[async_trait]
pub trait EmbeddingEvalSupport: Send + Sync {
    async fn similarity(&self, text: &str, topic: &str) -> Result<f64, RuleError>;
}
The built-in EmbeddingBridge wraps any OpenAI-compatible embedding API and adds caching, fail-open resilience, and metrics.
Caching
Embedding API calls are expensive and add latency. Acteon uses a two-tier cache to minimize external calls:
| Tier | What it caches | Default capacity | Default TTL |
|---|---|---|---|
| Topic cache | Embeddings for topic descriptions (from rule definitions) | 10,000 | 1 hour |
| Text cache | Embeddings for action text (from payloads) | 1,000 | 1 minute |
Topic embeddings change infrequently (only when rules change), so they get a long TTL and large capacity. Text embeddings are transient action data, so they get a short TTL.
Both caches provide thundering herd protection — concurrent requests for the same key are coalesced into a single API call.
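
A single cache tier with a capacity limit and TTL might look like the sketch below. It is not Acteon's implementation: `TtlCache` and `get_or_embed` are hypothetical names, evicting the oldest entry at capacity is an assumed policy (the eviction strategy is not documented here), and request coalescing is left out to keep the example short.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical single-tier embedding cache with capacity and TTL limits.
struct TtlCache {
    capacity: usize,
    ttl: Duration,
    entries: HashMap<String, (Instant, Vec<f32>)>,
    hits: u64,
    misses: u64,
}

impl TtlCache {
    fn new(capacity: usize, ttl: Duration) -> Self {
        Self { capacity, ttl, entries: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Return the cached embedding, or call `embed` on a miss or an expired entry.
    fn get_or_embed<F>(&mut self, key: &str, embed: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        let now = Instant::now();
        if let Some((stored_at, vector)) = self.entries.get(key) {
            if now.duration_since(*stored_at) < self.ttl {
                self.hits += 1;
                return vector.clone();
            }
        }
        self.misses += 1;
        let vector = embed(key);
        if self.entries.len() >= self.capacity && !self.entries.contains_key(key) {
            // Assumed policy: evict the oldest entry to stay within capacity.
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (stored_at, _))| *stored_at)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }
        self.entries.insert(key.to_string(), (now, vector.clone()));
        vector
    }
}
```

Under these assumptions, the topic tier would be built as `TtlCache::new(10_000, Duration::from_secs(3600))` and the text tier as `TtlCache::new(1_000, Duration::from_secs(60))`, matching the defaults in the table.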
Cache Pre-warming
On startup, Acteon walks all loaded rules and extracts every semantic_match topic. These topics are embedded immediately so the first requests don't pay cold-start latency.
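
The walk over loaded rules could be sketched like this. The `Condition` enum and the `collect_topics` and `prewarm` functions are hypothetical and only approximate the rule shapes shown in this page; de-duplicating topics before embedding is an assumption, not documented behavior.

```rust
/// Hypothetical, simplified model of a rule condition tree.
#[allow(dead_code)]
enum Condition {
    SemanticMatch { topic: String, threshold: f64 },
    FieldEq { field: String, value: String },
    All(Vec<Condition>),
    Any(Vec<Condition>),
}

/// Recursively gather every distinct semantic_match topic, including nested ones.
fn collect_topics(condition: &Condition, out: &mut Vec<String>) {
    match condition {
        Condition::SemanticMatch { topic, .. } => {
            if !out.contains(topic) {
                out.push(topic.clone());
            }
        }
        Condition::All(children) | Condition::Any(children) => {
            for child in children {
                collect_topics(child, out);
            }
        }
        Condition::FieldEq { .. } => {}
    }
}

/// Embed each distinct topic once at startup; returns how many topics were warmed.
fn prewarm<F: FnMut(&str)>(rules: &[Condition], mut embed: F) -> usize {
    let mut topics = Vec::new();
    for rule in rules {
        collect_topics(rule, &mut topics);
    }
    for topic in &topics {
        embed(topic.as_str());
    }
    topics.len()
}
```

Here `embed` stands in for whatever populates the topic cache; topics nested inside all / any blocks are warmed as well.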
Memory Footprint
Each cached embedding is a Vec<f32> whose size depends on the model dimension. For text-embedding-3-small (1536 dimensions):
| Cache | Worst case |
|---|---|
| Topic cache (10,000 entries) | ~60 MB |
| Text cache (1,000 entries) | ~6 MB |
Adjust topic_cache_capacity and text_cache_capacity for your environment.
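
The worst-case figures follow from entries × dimensions × 4 bytes per f32. A small helper (the name `cache_vector_bytes` is hypothetical) makes it easy to redo the estimate for other models or capacities; it counts raw vector data only, so keys and per-entry bookkeeping would add to the real footprint.

```rust
/// Bytes of raw vector data for `entries` embeddings of `dimensions` f32 values each.
/// Ignores key strings and per-entry overhead, so actual usage is somewhat higher.
fn cache_vector_bytes(entries: usize, dimensions: usize) -> usize {
    entries * dimensions * std::mem::size_of::<f32>()
}
```

For 1536 dimensions this gives 61,440,000 bytes (about 61 MB) for 10,000 topic entries and 6,144,000 bytes (about 6 MB) for 1,000 text entries, in line with the rounded values in the table.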
Fail-Open Behavior
When fail_open = true (the default), embedding API failures return a similarity of 0.0 instead of propagating an error. Because 0.0 falls below any positive threshold, semantic_match conditions evaluate to false on API failure rather than failing the entire dispatch pipeline.
When fail_open = false, embedding errors propagate and the action dispatch fails.
Note
This mirrors the fail_open behavior of LLM Guardrails.
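
The two modes can be sketched as a small wrapper around the embedding result. This is an illustrative sketch only: `EmbeddingError` and `resolve_similarity` are hypothetical names, and incrementing a counter on each fail-open fallback is an assumption based on the fail_open_count metric.

```rust
/// Hypothetical error type standing in for an embedding API failure.
#[derive(Debug, PartialEq)]
struct EmbeddingError(String);

/// Apply the fail_open policy to the outcome of a similarity request.
fn resolve_similarity(
    result: Result<f64, EmbeddingError>,
    fail_open: bool,
    fail_open_count: &mut u64,
) -> Result<f64, EmbeddingError> {
    match result {
        Ok(similarity) => Ok(similarity),
        // Fail open: swallow the error and report zero similarity.
        Err(_) if fail_open => {
            *fail_open_count += 1;
            Ok(0.0)
        }
        // Fail closed: propagate the error so the dispatch fails.
        Err(err) => Err(err),
    }
}
```

With fail-open, an outage of the embedding provider silently disables semantic rules rather than blocking actions, so the fallback counter is worth alerting on.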
Metrics
Embedding cache and error metrics are exposed via the /metrics and /health endpoints when the embedding provider is configured:
{
  "embedding": {
    "topic_cache_hits": 1500,
    "topic_cache_misses": 10,
    "text_cache_hits": 800,
    "text_cache_misses": 200,
    "errors": 2,
    "fail_open_count": 2
  }
}
Monitor text_cache_misses and errors to tune your cache sizes and detect provider issues.
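
One way to read these counters is as a hit rate per tier. The helper below is a hypothetical convenience, not part of Acteon; it returns None when no lookups have been recorded.

```rust
/// Fraction of lookups served from cache, or None if there were no lookups.
fn hit_rate(hits: u64, misses: u64) -> Option<f64> {
    let total = hits + misses;
    if total == 0 {
        None
    } else {
        Some(hits as f64 / total as f64)
    }
}
```

With the sample values above, the topic cache serves 1500 of 1510 lookups (roughly 99%) and the text cache 800 of 1000 (80%). A persistently low text hit rate may suggest raising text_cache_capacity or text_cache_ttl_seconds.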
REST API
The embedding subsystem also exposes a direct similarity endpoint for testing and debugging. See the REST API Reference for details.
Use Cases
Topic-Based Alert Routing
Route alerts to the right team based on content meaning, not exact string matches:
rules:
  - name: route-infra
    priority: 5
    condition:
      semantic_match: "Infrastructure issues, server outages, disk failures"
      threshold: 0.75
      text_field: action.payload.description
    action:
      type: reroute
      target_provider: devops-pagerduty
  - name: route-billing
    priority: 5
    condition:
      semantic_match: "Billing problems, payment failures, subscription issues"
      threshold: 0.75
      text_field: action.payload.description
    action:
      type: reroute
      target_provider: billing-team
Content-Based Suppression
Suppress noise alerts that match a known pattern:
- name: suppress-maintenance-noise
  priority: 1
  condition:
    all:
      - field: action.action_type
        eq: "alert"
      - semantic_match: "Scheduled maintenance, planned downtime, maintenance window"
        threshold: 0.8
        text_field: action.payload.message
  action:
    type: suppress
    reason: "Matches maintenance noise pattern"
Semantic Deduplication
Deduplicate alerts that say the same thing in different words:
- name: dedup-similar-alerts
  priority: 3
  condition:
    semantic_match: "Database connection pool exhausted"
    threshold: 0.9
    text_field: action.payload.summary
  action:
    type: deduplicate
    ttl_seconds: 600
Combining with Other Conditions
semantic_match can be used inside all / any blocks alongside standard field conditions: