Embedding Hunt

Verified · Safe

Uses embedding similarity to hunt for related behaviors across the environment. Starting from a confirmed malicious finding, it identifies behavior clusters, associated entities, and variations of known attacks. Useful for assessing the scope of an incident and detecting coordinated threats.

Spar Skills Guide Bot
Security · Intermediate
60 · 02/06/2026
Claude Code
#soc #hunting #clustering #investigation #embedding

Our take

Uses embedding similarity to hunt for related behaviors across entities and identify coordinated threats.

Strengths

  • Detects similar behaviors that classic rules miss
  • Multi-dimensional analysis (entities, timing, techniques)
  • Progressive exploration (increasing k) suited to SOC work

Limitations

  • Requires a specific MCP server (DeepTempo Findings Server)
  • Depends on the quality of the embeddings generated upstream
  • Can produce a high volume of false positives without filtering

When to use

When you have a confirmed malicious finding and want to assess how far it extends across the environment.

When to avoid

For a quick investigation without a reliable embedding source or without access to the required findings server.

Security analysis

Safe
Quality score: 90/100

The skill only uses read-only MCP calls to retrieve findings and embeddings for analysis. It does not instruct any destructive actions, code execution, or data exfiltration.

No issues detected

Examples

Expand from single finding
Use embedding hunt with seed finding ID 'finding_abc123' and k=50 to find similar activity. Filter by data_source=flow and min_anomaly_score=0.8.
Temporal scope assessment
Run an embedding hunt from finding 'finding_xyz' with k=100, then group results by source IP and time window to identify sub-clusters.
Technique-based refinement
For finding 'finding_001', perform nearest neighbors with k=30 and filters on MITRE technique T1566, then generate a hunt report.

name: embedding-hunt
description: Pivot from one embedding to discover behavior clusters, find related activity across entities, and identify patterns that may indicate coordinated or widespread threats
version: 1.0.0
author: DeepTempo
tags:
  • soc
  • hunting
  • clustering
  • investigation
requires:
  • mcp/deeptempo-findings-server

Embedding Hunt

Use embedding similarity to hunt for related behaviors across the environment.

When to Use

Use this skill when:

  • You have a confirmed malicious finding and want to find similar activity
  • Investigating whether a behavior is isolated or widespread
  • Looking for variations of a known attack pattern
  • Building a comprehensive view of an incident

Prerequisites

  • Access to the DeepTempo Findings Server MCP
  • A seed finding ID or embedding vector
  • Understanding of behavioral similarity concepts

Instructions

Step 1: Establish the Seed

Start with a known finding that represents the behavior you want to hunt:

get_finding(finding_id="<seed_finding_id>")

Document:

  • The behavioral pattern this finding represents
  • Key characteristics (entities, techniques, timing)
  • Why this is the hunting seed

Step 2: Expand the Search

Use nearest neighbors with increasing k values:

# Start narrow
nearest_neighbors(query="<seed_id>", k=10)

# Expand if pattern holds
nearest_neighbors(query="<seed_id>", k=50)

# Wide search for scope assessment
nearest_neighbors(query="<seed_id>", k=100)
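
The widening search above can be scripted as a loop that stops once the pattern no longer holds. This is a minimal Python sketch, not the skill's implementation: `expand_hunt`, the `search` callable, the `similarity` field, and both thresholds are hypothetical names chosen for illustration. In practice the agent issues `nearest_neighbors` as an MCP tool call, and the response schema may differ.

```python
from typing import Any, Callable, Dict, List, Sequence

Neighbor = Dict[str, Any]


def expand_hunt(
    search: Callable[[str, int], List[Neighbor]],
    seed_id: str,
    k_values: Sequence[int] = (10, 50, 100),
    min_similarity: float = 0.8,   # assumed cutoff for "still close to the seed"
    min_fraction: float = 0.5,     # assumed share of close neighbors needed to keep going
) -> Dict[int, List[Neighbor]]:
    """Run the search at each k in turn, stopping once the pattern stops holding."""
    levels: Dict[int, List[Neighbor]] = {}
    for k in k_values:
        neighbors = search(seed_id, k)
        levels[k] = neighbors
        if not neighbors:
            break
        strong = sum(1 for n in neighbors if float(n["similarity"]) >= min_similarity)
        # "Pattern holds" is approximated as: enough neighbors remain close to the seed.
        if strong / len(neighbors) < min_fraction:
            break
    return levels


def fake_search(seed_id: str, k: int) -> List[Neighbor]:
    """Stand-in for the MCP call: similarity decays linearly with rank."""
    return [{"finding_id": f"f{i}", "similarity": (100 - i) / 100} for i in range(k)]


levels = expand_hunt(fake_search, "seed")
```

Keeping every level in the returned dictionary lets the analyst compare how the result set changes between k values rather than looking only at the widest search.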

Step 3: Analyze the Cluster

For each expansion level, analyze:

  1. Similarity Distribution: How quickly does similarity drop off?
  2. Entity Distribution: Same entity or multiple entities?
  3. Temporal Distribution: Clustered in time or spread out?
  4. Technique Consistency: Do neighbors share MITRE predictions?
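
The four questions above can be turned into a small per-level summary. The sketch below is illustrative only: the field names `similarity`, `timestamp`, `entity`, and `technique` are assumptions about what a neighbor record contains, not the documented response schema of the Findings Server.

```python
from collections import Counter
from datetime import datetime
from typing import Any, Dict, List


def _parse(ts: str) -> datetime:
    # fromisoformat() accepts a trailing "Z" only on Python 3.11+, so normalize it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def summarize_level(neighbors: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize one expansion level along the four dimensions listed above."""
    if not neighbors:
        raise ValueError("no neighbors to summarize")
    sims = sorted((n["similarity"] for n in neighbors), reverse=True)
    times = sorted(_parse(n["timestamp"]) for n in neighbors)
    entities = Counter(n["entity"] for n in neighbors)
    top_technique, top_count = Counter(n["technique"] for n in neighbors).most_common(1)[0]
    return {
        "similarity_drop": sims[0] - sims[-1],          # similarity distribution
        "unique_entities": len(entities),               # entity distribution
        "time_span_seconds": (times[-1] - times[0]).total_seconds(),  # temporal distribution
        "top_technique": top_technique,                 # technique consistency
        "top_technique_share": top_count / len(neighbors),
    }
```

Comparing these summaries between k=10, k=50, and k=100 shows whether a wider search adds new entities and techniques or mostly more of the same.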

Step 4: Apply Filters

Refine the hunt with filters:

# Filter by data source
nearest_neighbors(query="<seed_id>", k=50, filters={"data_source": "flow"})

# Filter by time range
nearest_neighbors(query="<seed_id>", k=50, filters={
    "time_range": {"start": "2024-01-15T00:00:00Z", "end": "2024-01-15T23:59:59Z"}
})

# Filter by minimum anomaly score
nearest_neighbors(query="<seed_id>", k=50, filters={"min_anomaly_score": 0.7})
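
The calls above each pass a single filter. A small helper can assemble the `filters` argument from optional parts. This sketch assumes the server accepts several keys in one `filters` object, which the calls above do not confirm; `build_filters` is a hypothetical helper, while the key names `data_source`, `time_range`, and `min_anomaly_score` are taken from the calls above.

```python
from typing import Any, Dict, Optional


def build_filters(
    data_source: Optional[str] = None,
    start: Optional[str] = None,
    end: Optional[str] = None,
    min_anomaly_score: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble a filters dict, leaving out any filter that was not requested."""
    filters: Dict[str, Any] = {}
    if data_source is not None:
        filters["data_source"] = data_source
    if (start is None) != (end is None):
        raise ValueError("start and end must be given together")
    if start is not None and end is not None:
        # ISO-8601 strings in the same format and timezone sort chronologically.
        if start > end:
            raise ValueError("start must not be later than end")
        filters["time_range"] = {"start": start, "end": end}
    if min_anomaly_score is not None:
        filters["min_anomaly_score"] = min_anomaly_score
    return filters


filters = build_filters(data_source="flow", min_anomaly_score=0.7)
```

If the server turns out to accept only one filter per call, the same helper can still be used with a single argument at a time.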

Step 5: Identify Sub-Clusters

Look for natural groupings within results:

  • Group by source IP
  • Group by destination
  • Group by time window
  • Group by technique
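
The groupings above can be computed directly from the neighbor list. The following sketch assumes each result carries `finding_id`, `src_ip`, and `timestamp` fields; these names and the one-hour default window are illustrative choices, not part of the skill.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Any, Dict, List


def group_by(neighbors: List[Dict[str, Any]], key: str) -> Dict[Any, List[str]]:
    """Map each distinct value of `key` (e.g. "src_ip") to its finding IDs."""
    groups: Dict[Any, List[str]] = defaultdict(list)
    for n in neighbors:
        groups[n[key]].append(n["finding_id"])
    return dict(groups)


def group_by_time_window(
    neighbors: List[Dict[str, Any]], window_seconds: int = 3600
) -> Dict[str, List[str]]:
    """Bucket findings into fixed windows, keyed by the window's UTC start time."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for n in neighbors:
        # Normalize a trailing "Z" so this also parses on Python versions before 3.11.
        ts = datetime.fromisoformat(n["timestamp"].replace("Z", "+00:00"))
        bucket = int(ts.timestamp()) // window_seconds * window_seconds
        label = datetime.fromtimestamp(bucket, tz=timezone.utc).isoformat()
        groups[label].append(n["finding_id"])
    return dict(groups)
```

Calling `group_by(results, "src_ip")` and `group_by_time_window(results)` on the same result set gives two views that can be cross-checked: a group that is tight in both source and time is a stronger sub-cluster candidate than one that is tight in only one.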

Step 6: Generate Hunt Report

Document findings following the output format.

Output Format

# Embedding Hunt Report

**Seed Finding**: [Finding ID]
**Hunt Timestamp**: [Current Time]
**Status**: Requires Human Review

## Seed Behavior Summary

[Describe the behavior pattern being hunted]

### Seed Characteristics
| Attribute | Value |
|-----------|-------|
| Data Source | [source] |
| Primary Technique | [technique] |
| Anomaly Score | [score] |
| Key Entity | [entity] |

## Hunt Results

### Scope Summary

| Metric | Value |
|--------|-------|
| Total Similar Findings | [count] |
| Unique Source IPs | [count] |
| Unique Destinations | [count] |
| Unique Hostnames | [count] |
| Time Span | [duration] |

### Similarity Distribution

| Similarity Range | Count | Interpretation |
|------------------|-------|----------------|
| 0.95 - 1.00 | [n] | Near-identical behavior |
| 0.90 - 0.95 | [n] | Very similar |
| 0.80 - 0.90 | [n] | Related pattern |
| 0.70 - 0.80 | [n] | Loosely related |

### Entity Analysis

#### Affected Entities
| Entity | Finding Count | First Seen | Last Seen |
|--------|---------------|------------|-----------|
| [entity] | [count] | [time] | [time] |

#### Entity Relationships
[Describe connections between entities]

### Temporal Analysis

[Describe timing patterns:
- When did activity start?
- Is it ongoing?
- Are there bursts or steady activity?]

### Technique Distribution

| Technique | Findings | Avg Confidence |
|-----------|----------|----------------|
| [T####] | [count] | [avg] |

## Identified Clusters

### Cluster 1: [Label]
- **Findings**: [count]
- **Common Characteristic**: [description]
- **Entities**: [list]
- **Assessment**: [interpretation]

### Cluster 2: [Label]
[Repeat structure]

## Hunt Conclusions

### Pattern Assessment
[Is this isolated or widespread? Coordinated or independent?]

### Threat Assessment
[What does the scope tell us about the threat?]

### Confidence Level
[High/Medium/Low] - [Reasoning]

## Recommended Actions

### Immediate
1. [Action]

### Investigation
1. [Action]

### Monitoring
1. [Action]

---
*This report was generated by Claude using the Embedding Hunt skill.*
*All findings require human validation.*
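
The Similarity Distribution table in the template can be filled in with a simple bucketing pass. In this sketch the band labels come from the template, while the boundary handling is an assumption: each band includes its lower bound, so a score of exactly 0.95 counts as near-identical. Scores below 0.70 are counted separately because the template has no row for them.

```python
from typing import Dict, Iterable

# (lower bound, label) pairs, highest band first; labels match the report template.
BANDS = [
    (0.95, "0.95 - 1.00"),
    (0.90, "0.90 - 0.95"),
    (0.80, "0.80 - 0.90"),
    (0.70, "0.70 - 0.80"),
]


def similarity_distribution(similarities: Iterable[float]) -> Dict[str, int]:
    """Count similarity scores per report band."""
    counts = {label: 0 for _, label in BANDS}
    counts["< 0.70"] = 0
    for score in similarities:
        for lower, label in BANDS:
            if score >= lower:
                counts[label] += 1
                break
        else:
            # No band matched: the score is below the lowest reported range.
            counts["< 0.70"] += 1
    return counts
```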

Examples

Example 1: Hunting from Confirmed C2

Seed: Confirmed C2 beacon from compromised host
Hunt Goal: Find other compromised hosts

Approach:

  1. Use seed embedding to find similar beaconing patterns
  2. Filter to exclude the seed host
  3. Group results by source IP
  4. Each unique source IP is a potential compromise

Example 2: Hunting Lateral Movement

Seed: Detected lateral movement attempt
Hunt Goal: Map the full movement path

Approach:

  1. Find similar authentication/movement patterns
  2. Build timeline of activity
  3. Identify source and destination hosts
  4. Reconstruct the movement chain
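
The chain reconstruction in this approach can be sketched as a time-ordered walk: a host can only act as a source after it has itself been reached. The code below is one simple way to do this under stated assumptions, not the skill's method. The field names `src_host`, `dst_host`, and `timestamp` are hypothetical, and the sketch keeps only the first hop into each host.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


def movement_hops(events: List[Dict[str, Any]], origin: str) -> List[Tuple[str, str]]:
    """Return (source, destination) hops reachable from `origin` in time order."""

    def when(event: Dict[str, Any]) -> datetime:
        # Normalize a trailing "Z" so this also parses on Python versions before 3.11.
        return datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))

    reached = {origin}
    hops: List[Tuple[str, str]] = []
    for event in sorted(events, key=when):
        src, dst = event["src_host"], event["dst_host"]
        # Follow a hop only if its source is already part of the chain
        # and its destination is new to the chain.
        if src in reached and dst not in reached:
            reached.add(dst)
            hops.append((src, dst))
    return hops
```

Events whose source host was never reached from the origin are left out, which makes them easy to review separately as possible unrelated activity or a second entry point.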

Guidelines

  1. Start narrow, expand gradually - Don't overwhelm with too many results initially
  2. Document the seed clearly - Others need to understand what you're hunting
  3. Look for natural breakpoints - Similarity drop-offs indicate cluster boundaries
  4. Consider false positives - High similarity doesn't guarantee malicious activity
  5. Time-bound your hunt - Set reasonable time windows
  6. Validate findings - Spot-check results for relevance
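
Guideline 3 can be made concrete by looking for the largest gap between consecutive similarity scores. This is a heuristic sketch, not a method prescribed by the skill: `largest_gap_cutoff` and the `min_gap` threshold of 0.05 are illustrative assumptions, and a real hunt may need a different threshold or a proper clustering method.

```python
from typing import List, Optional


def largest_gap_cutoff(similarities: List[float], min_gap: float = 0.05) -> Optional[int]:
    """Return how many top-ranked findings sit above the biggest similarity drop.

    Returns None when no drop between neighbors in rank reaches `min_gap`.
    """
    ordered = sorted(similarities, reverse=True)
    best_gap = 0.0
    best_index: Optional[int] = None
    for i in range(len(ordered) - 1):
        gap = ordered[i] - ordered[i + 1]
        if gap > best_gap:
            best_gap, best_index = gap, i
    if best_index is None or best_gap < min_gap:
        return None
    return best_index + 1
```

A `None` result means the scores decline smoothly, in which case a fixed similarity threshold or the analyst's judgment has to define the cluster boundary instead.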

Constraints

  • Do not assume all similar findings are malicious
  • Validate clusters before drawing conclusions
  • Note limitations of embedding similarity
  • Require human review for any response actions
  • Document methodology so hunts are reproducible