Project 02 · 2025 · 12 min read

SOC-AI Agent.

An autonomous incident-triage analyst. Parses your logs. Enriches every IOC. Maps observed behavior to MITRE ATT&CK. Tells you what is happening with a reasoning chain you can read.

Type: Full stack agent

Stack: Python 3.11 · FastAPI · React 18

Coverage: 80+ ATT&CK · 30+ rules

Status: Active build

I. The 3 am problem.

A small security operations center at 3 am is one analyst, four hundred queued alerts, and a coffee. Every alert requires the same pattern of work. Pull the relevant logs. Identify the indicators. Enrich them. Decide whether what you are looking at is a real incident, a tuning problem, or background noise. Write the verdict. Move on.

The pattern is mechanical. The wall clock is not. By the time the analyst has triaged the first hundred alerts the queue has grown by two hundred more. The job is not hard. The job is volume.

The agent does the mechanical part. It does not replace the analyst. It hands the analyst a triaged queue with verdicts, evidence, and a MITRE map for each item. The analyst spends time on the items that need a person.

II. The architecture.

The system is split into a Python backend and a React frontend. The backend ingests raw events through a small set of parsers, fans out enrichment in parallel, runs behavioral rules, performs MITRE mapping, and emits a verdict per case. The frontend renders the verdict as a live document with the reasoning visible at every step.

Communication between the two is over WebSockets, so the UI sees the triage happen in real time. Reports finalize as streaming HTML and a printable PDF, which matters for anyone who has to attach evidence to a ticket.

III. The parsers.

The agent ingests six formats. Each parser is small, well tested, and lives behind a uniform internal interface.

Sysmon XML, including event IDs 1, 3, 7, 10, 11, 22
Windows Event Logs in raw EVTX
Firewall logs from standard syslog formats
.eml phishing emails with header and attachment parsing
PCAP captures with conversation extraction
Raw text logs for everything else

The parser stage produces a normalized event stream. Every downstream component reads the same shape, regardless of source format. Adding a new source is one parser module. The rules, enrichment, and verdict layers do not change.
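
A minimal sketch of what that uniform interface could look like. The names here (`NormalizedEvent`, `Parser`, `RawTextParser`, `PARSERS`) and the field layout are assumptions made for illustration, not the project's actual types.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Iterable, Iterator, Protocol


@dataclass
class NormalizedEvent:
    """Hypothetical common shape that every parser emits."""
    timestamp: datetime
    source: str        # e.g. "sysmon", "firewall", "raw"
    host: str
    event_type: str    # e.g. "process_create", "log_line"
    fields: dict = field(default_factory=dict)


class Parser(Protocol):
    """Uniform interface: raw lines in, normalized events out."""
    def parse(self, lines: Iterable[str]) -> Iterator[NormalizedEvent]: ...


class RawTextParser:
    """Fallback parser: one event per non-empty line."""

    def __init__(self, host: str = "unknown") -> None:
        self.host = host

    def parse(self, lines: Iterable[str]) -> Iterator[NormalizedEvent]:
        for line in lines:
            text = line.strip()
            if not text:
                continue  # skip blank lines
            yield NormalizedEvent(
                timestamp=datetime.now(timezone.utc),
                source="raw",
                host=self.host,
                event_type="log_line",
                fields={"message": text},
            )


# Registry keyed by source format; a new source is one more entry.
PARSERS: dict[str, Parser] = {"raw": RawTextParser()}

events = list(PARSERS["raw"].parse(["failed login for admin", "", "  session opened  "]))
```

Because downstream code only ever sees `NormalizedEvent`, a new format touches the registry and nothing else.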

IV. Enrichment.

Every IOC pulled out of the parsed events fans out to seven threat intelligence APIs concurrently: VirusTotal, AbuseIPDB, Shodan, URLScan, ThreatFox, MalwareBazaar, and AlienVault OTX. Responses are aggregated into a per-IOC profile with a consensus score.
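
The concurrent fan-out can be sketched with `asyncio.gather`. The source clients below are stubs that sleep instead of calling a real API, and the function names, the 0 to 1 score range, and the timeout are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical source signature: takes an IOC string, returns a score in [0, 1].
Source = Callable[[str], Awaitable[float]]


def make_stub(score: float, delay: float = 0.01) -> Source:
    """Stand-in for a real API client; sleeps to simulate network latency."""
    async def lookup(ioc: str) -> float:
        await asyncio.sleep(delay)
        return score
    return lookup


async def failing_source(ioc: str) -> float:
    """Stub that simulates an upstream failure."""
    raise TimeoutError("simulated upstream timeout")


async def enrich(ioc: str, sources: dict, timeout: float = 5.0) -> dict:
    """Query every source concurrently; a failed source yields None instead of aborting."""
    async def guarded(fn: Source) -> Optional[float]:
        try:
            return await asyncio.wait_for(fn(ioc), timeout=timeout)
        except Exception:
            return None  # one bad source must not sink the whole profile

    results = await asyncio.gather(*(guarded(fn) for fn in sources.values()))
    return dict(zip(sources.keys(), results))


SOURCES = {
    "virustotal": make_stub(0.9),
    "abuseipdb": make_stub(0.7),
    "shodan": failing_source,
}

profile = asyncio.run(enrich("203.0.113.7", SOURCES))
```

Wrapping each call in its own guard means total latency tracks the slowest source rather than the sum, and a single outage degrades the profile instead of failing the case.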

The enrichment layer reuses the same scoring discipline as IOC-Enrich. Sources are weighted by historical reliability. The aggregate maps into five tiers from CRITICAL to CLEAN. An indicator with strong multi-source confirmation lights up the case immediately. An indicator with one stale reputation note quietly slots into the evidence pile.
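
A sketch of reliability-weighted scoring under stated assumptions. The weights and tier cut-offs are invented for illustration, and only CRITICAL and CLEAN are named in the text; the three middle tier labels are placeholders.

```python
# Hypothetical reliability weights; the real values are not given in the article.
WEIGHTS = {"virustotal": 1.0, "abuseipdb": 0.8, "otx": 0.5}

# Hypothetical cut-offs, highest first. HIGH, MEDIUM and LOW are placeholder names.
TIERS = [(0.8, "CRITICAL"), (0.6, "HIGH"), (0.4, "MEDIUM"), (0.2, "LOW"), (0.0, "CLEAN")]


def consensus(scores: dict) -> float:
    """Reliability-weighted mean of per-source scores in [0, 1].

    Sources that returned None, or that have no configured weight, are skipped.
    """
    pairs = [
        (WEIGHTS[name], score)
        for name, score in scores.items()
        if score is not None and name in WEIGHTS
    ]
    total_weight = sum(w for w, _ in pairs)
    if total_weight == 0:
        return 0.0  # nothing usable came back
    return sum(w * s for w, s in pairs) / total_weight


def tier(score: float) -> str:
    """Map an aggregate score onto the first tier whose floor it reaches."""
    for floor, label in TIERS:
        if score >= floor:
            return label
    return "CLEAN"


strong = {"virustotal": 0.9, "abuseipdb": 0.9, "otx": 0.8}
stale = {"virustotal": None, "abuseipdb": None, "otx": 0.3}

strong_tier = tier(consensus(strong))
stale_tier = tier(consensus(stale))
```

One design caveat: a plain weighted mean that skips missing sources lets a single confident source dominate. A stricter variant would also require a minimum number of agreeing sources before granting the top tier.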

V. Correlation against history.

Triage in isolation is brittle. The same indicator hitting you for the fifth time this week is a different signal than an indicator you have never seen before.

The agent maintains a local SQLite history of every case it has handled, indexed by indicator, host, and case metadata. When a new case lands, the agent checks the history. Previously flagged indicators, previously compromised hosts, and previously confirmed false positives all show up in the evidence panel for the human reviewer and influence the verdict weighting.
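
The history lookup might look roughly like this with the standard `sqlite3` module. The table and column names are assumptions, and the sketch uses an in-memory database where the real agent would use a file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path in a real deployment
conn.executescript("""
CREATE TABLE cases (
    id        INTEGER PRIMARY KEY,
    opened_at TEXT NOT NULL,   -- ISO 8601, so string order is time order
    host      TEXT NOT NULL,
    verdict   TEXT NOT NULL
);
CREATE TABLE case_indicators (
    case_id   INTEGER NOT NULL REFERENCES cases(id),
    indicator TEXT NOT NULL
);
CREATE INDEX idx_indicator ON case_indicators(indicator);
CREATE INDEX idx_host ON cases(host);
""")


def record_case(opened_at, host, verdict, indicators):
    """Store a finished case and the indicators seen in it."""
    cur = conn.execute(
        "INSERT INTO cases (opened_at, host, verdict) VALUES (?, ?, ?)",
        (opened_at, host, verdict),
    )
    case_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO case_indicators (case_id, indicator) VALUES (?, ?)",
        [(case_id, ioc) for ioc in indicators],
    )
    conn.commit()
    return case_id


def prior_sightings(indicator):
    """Return (case id, opened_at, host, verdict) for every earlier case with this indicator."""
    rows = conn.execute(
        "SELECT c.id, c.opened_at, c.host, c.verdict "
        "FROM cases c JOIN case_indicators ci ON ci.case_id = c.id "
        "WHERE ci.indicator = ? ORDER BY c.opened_at",
        (indicator,),
    )
    return rows.fetchall()


record_case("2025-01-06T03:12:00Z", "ws-014", "true_positive", ["203.0.113.7", "evil.example"])
record_case("2025-01-08T02:40:00Z", "ws-022", "false_positive", ["203.0.113.7"])
sightings = prior_sightings("203.0.113.7")
```

The same query shape, keyed on `host` instead of `indicator`, would cover the previously compromised host check.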

VI. MITRE ATT&CK mapping.

Indicators are necessary. Behaviors are sufficient. The agent ships with an 80-technique subset of the MITRE ATT&CK matrix, focused on the techniques that actually land in production attacks. Behavior detection runs through 30-plus rules over the normalized event stream.

Each rule expresses one observable pattern. Credential dumping signatures over LSASS access events. Persistence patterns over registry writes and scheduled task creation. Lateral movement over remote process creation and SMB authentication. Each match produces an attestation that associates a specific event sequence with a specific ATT&CK technique.
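
One such rule, sketched under assumptions. Events are plain dicts here for brevity, and `Attestation` and `lsass_access_rule` are hypothetical names. The Sysmon field names (`TargetImage`, `GrantedAccess`) and the technique ID T1003.001 (OS Credential Dumping: LSASS Memory) are real.

```python
from dataclasses import dataclass

PROCESS_VM_READ = 0x0010  # Windows access right needed to read another process's memory


@dataclass
class Attestation:
    """Hypothetical record tying an event sequence to one ATT&CK technique."""
    rule: str
    technique_id: str
    evidence: list


def lsass_access_rule(events):
    """Flag Sysmon ProcessAccess events (ID 10) that open lsass.exe with memory-read rights."""
    hits = [
        e for e in events
        if e.get("event_id") == 10
        and e.get("TargetImage", "").lower().endswith("\\lsass.exe")
        and int(e.get("GrantedAccess", "0x0"), 16) & PROCESS_VM_READ
    ]
    if not hits:
        return None
    return Attestation(rule="lsass_memory_read", technique_id="T1003.001", evidence=hits)


sample = [
    {"event_id": 1, "Image": "C:\\Windows\\System32\\cmd.exe"},
    {"event_id": 10, "SourceImage": "C:\\Temp\\tool.exe",
     "TargetImage": "C:\\Windows\\System32\\lsass.exe", "GrantedAccess": "0x1010"},
    {"event_id": 10, "SourceImage": "C:\\Windows\\System32\\svchost.exe",
     "TargetImage": "C:\\Windows\\System32\\lsass.exe", "GrantedAccess": "0x1000"},
]

attestation = lsass_access_rule(sample)
```

A production rule would also need an allowlist for legitimate readers such as endpoint security agents. That tuning is omitted here.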

The technique map is the part of the report a security engineer cares about. It is also the part that survives being read in a hurry. If three rules light up under execution and privilege escalation, the case is not a tuning issue.

VII. The verdict.

The verdict synthesizer combines IOC severity, behavioral rule matches, historical context, and case metadata into a weighted score with a reasoning chain. The reasoning chain is not a black box. It is a sequence of explicit findings, each linked to the evidence that produced it.

The verdict is auditable by construction. Every line of reasoning ties back to an event, an indicator, or a rule match.

The output reads as a small structured document. Conclusion. Confidence. Evidence list with timestamps. ATT&CK matrix with technique IDs. Indicator enrichments with source breakdowns. Recommended next steps. The analyst reads it. The case either resolves or escalates. The agent moves to the next case.
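
A toy version of the synthesizer, to make the "auditable by construction" point concrete. The weights, the escalation threshold, and the names `Finding`, `Verdict`, and `synthesize` are assumptions. The structural point is that every finding carries a reference to the evidence behind it.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    statement: str      # human-readable line of reasoning
    evidence_ref: str   # id of the event, indicator, or rule match behind it
    weight: float       # signed contribution to the score (hypothetical scale)


@dataclass
class Verdict:
    conclusion: str
    confidence: float
    reasoning: list = field(default_factory=list)


def synthesize(findings, escalate_at=0.6):
    """Sum signed weights, clamp to [0, 1], and keep every finding as the reasoning chain."""
    score = max(0.0, min(1.0, sum(f.weight for f in findings)))
    conclusion = "escalate" if score >= escalate_at else "resolve"
    return Verdict(conclusion=conclusion, confidence=round(score, 2), reasoning=list(findings))


findings = [
    Finding("IOC 203.0.113.7 scored CRITICAL across sources", "ioc:203.0.113.7", 0.45),
    Finding("LSASS memory read matched T1003.001", "rule:lsass_memory_read", 0.35),
    Finding("Host had a confirmed false positive last week", "case:17", -0.10),
]

verdict = synthesize(findings)
```

Negative weights let historical context, such as a confirmed false positive, pull a score down without erasing the finding from the report.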

VIII. The dashboard.

The React frontend is the part the analyst actually looks at. It renders the case queue, the live triage stream, the evidence panels, and the historical view over the same WebSocket connection that the backend writes to. Verdicts update in real time as enrichment results come back. The user can drill into any indicator, any rule match, any historical case, without losing the active context.

The dashboard is built with Vite and Tailwind. The backend is containerized with Docker. The agent ships as a self-hostable stack. The dependencies are explicit. There is no cloud requirement except for the third-party intel APIs, and those are pluggable.

IX. Honest limits.

The agent is a triage layer, not a detection layer. It depends on the source telemetry being correct. If Sysmon is not deployed, the agent will not see credential dumping. If the firewall is logging into a black hole, the agent will not see lateral movement.

Rules cover known techniques. Novel attack chains that do not match any rule will land in the agent’s reports as unclassified evidence, which is the right behavior. The agent flags the unknown. It does not hide it.

The agent does not write its own rules. The 30-plus rules shipped are hand-authored against the ATT&CK subset. Extending coverage is a manual exercise, and intentionally so. Auto-generated detection logic is a much harder research problem and not the one this project is solving.
