I've inherited four detection programs in the last eight years. All four had the same problem on day one. Detection engineering wasn't a function, it was a person. One or two rules-savvy analysts inside the SOC, writing Sigma between triage shifts, maintaining a SIEM project nobody else touched. That's a staffing decision, not a function.
It's why those programs had stalled at the same maturity level for two to three years while the cloud accounts, SaaS apps, and identity providers kept multiplying around them.
Detection engineering should run as a systematic practice for designing, testing, deploying, and maintaining detection logic across the environment. It needs a backlog, a lifecycle, acceptance criteria for what ships, and formal interfaces with the teams that feed it and consume its output.
The teams I've worked with that actually ship coverage have a dedicated detection engineering function. The ones still operating through informal arrangements break the moment the key person takes PTO or leaves.
This piece walks through what a real detection engineering function owns, how to organize it, the lifecycle it runs, the metrics that prove the work is happening, and where to start if you're rebuilding from Level 0.
In brief:
- Detection engineering is a systematic function with a backlog, lifecycle, acceptance criteria, and formal cross-team interfaces. Treating it as "a person who writes rules" caps the program at the maturity level of that one contributor.
- A common mature model places detection engineering inside SecOps but separate from SOC triage. That separation of building from operating is the structural milestone that makes everything else possible.
- One 2024 academic analysis of commercial EDR detection found ATT&CK technique coverage of 48-55% alongside near-complete tactic coverage at 13 of 14 tactics. If you're reporting at the tactic level to the board, you're overstating actual detection capability by a wide margin.
- The single highest-leverage structural change for most teams is formalizing the feedback loop from incident response back to the detection backlog. Without it, every successful attacker technique stays undetected until someone decides to fix it informally.
What detection engineering actually is
Detection engineering gets described as "writing rules" or "tuning the SIEM." That's a tactic, not a function. Detection engineering, the way I run it, is the discipline of designing, testing, deploying, maintaining, and retiring detection logic so the SOC has measurable, durable coverage of the threats it actually cares about.
The unit of work is a coverage decision, not a rule. What ships is a detection portfolio with version history, tests, and a retirement record, rather than a folder of YAML.
Many programs that call themselves detection engineering are rule-writing teams with the wrong job title. A rule-writing team can ship coverage but can't sustain it. The lifecycle is what makes the function durable. Peer review, version control, regression tests, and retirement records all survive the rule's original author.
A real detection engineering function owns more than rule writing
A mature detection engineering function owns four areas:
- Telemetry strategy: deciding what to collect and validating that collection is actually happening.
- Use-case backlog management: prioritizing what to detect next based on structured inputs, not whoever was loudest in Slack.
- Detection-as-Code pipelines: version control, automated testing, CI/CD deployment across the SIEM, XDR, and EDR stack.
- Continuous tuning: tracking false positive rates per rule and retiring detections that no longer earn their slot.
These aren't SOC triage, incident response, or threat intel production. Conflating them is how detection work gets perpetually deprioritized. Upstream, the function takes structured input from threat intel, red and purple team exercises, ATT&CK heatmaps, and incident response postmortems.
Downstream, the function produces detections ready for operational use, investigation guidance attached to alerts, contextual enrichment that improves triage, and coverage maps reported at the MITRE ATT&CK technique level. CTI handoffs need the metadata necessary to generate a detection, not a spreadsheet of IOCs. SpecterOps makes that case well. Without that structured handoff, intelligence stays in reports and never becomes coverage.
Org design matters more than one smart person
The operating model I trust places detection engineering inside SecOps but separate from SOC triage. Separating the building function from the operating function is the structural change that makes backlogs, sprint cadences, and acceptance criteria possible. Dennis Chow at UKG calls this a fusion center model, sitting upstream of SOC and IR and reporting within one or two levels of the CISO.
When detection engineering runs as a shared service instead of a dedicated team, the function needs explicit ownership, requirements management, and ongoing operational support to survive. Without those conditions, shared-service detection engineering devolves into a ticket queue with no accountability.
The teams I've run use short sprint cadences to create a planning horizon, and we enforce common acceptance criteria for a "shippable" detection: peer review, documented true positive testing, and enrichment context attached before release.
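Those acceptance criteria can be enforced as a pre-release gate rather than a convention. The following is a minimal sketch, not a description of any particular tool: the `DetectionCandidate` record, its field names, and the `unmet_criteria` helper are all hypothetical, chosen only to mirror the three criteria named above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DetectionCandidate:
    """Hypothetical record for a detection waiting on release."""
    rule_id: str
    peer_reviewer: Optional[str] = None        # who approved the change
    true_positive_test: Optional[str] = None   # note or link documenting the TP test
    enrichment_fields: List[str] = field(default_factory=list)  # context attached to alerts


def unmet_criteria(candidate: DetectionCandidate) -> List[str]:
    """Return the acceptance criteria this candidate still fails."""
    missing = []
    if not candidate.peer_reviewer:
        missing.append("peer review")
    if not candidate.true_positive_test:
        missing.append("documented true positive test")
    if not candidate.enrichment_fields:
        missing.append("enrichment context")
    return missing


def is_shippable(candidate: DetectionCandidate) -> bool:
    """A detection ships only when every criterion is met."""
    return not unmet_criteria(candidate)
```

Returning the list of unmet criteria, instead of a bare pass or fail, gives the author something actionable when a CI job blocks the release.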
Securelist's writeup on detection engineering programs notes that when SOC engineers handle deployment instead of detection engineers, deployment errors creep in because of differences in the data models the two teams use. Mature implementations keep detection engineering on the hook for the full lifecycle through deployment, with log onboarding as the one explicit exception.
The lifecycle runs from hypothesis to retirement
The lifecycle my team runs goes from hypothesis through design, testing in lab and production, deployment, monitoring, tuning, and retirement. Each phase has distinct inputs and gates.
1. Hypothesis and design
Hypotheses come from threat intelligence, ATT&CK coverage gaps surfaced by tools like ATT&CK Navigator, and purple team exercise results. Before anyone writes a line of detection logic, design starts by verifying that the necessary log sources are being collected, parsed, and normalized.
Brandon Lyons frames this as the detection engineering baseline: establish a statistical baseline for normal behavior before rule writing begins, and document why the chosen thresholds make sense for the specific environment.
2. Testing in three layers
Testing runs in three layers. Unit tests validate rule logic against sample log events, both malicious and benign. Integration tests use adversary emulation, with Atomic Red Team for the common cases and MITRE Caldera for the more complex scenarios, to generate real telemetry and confirm the rule fires in a lab environment that mirrors production.
Regression tests, like the Scythe sigma-regression-testing pipeline, run mapped emulations to verify that modified rules still detect correctly after changes.
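The unit-test layer is the easiest one to picture in code. This is a toy sketch under stated assumptions: the rule is written as a plain Python predicate rather than Sigma, and the field names (`Image`, `CommandLine`) and sample events are illustrative, not drawn from a real log source. The point is the shape of the test: every malicious sample must match and every benign sample must not.

```python
def suspicious_encoded_powershell(event: dict) -> bool:
    """Toy detection: PowerShell launched with an encoded-command flag."""
    image = event.get("Image", "").lower()
    command_line = event.get("CommandLine", "").lower()
    return image.endswith("\\powershell.exe") and " -enc" in command_line


# Hand-built sample events (assumed shapes, for illustration only).
MALICIOUS_SAMPLES = [
    {"Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
     "CommandLine": "powershell.exe -NoP -EncodedCommand SQBFAFgA"},
]
BENIGN_SAMPLES = [
    {"Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
     "CommandLine": "powershell.exe -File C:\\scripts\\backup.ps1"},
    {"Image": "C:\\Windows\\System32\\cmd.exe",
     "CommandLine": "cmd.exe /c dir"},
]


def run_unit_tests(rule, malicious, benign) -> list:
    """Return a list of failure messages; an empty list means the rule passed."""
    failures = []
    for i, event in enumerate(malicious):
        if not rule(event):
            failures.append(f"missed malicious sample {i}")
    for i, event in enumerate(benign):
        if rule(event):
            failures.append(f"false positive on benign sample {i}")
    return failures
```

Keeping the same sample sets under version control alongside the rule is what turns this into a regression test: rerun them after every modification and fail the build on any non-empty result.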
3. Deployment and retirement
Deployment runs through CI/CD, with Palantir's ADS Framework setting the review checkpoint model. Peer review is mandatory, true positive testing is required before production promotion, and each rule gets a tracking ID for governance. Retiring rules is the part most teams skip, and it's the part that makes the rest of the work legible.
I run quarterly content reviews to identify rules targeting decommissioned systems, rules superseded by better detections, and rules that haven't fired in defined periods. A silently deleted rule looks identical to a rule that never worked, so I version every retirement with a change log and an explicit deprecation note.
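A quarterly review like this can start from a simple query over the rule inventory. The sketch below is an assumption-laden illustration: the `DeployedRule` fields, the rule IDs, and the 180-day quiet window are hypothetical defaults, not values from any framework. It only flags candidates for human review and never deletes anything.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class DeployedRule:
    """Hypothetical inventory entry for a production detection."""
    rule_id: str
    target_system: str
    last_fired: Optional[date]           # None means the rule has never fired
    superseded_by: Optional[str] = None  # ID of the detection that replaced it


def retirement_candidates(rules, decommissioned, today, quiet_days=180):
    """Return {rule_id: reason} for rules worth reviewing for retirement."""
    cutoff = today - timedelta(days=quiet_days)
    flagged = {}
    for rule in rules:
        if rule.target_system in decommissioned:
            flagged[rule.rule_id] = "targets decommissioned system"
        elif rule.superseded_by:
            flagged[rule.rule_id] = f"superseded by {rule.superseded_by}"
        elif rule.last_fired is None or rule.last_fired < cutoff:
            flagged[rule.rule_id] = f"no alerts in {quiet_days} days"
    return flagged
```

The recorded reason is what feeds the deprecation note. A rule flagged for silence still needs a person to decide whether it is obsolete or simply watching for something rare.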
Detection-as-Code makes the function reproducible
Version control, code review, and automated testing are what distinguish a function from a single person's institutional knowledge. My team writes detections in Sigma rule format and converts them to SPL, KQL, or other platform-specific languages during the build process.
FPT Software's pipeline shows the same pattern: Sigma rules plus Sigma CLI and pySigma, wrapped in CI/CD to manage SIEM rule deployment across multiple environments, so you ship one rule across two SIEMs without rewriting either side.
Where AI fits in detection engineering today
AI shows up in detection workflows as a copilot, not a replacement. The SANS 2025 Detection and Response Survey shows mixed confidence in AI/ML tools, with some respondents rating them highly effective and others rating them not effective. In my workflow, the role is generating candidate rules for human review, suggesting enrichment fields based on the log schema, and helping prioritize backlog items against observed activity.
Framing AI as "writes all your detections" misrepresents both the state of the technology and the judgment required to ship a detection that performs well in a specific environment.
Metrics prove the function is shipping
The KPIs my team tracks: ATT&CK technique coverage percentage, false positive rate per detection, time from TTP identification to deployed detection, and the percentage of alerts shipped with required investigation context attached.
A 2024 academic analysis of commercial EDR detection from Gang Wang's research group at Illinois found technique-level coverage ranging from 48% to 55%, while tactic-level coverage was near-complete at 13 of 14 tactics. If you're reporting ATT&CK coverage at the tactic level to your board, you're overstating actual detection capability by a wide margin.
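The gap between the two numbers falls out of how they are counted: a tactic is "covered" as soon as one of its techniques has a detection. Here is a small worked sketch, assuming a toy three-tactic matrix; the technique IDs are a tiny illustrative subset of ATT&CK, and the detected set is invented for the example.

```python
def coverage_report(matrix: dict, detected: set) -> dict:
    """Compute tactic-level and technique-level coverage percentages.

    matrix maps each tactic name to the set of technique IDs under it.
    A technique may appear under more than one tactic; it is counted once.
    """
    if not matrix:
        raise ValueError("matrix must contain at least one tactic")
    all_techniques = set().union(*matrix.values())
    covered_techniques = all_techniques & detected
    # A tactic counts as covered if any one of its techniques is detected.
    covered_tactics = [t for t, techs in matrix.items() if techs & detected]
    return {
        "tactic_pct": 100 * len(covered_tactics) / len(matrix),
        "technique_pct": 100 * len(covered_techniques) / len(all_techniques),
    }


# Toy subset for illustration only, not the real matrix.
TOY_MATRIX = {
    "initial-access": {"T1566", "T1190", "T1078"},
    "execution": {"T1059", "T1204", "T1053"},
    "persistence": {"T1547", "T1136", "T1053", "T1078"},
}
DETECTED = {"T1566", "T1059", "T1547"}

report = coverage_report(TOY_MATRIX, DETECTED)
```

With one detection per tactic, the toy program reports every tactic covered while detecting three of eight distinct techniques. The same arithmetic is why a tactic-level slide can look complete while most techniques go unwatched.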
On the teams I've rebuilt, the full cycle from identifying the need for a detection to deploying it in production has run anywhere from a week on a tight team to over a month on a stalled one. Measure where your own team sits in that range first; that number is your baseline for measuring improvement.
Business-facing metrics translate detection quality into outcomes that the CISO can present. Reduced analyst triage time per alert (driven by better enrichment context, not fewer alerts through suppression), reduced false positive rates across the detection portfolio, and detection health percentage are the three I lead with.
CardinalOps draws the right line: a rule can count toward coverage percentage while its underlying log source has stopped ingesting data. Coverage counts whether the rule exists; health verifies the rule is actually running, with parsers intact and logs flowing.
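The distinction is easy to make concrete. The sketch below assumes a hypothetical `RuleStatus` record with two health signals (log source ingesting, parser intact); real health checks would pull these from the SIEM, and the metric names are my own labels rather than an industry standard.

```python
from dataclasses import dataclass


@dataclass
class RuleStatus:
    """Hypothetical per-rule status snapshot."""
    rule_id: str
    technique: str
    log_source_ingesting: bool
    parser_ok: bool

    @property
    def healthy(self) -> bool:
        return self.log_source_ingesting and self.parser_ok


def coverage_and_health(rules, total_techniques: int) -> dict:
    """Compare coverage on paper with coverage that is actually running."""
    if not rules or total_techniques <= 0:
        raise ValueError("need at least one rule and a positive technique count")
    covered = {r.technique for r in rules}
    healthy_covered = {r.technique for r in rules if r.healthy}
    return {
        # Counts a technique if any mapped rule exists.
        "coverage_pct": 100 * len(covered) / total_techniques,
        # Counts a technique only if a mapped rule is actually operational.
        "healthy_coverage_pct": 100 * len(healthy_covered) / total_techniques,
        # Share of deployed rules that are operational.
        "detection_health_pct": 100 * sum(r.healthy for r in rules) / len(rules),
    }
```

Reporting the first two numbers side by side makes silent failures visible: any gap between them is coverage that exists on paper only.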
The anti-patterns are predictable
Several failure modes recur across every program I've seen:
- Rule bloat: detection technical debt accumulates as environments evolve and rule dependencies break.
- Silent failure: detections appear active but have stopped working due to schema changes, broken parsers, or ingestion gaps.
- No feedback loop from investigations: IR lessons learned get documented in reports that have no formal connection to the detection backlog.
- Unmanaged backlogs: engineers self-select work based on personal interest or recency bias instead of structured prioritization.
- Over-reliance on vendor-provided content without customization for the environment.
- The one-smart-person bottleneck: the Detection Engineering Maturity Matrix characterizes detection quality as depending "greatly on the understanding of the individual performing the work." That's an organizational structure problem, not a personnel quality problem.
Where to start depends on where you are
If you're at Level 0, with no process, no backlog, and a single contributor, start by inventorying every deployed detection rule with its last modified date and observed false positive rate. Create a minimal backlog with required fields (source, technique, priority score, data source requirements, owner, status).
Those two artifacts, the inventory and the backlog, are the structural foundation everything else builds on. Then write a minimum viable charter that defines scope, intake process, prioritization criteria, and production approval gates.
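A minimal backlog does not need a product behind it; a list of records with enforced required fields is enough to begin. This sketch assumes plain dictionaries and hypothetical field names that follow the list above (source, technique, priority score, data source requirements, owner, status). The "open" status value and the higher-is-more-urgent scoring are assumptions of the example.

```python
REQUIRED_FIELDS = (
    "source", "technique", "priority_score", "data_sources", "owner", "status",
)


def missing_fields(item: dict) -> list:
    """Return required fields that are absent or empty on a backlog item."""
    return [f for f in REQUIRED_FIELDS if item.get(f) in (None, "", [])]


def next_up(backlog: list) -> list:
    """Open, fully specified items ordered by priority score, highest first."""
    ready = [
        item for item in backlog
        if not missing_fields(item) and item["status"] == "open"
    ]
    return sorted(ready, key=lambda item: item["priority_score"], reverse=True)
```

Items with missing fields are excluded from `next_up` instead of being sorted to the bottom, so incomplete intake shows up as a gap someone has to fix rather than as quiet backlog clutter.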
If you're at Level 1, with some process, reactive maintenance, and no cross-functional input, the highest-return move is formalizing the IR-to-detection feedback mechanism. After every confirmed incident, document what detection gap enabled the attacker and route that directly to the backlog as a prioritized item.
This is the single highest-leverage structural change for most teams. Without it, every successful attacker technique stays undetected until someone decides to fix it informally.
If you're at Level 2, with a defined process, some metrics, and a scaling backlog, implement detection reliability monitoring to catch silent failures. Formalize end-of-life processes where retired rules get versioned change logs and explicit deprecation criteria rather than quiet deletion.
The detection programs I've rebuilt that lasted past my tenure had the structural components written down before I left: a backlog, a lifecycle, acceptance criteria, formal input channels from CTI and IR, and metrics that prove the work is happening. The ones I left without those things were back to one smart person inside six months.
Frequently asked questions about detection engineering
How many people do you need for a detection engineering function?
Team size matters less than the structural components: a backlog, a charter, acceptance criteria for what ships, and formal input channels from CTI and IR. Larger enterprises field bigger dedicated teams, but a two-person team with a managed backlog and review cadence will outperform a larger team operating ad hoc.
Should detection engineering report to the SOC or operate as a shared service?
A common mature model is a dedicated team inside SecOps, but separate from SOC triage. If the function operates as a shared service, it needs clearly defined ownership, service expectations, and sufficient resources for day-to-day detection upkeep. Without those, shared-service detection engineering devolves into an unowned ticket queue.
What's the difference between ATT&CK technique coverage and detection health?
Technique coverage counts how many ATT&CK techniques have at least one mapped detection rule. Detection health measures how many of those rules are actually operational, with log sources ingesting, fields parsing correctly, and the rule producing expected output.
EDR coverage metrics can diverge from operational detection health, particularly when log sources silently stop ingesting.
Where does AI fit in detection engineering today?
AI shows up in detection workflows as a copilot, not a replacement. The SANS 2025 Detection and Response Survey shows mixed confidence in AI/ML tools across respondents.
The practical role is generating candidate rules for human review, suggesting enrichment fields based on log schema, and flagging backlog items correlated with observed activity. Writing detections that perform well in a specific environment still requires judgment that AI doesn't have.
What's the first step to move from ad hoc detection work to a real function?
Inventory every deployed detection rule with its last modified date and observed false positive rate. Then create a minimal backlog with required fields like source, ATT&CK technique mapping, priority score, status, and owner. The inventory and the backlog are the two artifacts that everything else builds on.