Runtime security is where cloud attacks actually get caught
I watched a credential abuse incident cross three AWS accounts last year. The attacker logged in with a stolen access key, moved through Lambda functions, and touched 19 IAM principals before anyone noticed. Our cloud-native application protection platform (CNAPP) dashboard stayed green the entire time, every policy compliant.
Prevention had nothing to flag because there was nothing to prevent. The login succeeded, the permissions were real, the API calls legitimate. The only signal was runtime behavior: a process querying the instance metadata service (IMDS), a binary that shouldn't exist running in the workload, an outbound connection no service would make.
In brief:
- Valid credential abuse accounts for 35% of cloud incidents. Prevention controls stop unauthorized access, but they can't tell a legitimate user from an attacker holding that user's credentials.
- 60% of containers live 60 seconds or less, and cloud attacks can complete in about 10 minutes. Detection frameworks built around multi-day dwell times can't close that gap without runtime detection.
- Agentless CNAPP tools can't observe running processes, syscalls, or in-memory behavior. Wiz describes agentless scanning as operating outside the workload, so runtime detection needs a sensor in the data path.
- Operating the runtime layer in production is the hard part: telemetry volume, agent lifecycle, ephemeral workloads, and alert ownership.
Runtime security requires operational commitment
That gap is a category problem, not a tuning problem. The textbook definition of runtime security, monitoring workloads during execution to catch malicious behavior, is accurate but makes it sound like a feature you toggle on. It is closer to an operating commitment.
In practice it means maintaining kernel-level eBPF sensors across fleets of ephemeral workloads, then owning the rules, telemetry, and alerts they produce. Those sensors capture process execution, syscall patterns, and outbound connections that no configuration scan sees, which is where cloud hygiene ends and detection starts.
Prevention can't catch a login
Prevention is worth keeping, but it is built for a fading threat model. Misconfiguration-based access is declining while leaked credentials rise as an initial access path, per Google Cloud Threat Horizons. Malware-free activity made up 79% of detections in 2024, leaving signature-based prevention little to act on.
Investment in one vector pushes attackers to another, and the destination is credential abuse. Criminal and nation-state actors now run entire attacks as legitimate users, manipulating whatever the compromised identity can touch, per the Microsoft Digital Defense Report. When the attacker is the user, the firewall, the WAF, the IAM policy, and MFA have already done their job.
Runtime is where the kill chain becomes behavior
If the attacker looks like a user at the control plane, the attack only shows up as behavior. The MITRE ATT&CK Cloud Matrix maps where that behavior lives, and several of the techniques that matter fire below the API layer CloudTrail records:
- T1552.005, the metadata query that hands an attacker IAM credentials, happens outside CloudTrail even though the later credential use is logged
- T1611, escape to host, fires at the syscall layer, where API logging never sees it
- T1610, deploy container, may log a pod-creation event while the malicious payload running inside after deployment leaves no control-plane trace
- T1496, resource hijacking, surfaces as workload behavior such as a cryptominer that posture tools cannot see while the container runs
The SCARLETEEL campaign, documented by Sysdig, chained exactly this: a compromised container queried IMDS for credentials, every log-dependent detection went blind at that step, and the runtime sensor saw the query first. High-severity cloud alerts rose 235% in 2024, a large share from activity posture tools miss, per Unit 42.
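To make the metadata-query signal concrete, here is a minimal sketch of the check a runtime rule performs on T1552.005. It is not any vendor's rule format: the `ConnectEvent` shape, the container and process names, and the allowlist are all assumptions I made up for illustration. The only real constant is the link-local IMDS address.

```python
from dataclasses import dataclass

# Link-local address of the AWS instance metadata service (IMDS).
IMDS_ADDRESS = "169.254.169.254"

# Hypothetical allowlist: processes this workload is expected to run against IMDS.
EXPECTED_IMDS_CLIENTS = {"cloud-init", "credential-helper"}


@dataclass(frozen=True)
class ConnectEvent:
    """One outbound connection as a runtime sensor might report it (assumed shape)."""
    container: str
    process: str
    parent: str
    dest_ip: str
    dest_port: int


def is_suspicious_imds_query(event: ConnectEvent) -> bool:
    """Flag a metadata-service connection made by a process not on the allowlist."""
    return event.dest_ip == IMDS_ADDRESS and event.process not in EXPECTED_IMDS_CLIENTS


# Illustrative events only; none of this is captured data.
events = [
    ConnectEvent("checkout-7f9c", "cloud-init", "systemd", "169.254.169.254", 80),
    ConnectEvent("checkout-7f9c", "curl", "sh", "169.254.169.254", 80),
    ConnectEvent("checkout-7f9c", "python3", "gunicorn", "10.0.4.12", 5432),
]

alerts = [e for e in events if is_suspicious_imds_query(e)]
for alert in alerts:
    print(f"T1552.005 candidate: {alert.process} (parent {alert.parent}) in {alert.container}")
```

The point of the sketch is where the input comes from. The decision needs the process name and its parent at the moment of the connection, and nothing in a control-plane log carries that.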
The runtime layer is the one most programs underbuild
Knowing runtime is where you catch the attack doesn't make it the layer teams build. In every tool selection I've run, it is the widest gap between what gets bought and what gets operated: syscall telemetry that overwhelms downstream pipelines, agent lifecycle across churning clusters, and a forensics pipeline that no product ships pre-built.
The staffing math compounds it. Running this well takes Linux internals, cloud platform engineering, and detection engineering in one team, a combination that stays scarce. Teams that can't staff it can use outside support, a managed detection and response (MDR) provider or a platform that works the alert stream, which is the case I make in my MDR evaluation.
Runtime detection needs a sensor in the data path
Whatever you buy or outsource, the architecture decides what it catches. By Wiz's own description, agentless tools run outside the workload and can't block a process or quarantine a file, so runtime detection needs a sensor in the data path, usually an eBPF program that loads without a reboot. The tools differ in approach:
- Falco watches syscalls enriched with Kubernetes metadata and fires rules at the moment of execution; it reached CNCF graduated status in February 2024
- Tetragon adds in-kernel enforcement: when a policy violation occurs, it can issue a SIGKILL before the syscall returns
- RAD Security builds behavioral fingerprints of each workload and flags deviation from a known-good baseline
- Wiz Defend pairs eBPF sensors with cloud audit logs and its security graph for correlation-based detection
The distinction a buyer must hold onto sits here. Mitiga and Exaforce lean on cloud, SaaS, and identity telemetry rather than kernel-level workload sensors. A vendor that ingests CloudTrail and calls it runtime detection is selling something different from one watching syscalls inside the container.
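The baseline approach in the list above is the easiest one to picture in code. This is a toy sketch of the idea, assuming a workload can be described by the set of (process, destination) pairs it produced during a learning window. It is not how RAD Security or any other product builds fingerprints, and the workload names and destinations are invented.

```python
from collections import defaultdict


def build_baseline(observations):
    """Collect the (process, destination) pairs seen per workload while learning."""
    baseline = defaultdict(set)
    for workload, process, destination in observations:
        baseline[workload].add((process, destination))
    return baseline


def deviations(baseline, observations):
    """Return live observations whose (process, destination) pair was never learned."""
    return [
        (workload, process, destination)
        for workload, process, destination in observations
        if (process, destination) not in baseline.get(workload, set())
    ]


# Learning window: what the workload normally does (illustrative values).
learned = build_baseline([
    ("payments-api", "gunicorn", "postgres:5432"),
    ("payments-api", "gunicorn", "redis:6379"),
])

# Live traffic: one known pair and one pair the baseline has never seen.
live = [
    ("payments-api", "gunicorn", "postgres:5432"),
    ("payments-api", "xmrig", "pool.example.net:3333"),
]

for workload, process, destination in deviations(learned, live):
    print(f"{workload}: {process} -> {destination} is outside the baseline")
```

Both inputs are workload-level observations. A product fed only by audit logs has no process name to put in that tuple, which is the distinction this section is about.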
Before I buy a runtime tool, I make it catch a live attack
A data sheet won't tell you which side of that line a product sits on. I open every demo with one scenario: a container gets exploited, the attacker queries the instance metadata service for IAM credentials, installs tooling, and moves laterally on the stolen token. Vendors who can't walk it through on a live workload get cut.
A cloud security posture management (CSPM) finding about the IMDS hop limit is not detection, and vendors who offered that got cut too. I need the process tree, the syscall that fired the rule, the Kubernetes context, and the response action available. If it can't produce those from a live workload, it isn't watching it.
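I keep that requirement as a literal checklist. The sketch below assumes a vendor alert has been exported as a dictionary. The field names are my own shorthand for the four things listed above, not any product's schema, and the sample alerts are invented.

```python
# The four pieces of evidence a runtime alert has to carry (my own field names).
REQUIRED_EVIDENCE = (
    "process_tree",
    "triggering_syscall",
    "kubernetes_context",
    "response_actions",
)


def missing_evidence(alert: dict) -> list:
    """Return the required evidence fields that are absent or empty in an alert."""
    return [field for field in REQUIRED_EVIDENCE if not alert.get(field)]


# Illustrative alert from a sensor inside the workload.
runtime_alert = {
    "process_tree": ["gunicorn", "sh", "curl"],
    "triggering_syscall": "connect",
    "kubernetes_context": {"namespace": "prod", "pod": "checkout-7f9c"},
    "response_actions": ["kill_process", "isolate_pod"],
}

# Illustrative posture finding presented as if it were detection.
posture_finding = {"finding": "IMDS hop limit greater than 1", "severity": "medium"}

print(missing_evidence(runtime_alert))    # nothing missing
print(missing_evidence(posture_finding))  # all four fields missing
```

A posture finding fails every check, which is the quickest way I know to show a vendor why it doesn't count.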
Decide who owns the alert, then fund the runtime layer
A tool that detects is only half of it, because someone has to own what it finds. When a runtime alert fires from a Kubernetes workload at 2 AM, the SOC analyst usually lacks the platform context to triage it, and the cloud team that has it isn't staffed for response. Define the escalation path and the platform-versus-SOC boundary first, because that boundary is where alert triage ownership already breaks down for identity-based attacks.
Get that boundary right, and the runtime layer is where the next dollar of cloud security budget goes. Posture is hygiene that shrinks the attack surface, while runtime detection catches the attacker who walked in with a valid key. Before I sign a PO for a tool that claims cloud detection, I make it prove it can see inside a running workload after the attacker authenticates. Everything else is posture management with a different label.
Frequently asked questions about runtime security
These are the questions I hear most from teams evaluating runtime detection for cloud workloads.
What runtime signals does CSPM miss in Kubernetes?
Runtime security detects behavior inside running workloads: process execution, syscalls, and credential theft through the instance metadata service. CSPM scans configuration state, so a cryptominer in a compliant container is invisible to it. Runtime sensors catch it because the anomaly is in the activity, not in the configuration.
When should teams deploy eBPF for container runtime detection?
Deploy it when you need detection inside containers, not checks on configuration. eBPF runs sandboxed programs in the kernel at syscall hook points. When a process makes a syscall, the program captures it, the sensor adds Kubernetes metadata, and the event is checked against rules. eBPF programs are checked by the kernel verifier at load time and deploy without a reboot.
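The capture, enrich, and match sequence can be sketched in userspace. To be clear about the assumptions: this is a plain Python simulation, not an eBPF program, and the event fields, the container-to-pod lookup table, and the two rules are all hypothetical stand-ins for what a real sensor provides.

```python
# Hypothetical lookup a sensor would populate from the container runtime and API server.
POD_METADATA = {
    "c1a2b3": {"namespace": "prod", "pod": "checkout-7f9c", "image": "checkout:1.4"},
}

SHELLS = {"sh", "bash", "zsh"}

# Each rule is a name plus a predicate over an enriched event (illustrative rules only).
RULES = [
    ("shell spawned in container",
     lambda e: e["syscall"] == "execve" and e["comm"] in SHELLS),
    ("write below /etc",
     lambda e: e["syscall"] == "openat"
     and e.get("path", "").startswith("/etc/")
     and e.get("write", False)),
]


def enrich(event: dict) -> dict:
    """Attach Kubernetes context to a raw syscall event using its container ID."""
    return {**event, **POD_METADATA.get(event["container_id"], {})}


def evaluate(event: dict) -> list:
    """Enrich one event and return (rule, namespace, pod) for every rule that matches."""
    enriched = enrich(event)
    return [
        (name, enriched.get("namespace"), enriched.get("pod"))
        for name, matches in RULES
        if matches(enriched)
    ]


shell_event = {"container_id": "c1a2b3", "syscall": "execve", "comm": "bash", "path": "/bin/bash"}
read_event = {"container_id": "c1a2b3", "syscall": "openat", "comm": "gunicorn", "path": "/app/config.yaml"}

print(evaluate(shell_event))  # one match, carrying namespace and pod
print(evaluate(read_event))   # no match
```

In a real deployment the capture step runs in the kernel and the enrichment and rule matching run in the sensor's userspace component, but the order of operations is the same.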
What are the limits of agentless CNAPP for running containers?
Agentless tools scan container images, find vulnerabilities, and assess cloud configuration. They cannot observe running processes or syscalls inside a container, and they miss in-memory behavior. Wiz's own documentation recommends pairing agentless scanning with eBPF sensors, so workload-level detection needs a sensor in the data path.
How do teams handle Kubernetes runtime alerts overnight?
Both the cloud platform team and the SOC have to be involved, with the boundary set before you deploy. Platform teams tell benign deployment artifacts from anomalous behavior, and the SOC owns investigation and response. Without a documented escalation path, runtime alerts either get ignored or turn into false-positive tickets against normal operations.
How do you investigate a container that lived 60 seconds?
You automate evidence capture at detection time. Container checkpointing through the kubelet API freezes a copy for analysis, and a Falco, Falcosidekick, and Argo pipeline can trigger it when a rule fires. None of this is on by default, so if you haven't built it before an incident, the container is gone before you can investigate.
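As a rough sketch of the trigger step, the function below turns a Falco JSON alert into the kubelet checkpoint call, which is a `POST` to `/checkpoint/{namespace}/{pod}/{container}` on the kubelet port. Several things are assumed here: that the rule's output includes the `k8s.ns.name`, `k8s.pod.name`, and `container.name` fields, that the node address is known (the one shown is made up), and that the cluster has container checkpointing enabled with a runtime that supports it. Authentication and actually sending the request are deliberately left out.

```python
from urllib.request import Request

KUBELET_PORT = 10250  # default kubelet HTTPS port


def checkpoint_request(falco_alert: dict, node_address: str) -> Request:
    """Build (but do not send) the kubelet checkpoint request for an alerting container."""
    fields = falco_alert["output_fields"]
    url = (
        f"https://{node_address}:{KUBELET_PORT}/checkpoint/"
        f"{fields['k8s.ns.name']}/{fields['k8s.pod.name']}/{fields['container.name']}"
    )
    return Request(url, method="POST")


# Illustrative alert; a real one carries only the fields its rule output references.
alert = {
    "rule": "Contact EC2 Instance Metadata Service From Container",
    "priority": "Notice",
    "output_fields": {
        "k8s.ns.name": "prod",
        "k8s.pod.name": "checkout-7f9c",
        "container.name": "checkout",
    },
}

request = checkpoint_request(alert, node_address="10.0.1.15")
print(request.get_method(), request.full_url)
```

In a working pipeline this call would be made by whatever the alert forwarder triggers, using the kubelet's client certificate authentication, and the resulting checkpoint archive would need to be copied off the node before it is recycled.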
How do runtime sensor architectures differ across cloud detection and response (CDR) tools?
Sysdig, Aqua, Falco, and RAD Security document kernel-level eBPF sensors. Wiz Defend pairs those sensors with cloud audit logs and graph-based correlation. Mitiga and Exaforce emphasize cloud, SaaS, and identity telemetry, not kernel-level workload sensors, a different model from products watching syscalls inside containers.