A runtime security approach to detecting supply chain attacks

In September 2025, the npm ecosystem was hit by yet another supply chain attack. This time, an infostealer with worm characteristics, named Shai-Hulud after the giant sandworm of the Dune universe, compromised more than 500 packages. Because these packages could be installed directly or pulled in as an indirect dependency of a project, the impact of the attack was massive. Security researchers have identified and confirmed that one of the initial access vectors was related to a previous compromise, s1ngularity/nx, which began with a pwn request followed by the exfiltration of GitHub and npm tokens. By now, it is evident that there is a need to further strengthen how packages are both published and consumed.

This attack underscores the fact that CI/CD pipelines remain an overlooked target for credential theft and propagation, which makes them very attractive to attackers. CI/CD systems have access to secrets, including API keys, cloud credentials, and deployment tokens. Attackers can leverage these secrets to gain unauthorized access to internal networks, move laterally between systems, or tamper with software releases to impact downstream users. In addition, CI/CD security is sometimes overlooked by developers, who may inadvertently give attackers relatively easy opportunities to gain code execution in their environments. Often, this happens via a pwn request, a technique that has been used in the wild for many years (even though GitHub blogged about it back in 2021), but attackers have also used other injection techniques.

The aim of this blog post is to show how eBPF-based sensors, especially in self-hosted CI/CD infrastructure, can help teams detect and defend against these threats in greater depth than other methods, by analyzing the behavior of suspected malicious packages or known attack patterns at runtime.

Traditional tooling is helpful but not enough

Traditional protection relies on static scanning and maintaining a database to block specific, known malicious versions of packages and dependencies. However, this approach has two key limitations. First, it fails to catch new, unknown threats. Second, its heuristic-based rules often produce both false positives and false negatives, which makes them inherently bypassable by sophisticated attackers.

Another approach to protection involves the publishing mechanism itself. After multiple incidents in 2025, npm moved fast to introduce trusted publishing. This feature uses OIDC to establish a trust relationship between a specific workflow (for example, in GitHub or GitLab) and the npm registry. The OIDC provider for your CI/CD service generates a short-lived token with a precise set of claims, which the registry checks to ensure that only a trusted workflow can publish the package. This effectively eliminates the need to store long-lived credentials as secrets, which attackers often exploit. For example, stolen long-lived tokens enabled wide compromise in the Shai-Hulud campaign.
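To make the claims check more concrete, here is a minimal Python sketch of how a registry might compare token claims against a trusted publisher configuration. It is illustrative only and is not npm's actual implementation. The TrustedPublisher type, the is_trusted function, and the example repository and workflow names are hypothetical. The claim names (iss, repository, workflow_ref) follow what GitHub documents for its Actions OIDC tokens. The sketch assumes that signature, audience, and expiry validation have already happened upstream.

```python
from dataclasses import dataclass

# Issuer that GitHub Actions uses for its OIDC tokens.
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


@dataclass(frozen=True)
class TrustedPublisher:
    """Hypothetical registry-side record of which workflow may publish."""
    repository: str     # e.g. "example-org/example-pkg"
    workflow_file: str  # e.g. "release.yml"


def is_trusted(claims: dict, publisher: TrustedPublisher) -> bool:
    """Return True only if already-verified claims match the trusted workflow.

    Assumption: the token's signature, audience, and expiry were validated
    before this function is called. This sketch only compares claims.
    """
    if claims.get("iss") != GITHUB_ISSUER:
        return False
    if claims.get("repository") != publisher.repository:
        return False
    # workflow_ref looks like "owner/repo/.github/workflows/file.yml@ref".
    expected_prefix = (
        f"{publisher.repository}/.github/workflows/{publisher.workflow_file}@"
    )
    return claims.get("workflow_ref", "").startswith(expected_prefix)
```

Because the token is short-lived and bound to one workflow, there is no long-lived publishing secret left in the CI/CD environment for a payload to steal.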

However, even with these precautions in place, malicious packages could still be published in public registries and unknowingly consumed by developers or CI/CD pipelines. You can augment your defense-in-depth strategy by preparing to deal with this kind of attack from yet another point of view: securing your CI/CD workloads themselves.

Self-hosted means self-secured

Many teams use self-hosted runners in CI/CD environments for a multitude of reasons, including:

Having full control over the hardware, operating system and software tools being installed on the runner
Using already in-use compute instances, not necessarily ephemeral ones (even though this can open up some security considerations)
Having dedicated runners for resource-intensive CI/CD jobs, which can speed up the compilation time of your project

Many CI/CD services, including GitHub Actions, offer the possibility to attach your own computing instances. Sometimes, they also offer open source tooling to attach an entire Kubernetes cluster and let you autoscale your runners with an operator, such as the Kubernetes controller for GitHub Actions self-hosted runners.

This setup also makes users responsible for the security of their runners, which means it’s possible to install a runtime security tool like Datadog Workload Protection to help you secure your CI/CD pipelines.

Diving into the Shai-Hulud payload

Before understanding how to detect this threat, we have to take a closer look at its payload. There has been a lot of analysis on this, so a small recap here will suffice.

As is common in software supply chain malware, especially in npm, Shai-Hulud uses the post-install script feature of the package.json file. This ensures the malicious code is executed immediately upon the package's installation, often without the consumer’s knowledge.
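As a rough illustration of where such a script lives, the following Python sketch lists the install-time lifecycle scripts declared in a package.json. The function name and the example manifest are hypothetical. The hook names (preinstall, install, postinstall) are the npm lifecycle scripts that run automatically during installation. This only surfaces that a hook exists and says nothing about whether it is malicious.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text: str) -> dict:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


if __name__ == "__main__":
    # Hypothetical manifest, for illustration only.
    example = json.dumps({
        "name": "example-pkg",
        "scripts": {"test": "jest", "postinstall": "node bundle.js"},
    })
    for hook, command in install_time_scripts(example).items():
        print(f"{hook}: {command}")
```

npm also offers an --ignore-scripts option that skips these hooks, although some legitimate packages rely on them to build native components.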

The core of the payload focuses on credential theft and lateral movement. First, Shai-Hulud downloads and executes a legitimate open source tool, TruffleHog, and uses it to scan the host system for API keys, hardcoded secrets, and cloud credentials.

Once the malware discovers credentials, it validates their legitimacy and uses them to establish persistence and spread. The final act of data exfiltration involves transmitting all harvested sensitive data to a hardcoded webhook endpoint.

Crucially, Shai-Hulud earns its worm characteristics through a final, automated step. If the malware successfully discovers additional npm or GitHub publishing credentials, it immediately uses them to create and publish a new version of a package it has access to, placing the exact same payload in the post-install script to propagate the infection to every downstream consumer that updates or installs the infected library.

A generalized approach to detection

Datadog Workload Protection can detect these kinds of threats by using execution contexts. The goal of an execution context is to group together events that are part of the same compromise, without hardcoding exact sequences of what we might expect. Examples of execution contexts include:

An interactive shell
A Kubernetes user session
A container, or a service (i.e., a group of processes running in the same cgroup)
Any other detectable events that could indicate malware

Knowing that malware can start executing its payload by means of a simple npm install command, we might want to mark this as the start of an execution context. But what does this actually mean in practice?

The Datadog Agent allows users to write expressions for rules using Datadog’s Security Language (SECL). This language enables Datadog to collect data about the system where the Agent is running, giving Datadog the ability to filter what activity to collect. For example, rules written in SECL collect information on processes, file systems, networks, and more. The language also supports variables, which are SECL expressions that determine when a specific condition is met. Together, these components provide the foundational constructs that allow Workload Protection to define execution contexts.

As an example, the following is the Agent SECL expression used to mark the start of a specific execution context: malicious package installation.

exec.file.name in [~"node", ~"npm"] &&
(
    process.args =~ "* install *" ||
    process.args =~ "* add *" ||
    process.args =~ "* i *" ||
    process.args =~ "* in *" ||
    process.args =~ "* ins *" ||
    process.args =~ "* inst *" ||
    process.args =~ "* insta *" ||
    process.args =~ "* instal *" ||
    process.args =~ "* isnt *" ||
    process.args =~ "* isnta *" ||
    process.args =~ "* isntal *" ||
    process.args =~ "* isntall *"
) &&
not(process.args =~ "*-e *") &&
${process.correlation_key} in [
    "",
    ~"cgroup_*",
    ~"auid_*",
    ~"service_*",
    ~"service_new_cgroup_*",
    ~"interactive_shell_*",
    ~"k8s_session_*"
]

First, this expression uses the exec event, which is triggered every time a process executes a binary. Whenever node or npm is executed, the expression checks the process’s arguments to determine whether an install operation was issued, including the aliases that npm accepts for install (such as i, add, and isntall). This is the main part of the condition that lets us use this event as the start of an execution context. Next, the expression checks whether the process correlation key variable is either unset or already set to one of the listed execution contexts. If the event was generated under one of those execution contexts, the one we are about to set takes precedence, as it involves potential malware. The condition explicitly lists the execution contexts it takes precedence over, which are all the ones defined in the current ruleset.
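The logic of this expression can be approximated in a few lines of Python. This is only a sketch to help you read the rule, not how the Agent evaluates it. The function name is hypothetical, the arguments are assumed to arrive as one space-separated string, and Python's fnmatch is used as a stand-in for SECL's glob matching, which may differ in details.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the rule: `install` and the aliases npm accepts.
INSTALL_ARG_PATTERNS = [
    "* install *", "* add *", "* i *", "* in *", "* ins *", "* inst *",
    "* insta *", "* instal *", "* isnt *", "* isnta *", "* isntal *",
    "* isntall *",
]

# Correlation keys that the package-install context may override.
OVERRIDABLE_KEY_PATTERNS = [
    "", "cgroup_*", "auid_*", "service_*", "service_new_cgroup_*",
    "interactive_shell_*", "k8s_session_*",
]


def starts_package_install_context(file_name: str, args: str,
                                   correlation_key: str = "") -> bool:
    """Approximate the three parts of the rule's condition."""
    # 1. Only node or npm executions are considered.
    if file_name not in ("node", "npm"):
        return False
    # 2. The arguments must look like an install, and must not match "*-e *".
    if not any(fnmatchcase(args, p) for p in INSTALL_ARG_PATTERNS):
        return False
    if fnmatchcase(args, "*-e *"):
        return False
    # 3. The current key must be unset or belong to an overridable context.
    return any(fnmatchcase(correlation_key, p) for p in OVERRIDABLE_KEY_PATTERNS)
```

Note that a process already tagged with a package_install key does not match the third check, so a nested install does not start a new context.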

When the condition is met, the Agent evaluates the actions that were defined in the rule and sets (or resets) the process correlation key for the execution context, as seen below:

actions:
  - filter: ${process.correlation_key} != ""
    set:
      name: parent_correlation_keys
      default_value: ''
      expression: ${process.correlation_key}
      append: true
      scope: process
      inherited: true
  - set:
      name: correlation_key
      default_value: ''
      expression: '"package_install_${builtins.uuid4}"'
      scope: process
      inherited: true

This Agent rule helps us gain visibility into the details necessary to identify malicious activity. First, if a correlation key is already set, it is appended to the parent_correlation_keys variable so that the existing context is preserved. Then the correlation_key variable is set to a new value: the package_install_ prefix followed by a random UUID generated with ${builtins.uuid4}. This new key is inherited by any process under the npm install process tree. If any process in this lineage generates another security alert, its correlation key is sent to the backend, which allows Datadog to group all related events, giving you a clearer picture of the story of a compromise.
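The effect of these two actions on a process tree can be modeled with a small Python sketch. The Proc class and mark_package_install function are hypothetical stand-ins. They only mirror the semantics described above (the filter, append: true, and inherited: true) and do not reflect how the Agent stores variables internally.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Proc:
    """Hypothetical stand-in for a process carrying rule variables."""
    name: str
    parent: Optional["Proc"] = None
    correlation_key: str = ""
    parent_correlation_keys: list = field(default_factory=list)

    def spawn(self, name: str) -> "Proc":
        # Children start with a copy of both variables (inherited: true).
        return Proc(name, self, self.correlation_key,
                    list(self.parent_correlation_keys))


def mark_package_install(proc: Proc) -> None:
    """Apply the rule's two `set` actions in order."""
    # First action: runs only when a key is already set (the filter),
    # and appends rather than overwrites (append: true).
    if proc.correlation_key != "":
        proc.parent_correlation_keys.append(proc.correlation_key)
    # Second action: assign a fresh, unique package_install key.
    proc.correlation_key = f"package_install_{uuid.uuid4()}"


if __name__ == "__main__":
    shell = Proc("bash", correlation_key="interactive_shell_1")
    npm = shell.spawn("npm")
    mark_package_install(npm)
    scanner = npm.spawn("sh").spawn("trufflehog")
    # The descendant carries the same key as the npm process.
    print(scanner.correlation_key == npm.correlation_key)
```

Keeping the previous key in a separate list means an analyst can still trace a package installation back to the shell, session, or service that launched it.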

On the backend side, the detection rule looks like this:

queries:
  - query: '@process.variables.correlation_key:(package_install*) tactic:*'
    groupByFields:
      - '@cgroup.id'
      - '@process.variables.correlation_key'
    distinctFields:
      - tactic
    name: tactics_on_package_install
    hasOptionalGroupByFields: false
cases:
  - name: malicious_package_installation
    status: high
    notifications: []
    condition: tactics_on_package_install > 2

As you can see, the backend rule simply counts the distinct tactics reported by events from processes that share the same correlation key, and raises a high-severity signal when more than two distinct tactics are seen. There are a few important things to consider with this detection:

Since the Shai-Hulud malware is prone to generate alerts for several different tactics, such as credential access, discovery, lateral movement, execution, and exfiltration, we simply rely on the default out-of-the-box ruleset to generate a security signal. This makes the rule general enough to detect not only Shai-Hulud but also any kind of malware whose payload resides in a post-install script, assuming the individual Agent rules are well prepared and maintained.
This kind of detection elevates simple atomic detections, because there is a priori knowledge that this execution context reflects patterns found in common malware. For example, detecting that TruffleHog is being executed is very different from determining that it was executed in the context of an npm package installation, especially without hardcoding the parent as a node process. The same goes for contacting the GitHub API, which Shai-Hulud attempts to do in order to search for other secrets if it does not find a GitHub token.
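The grouping and counting that the backend rule performs can be sketched in Python as follows. The event dictionaries, field names, and the flagged_groups function are hypothetical simplifications of the real event schema. The threshold mirrors the rule's condition of more than two distinct tactics.

```python
from collections import defaultdict


def flagged_groups(events, threshold=2):
    """Return (cgroup_id, correlation_key) groups with > threshold tactics."""
    tactics_by_group = defaultdict(set)
    for event in events:
        key = event.get("correlation_key", "")
        tactic = event.get("tactic")
        # Mirrors the query: package_install* keys that carry a tactic.
        if key.startswith("package_install") and tactic:
            group = (event.get("cgroup_id"), key)
            # A set keeps tactics distinct, like distinctFields.
            tactics_by_group[group].add(tactic)
    return [g for g, tactics in tactics_by_group.items()
            if len(tactics) > threshold]


if __name__ == "__main__":
    # Hypothetical events, for illustration only.
    sample = [
        {"cgroup_id": "c1", "correlation_key": "package_install_a",
         "tactic": "credential-access"},
        {"cgroup_id": "c1", "correlation_key": "package_install_a",
         "tactic": "discovery"},
        {"cgroup_id": "c1", "correlation_key": "package_install_a",
         "tactic": "exfiltration"},
    ]
    print(flagged_groups(sample))
```

Counting distinct tactics rather than raw events means one noisy rule firing repeatedly inside a legitimate installation does not reach the threshold on its own.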

Simulating the Shai-Hulud payload with this detection in Datadog Workload Protection results in the Investigation Graph below:

Sample process tree of a supply-chain attack involving a malicious npm package harvesting credentials and exfiltrating them using the GitHub API

This detection demonstrates how execution contexts convert noisy, isolated alerts into coherent compromise stories. Applying this model broadly across CI/CD workloads strengthens defense in depth. By embedding runtime behavioral analytics directly in CI/CD infrastructure, Datadog Workload Protection helps teams detect attacks that static scanners might miss.