GitHub Cache Poisoning


Do you know what happens under the hood of your CI? Without deep understanding, you might be vulnerable to innovative supply chain attacks. This article describes such an attack.

Caching speeds up processes: instead of building or downloading a package again and again, the CI automatically keeps artifacts for reuse. However, caches can be poisoned. For instance, a malicious tool used in a test workflow can poison that workflow’s cache, and another workflow that later uses the same cache may be affected. If the second workflow has higher privileges, cache poisoning becomes a way to pivot an attack. In this post, we report on an experimental attack on a GitHub CI pipeline, but a similar logical vulnerability exists in other CI products.

The attack plays out as follows: 

  1. An attacker publishes a malicious tool or a GitHub Action, which is picked up by an unsuspecting workflow in GitHub.
  2. The workflow is designed to use a cache.
  3. The malicious payload modifies the cache to include malicious data.
  4. From this point on, other workflows that use this cache might be affected.

In response to our disclosure, GitHub said they don’t have a plan to harden the cache feature against this type of attack. 

We propose mitigation by signing the cache’s content hash value cryptographically and verifying the signature before use. To learn more about the attack and mitigations, keep reading.

Cache Poisoning attack

GitHub Cache

Workflow runs often reuse the same outputs or downloaded dependencies from one run to another (for example, the packages downloaded by package and dependency management tools such as Maven, Gradle, npm, and Yarn). These packages are usually kept in a local cache of downloaded dependencies.

To optimize CI runs, GitHub grants access to a cache resource that can be used across the pipeline via an API. The entries in the cache are a key-value combination, where keys are string-based, and values are files or directories one would like to cache.
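As an illustration of a string-based key, here is a minimal sketch that derives a key from the content of a dependency lockfile, so that the key changes when the dependencies change. The function name cache_key, the key layout, and the use of SHA-256 are assumptions made for this example, not GitHub’s implementation.

```python
import hashlib


def cache_key(os_name: str, lockfile: bytes, prefix: str = "go") -> str:
    """Build a string key that changes whenever the lockfile content changes.

    Hypothetical helper: the "<os>-<prefix>-<digest>" layout is an assumption.
    """
    digest = hashlib.sha256(lockfile).hexdigest()
    return f"{os_name}-{prefix}-{digest}"


# The same lockfile always maps to the same key; an updated lockfile maps to a new one.
key_before = cache_key("linux", b"go.uber.org/zap v1.0.0\n")
key_after = cache_key("linux", b"go.uber.org/zap v1.1.0\n")
```

With a key built this way, an entry stays in use for as long as the lockfile stays the same.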

Using the actions/cache GitHub Action anywhere in the CI runs two steps: one takes place during the run, at the point where the action is called, and the other takes place after the workflow completes (if the run step returned a cache-miss).

  • Run action – searches for and retrieves the cache. The search is done using the cache key, and the result is either a cache-hit (success, data found in cache) or a cache-miss. On a cache-hit, the files and directories are retrieved from the cache for active use. On a cache-miss, the desired files and directories are downloaded as if they were being requested for the first time.
  • Post workflow action – saves the cache. If the cache call in the run action returned a cache-miss, this action saves the current state of the directories we want to cache under the provided key. This action happens automatically and doesn’t need to be explicitly called.
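The two steps above can be modeled with a small in-memory sketch. It is a toy stand-in for the cache service, not the action’s real code; the names run_step, post_step, and the key "go-deps-abc" are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Toy in-memory stand-in for the cache service: key -> cached content.
CacheStore = Dict[str, bytes]


def run_step(store: CacheStore, key: str,
             download: Callable[[], bytes]) -> Tuple[bytes, bool]:
    """Run action: look the key up; on a miss, fetch the content as usual."""
    if key in store:
        return store[key], True   # cache-hit: reuse what is stored
    return download(), False      # cache-miss: download as if for the first time


def post_step(store: CacheStore, key: str, content: bytes, cache_hit: bool) -> None:
    """Post workflow action: save the current content, but only after a miss."""
    if not cache_hit:
        store[key] = content


store: CacheStore = {}
first, first_hit = run_step(store, "go-deps-abc", lambda: b"deps v1")
post_step(store, "go-deps-abc", first, first_hit)
second, second_hit = run_step(store, "go-deps-abc", lambda: b"deps v2")
```

Note that the post step saves whatever the content looks like at the end of the job, which is not necessarily what was downloaded at the start.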

GitHub Cache Permissions

Access restrictions provide cache isolation and security by creating a logical boundary between different branches (for example: a cache created for the branch Feature-A [with the base main] would not be accessible to a pull request for the branch Feature-B [with the base main]).

The cache action first searches for cache hits for the key and its restore keys in the branch containing the workflow run. If there are no hits in the current branch, the cache action searches for the key and its restore keys in the parent branch and upstream branches.

Access to a cache is scoped by branch (current and parent), meaning access is provided to all workflows across runs of said branch.

Another important note is that GitHub does not allow modifications once entries are pushed – cache entries are read-only records.
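The lookup order and the read-only rule can be sketched as follows. This is a simplified model under stated assumptions: the (branch, key) layout and the function names are hypothetical, fallback restore keys are treated as key prefixes, and the sketch returns the first prefix match it finds rather than modeling any ordering among matches.

```python
from typing import Dict, List, Optional, Tuple

# (branch, key) -> content; a toy model, not GitHub's storage layout.
Entries = Dict[Tuple[str, str], bytes]


def save(entries: Entries, branch: str, key: str, content: bytes) -> bool:
    """Store a new entry; an existing entry is read-only and is left untouched."""
    if (branch, key) in entries:
        return False
    entries[(branch, key)] = content
    return True


def restore(entries: Entries, branches: List[str], key: str,
            restore_keys: List[str]) -> Optional[bytes]:
    """Search the current branch first, then its parents, in the order given.

    Within a branch, try the exact key, then each restore key as a prefix.
    """
    for branch in branches:
        if (branch, key) in entries:
            return entries[(branch, key)]
        for prefix in restore_keys:
            for (entry_branch, entry_key), content in entries.items():
                if entry_branch == branch and entry_key.startswith(prefix):
                    return content
    return None


entries: Entries = {}
save(entries, "main", "go-abc", b"from main")
overwritten = save(entries, "main", "go-abc", b"attempted overwrite")
inherited = restore(entries, ["feature-a", "main"], "go-abc", [])
```

Here a run on feature-a falls back to the entry created on main, and the second save on main is rejected because the entry already exists.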

GitHub CI Scoping

GitHub scopes allow specifying exactly what type of access is needed for various tasks. GitHub’s CI is scoped in a variety of ways, each with its own set of values and features:

  • Virtual Machine (VM) for each job
  • Job permissions
  • Workflow scopes
  • Workflow runs
  • Git branches
  • Workflow identity tokens
  • and others

The GitHub cache scope is defined in a way that can break some of the other scope restrictions (for example, a single cache can affect multiple workflows).

The Attack

We used an example CI that included two workflows. This example shows how an attack can pivot from a low permission workflow to a high permission one.

  • Unit-test workflow: runs unit-test and code coverage tools. We assume that one of the tools is malicious or vulnerable to remote code execution. The workflow does not need to use the actions/cache GitHub Action; any workflow can access the cache.
  • Release workflow: builds and releases the application artifact. This workflow uses a cache to speed up fetching the Golang dependencies.

The unit-test workflow uses a malicious action that adds a cache entry with malicious content: it changes a Golang logging library (go.uber.org/zap@v1) so that the string ‘BAD library’ is added to the application artifact description.

Next, the release workflow uses this poisoned cache entry. As a result, the malicious code is injected into the built Golang binary and image. The cache remains poisoned until the entry key is discarded (usually triggered by dependency updates). The same poisoned cache will affect any other workflow, run, and child branch using the same cache key.
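The pivot can be reproduced in a toy simulation. This is a sketch of the logic only, assuming a single shared in-memory cache and a hypothetical run_workflow function; the tamper flag stands in for the malicious tool, and the key "go-deps" is illustrative.

```python
from typing import Dict

# Shared cache: key -> dependency directory (module path -> source text).
Cache = Dict[str, Dict[str, str]]


def run_workflow(cache: Cache, key: str, tamper: bool = False) -> Dict[str, str]:
    """Restore dependencies from the shared cache, or download and save them."""
    if key in cache:
        return dict(cache[key])  # cache-hit: the stored content is trusted as is
    deps = {"go.uber.org/zap": "original logging code"}  # cache-miss: clean download
    if tamper:
        # Hypothetical malicious tool rewrites the dependency directory in place.
        deps["go.uber.org/zap"] += " + BAD library"
    cache[key] = dict(deps)  # the post step saves the directory's current state
    return deps


shared_cache: Cache = {}
run_workflow(shared_cache, "go-deps", tamper=True)   # low-privilege unit-test workflow
release_deps = run_workflow(shared_cache, "go-deps")  # high-privilege release workflow
rebuilt_deps = run_workflow(shared_cache, "go-deps")  # a later rebuild, same key
```

The release workflow never runs the malicious tool itself, yet it builds from the tampered library for as long as the key stays the same.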

In the test we performed, we managed to inject the string ‘BAD library’ into the image description:

BAD library

This was in version 0.4.1. Next, we updated the tag and rebuilt the image several times, and observed that ‘BAD library’ remained in the description.

GitHub Disclosure

GitHub’s response to our disclosure was that the actions/cache GitHub Action behaves as expected and that they have no plans to tighten the scoping of the cache.
Although the GitHub team doesn’t consider this behavior problematic, we advise DevSecOps practitioners to be wary of this attack vector.

Mitigations

  1. Don’t use caches in release or in important workflows.
  2. Run your workflows sequentially, with the trusted workflow running first to make sure your cache is created in a trusted workflow.
  3. Vendor your dependencies – Vendoring in Go is a method of ensuring that all third-party packages used in the Go project are consistent for everyone who develops that application. That way, cached dependencies will remain valid for all branches of the project. Other languages might not support this method.
  4. Sign the cache value cryptographically and verify the signature before usage.
    Scribe mitigates such attacks by granularly signing any object or directory in a workflow such as a cache. Before the release, you can use Scribe to validate that only a cache generated by a trusted workflow was used to build the release.
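Mitigation 4 can be sketched as follows. This is a minimal illustration, not Scribe’s implementation: it uses an HMAC over a SHA-256 content hash as a stand-in for a cryptographic signature, and it assumes the secret is available only to the trusted workflow. The function names are hypothetical. A real deployment would more likely use an asymmetric signature, so that workflows which only verify never hold the signing key.

```python
import hashlib
import hmac
from typing import Dict


def content_digest(files: Dict[str, bytes]) -> bytes:
    """SHA-256 over file paths and contents, visited in a stable (sorted) order."""
    hasher = hashlib.sha256()
    for path in sorted(files):
        encoded_path = path.encode("utf-8")
        data = files[path]
        # Length prefixes keep different path/content splits from colliding.
        hasher.update(len(encoded_path).to_bytes(8, "big"))
        hasher.update(encoded_path)
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.digest()


def sign_cache(files: Dict[str, bytes], secret: bytes) -> str:
    """Run in the trusted workflow right after it creates the cache entry."""
    return hmac.new(secret, content_digest(files), hashlib.sha256).hexdigest()


def verify_cache(files: Dict[str, bytes], signature: str, secret: bytes) -> bool:
    """Run before the restored cache is used; False means do not build from it."""
    return hmac.compare_digest(sign_cache(files, secret), signature)


secret = b"example-secret-held-by-trusted-workflow"  # placeholder value
cached = {"go.uber.org/zap/logger.go": b"original logging code"}
signature = sign_cache(cached, secret)

poisoned = dict(cached)
poisoned["go.uber.org/zap/logger.go"] += b" + BAD library"
```

In this sketch, verify_cache(cached, signature, secret) accepts the untouched content and verify_cache(poisoned, signature, secret) rejects the modified one, so a release workflow could fall back to a clean download when verification fails.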

Summary 

In this post, we outlined a cache poisoning attack in CI workflows, which is hidden from the eyes of the DevSecOps team, and discussed mitigations.