In the last post, we started our examination of unpatchable vulnerabilities in Kubernetes with a look at CVE-2020-8554, which relates to a possible traffic hijack attack in multi-tenant clusters. For this post, we're moving on to CVE-2020-8561, which chains together multiple steps to provide a vector for server-side request forgery (SSRF) attacks in a Kubernetes cluster.
Before we get into the details of the vulnerability, we need to cover two topics: SSRF vulnerabilities in Kubernetes and Kubernetes profiling. Both concepts are important to understanding this CVE.
SSRF vulnerabilities in Kubernetes
SSRF vulnerabilities allow an attacker to convince a system to make network requests on the attacker’s behalf. These types of attacks have been around since at least 1998, when Rain Forest Puppy described SSRF-like attacks on IIS in Phrack issue 54. SSRF attacks gained prominence with web applications, where an attacker could probe the internal network that the server ran on and then gain access to poorly secured management interfaces and cloud metadata services.
For an SSRF attack to be significant, components must exist in different network trust zones. As we discussed in our Kubernetes security fundamentals post about network security, Kubernetes clusters have at least two or three different trust zones, depending on the cluster architecture.
There are two basic classes of SSRF attacks in Kubernetes. The first occurs when an attacker causes a control plane component, typically the API server, to make requests on their behalf. This class of attack is particularly significant in managed Kubernetes environments because it can allow access to the cloud service provider (CSP) environment. An attacker who has significant Kubernetes API permissions can then use that CSP access to reach services that they normally wouldn’t be able to reach.
The second class of SSRF attacks occurs when the attacker causes a node component, such as the kubelet, to make requests on their behalf. This class of attack can be significant when the kubelet has access to services like cloud metadata systems that might store credentials.
For more details about the various SSRF vulnerabilities in core Kubernetes, check out Kubernetes SSRF on the Container Security Site.
Kubernetes profiling
One of the default settings for all of the main Kubernetes components enables a profiling feature that is used to debug their operation. The --profiling parameter, which is turned on by default, makes API endpoints available under the /debug/pprof URL path. This functionality is provided by the Go standard library's net/http/pprof package, which can be added to any Go program to let an authorized user view internal information about the operation of the component. In the case of the Kubernetes API server, enabling profiling also exposes a debug endpoint that an attacker can use to dynamically change the server's logging level.
More background information about Kubernetes profiling is available in Taking a look at Kubernetes profiling.
CVE-2020-8561 overview and mitigation
Now that we’ve covered SSRF vulnerabilities and Kubernetes profiling, let's look at how this specific vulnerability works. This CVE arises from the combination of an SSRF vector in the Kubernetes API server and the ability to dynamically change log levels by using the profiling endpoints.
The attacker’s goal is to execute an SSRF attack on the Kubernetes API server, which lets the attacker make network requests from the API server’s privileged network position. Kubernetes has a number of vectors that can be used for SSRF; in this case, the attacker uses a validatingwebhookconfigurations object. The attacker then increases the Kubernetes API server’s log level so that the responses to their SSRF probes are written to the logs, where they can read them.
A key consideration here is whether this vulnerability is significant in your environment. SSRF vulnerabilities allow attackers to make network requests from the perspective of another system—in this case, the Kubernetes API server. So, if all of your Kubernetes nodes are present in the same network, then this attack is likely to have limited impact. However, if you segregate the control plane nodes from the worker nodes, the attack can be more significant.
Mitigating this vulnerability is particularly important for operators of managed Kubernetes services, where the control plane nodes are generally situated in the cloud provider’s network. That network should not be accessible to the cluster operator, regardless of the operator’s privileges within Kubernetes; even an attacker who gains cluster-admin permissions would normally be prevented from reaching it. Because the control plane runs in this more privileged zone, an attacker might sign up for a publicly available managed Kubernetes service and create a cluster with the sole intention of exploiting this vulnerability.
CVE-2020-8561 technical details
The user who wants to execute this attack must have valid Kubernetes credentials and elevated privileges, typically cluster-admin.
As we mentioned earlier, an attacker with access to the profiling endpoints on the kube-apiserver component can dynamically change its log level. The attack therefore starts by changing the log level to debug, which causes all responses to webhook requests to be captured in the logs.
The attacker then creates a validatingwebhookconfigurations object that they can customize as needed. These objects are normally used to integrate policy servers that validate the configuration of Kubernetes objects: the webhook configuration points to a policy server whose rules enforce requirements, such as security settings, on new pods in the cluster.
Within the object definition, the attacker can specify a URL that the API server will call whenever a resource is created that matches the rules that are set in the webhook definition. When the malicious webhook configuration is set up, the attacker can create a matching resource, have the API server make the request, and get some level of information back from the API server as an error message.
On its own, this setup enables the attacker to obtain basic port-scanning style information, as is evident in this proof of concept. (More information about how the port-scanning tool works is available in this blog post.) However, with the log level change made, the attacker will be able to see full responses to their requests.
CVE-2020-8561 involves combining this basic SSRF attack with increasing the log level of the Kubernetes API server so that the attacker can see responses to their probes, which increases the attack's impact.
If you decide to reproduce or test this vulnerability yourself, you should do so only on a dedicated testing cluster with no other workloads present because the process could disrupt the operation of the cluster. Let’s walk through the steps.
First, you make a PUT request to the Kubernetes API to increase the log level of the API server. Typically, you first run kubectl proxy to make the API available locally so that you don't have to pass credentials in the curl request, and then make the PUT request like this:
curl -X PUT http://127.0.0.1:8001/debug/flags/v -d "10"
At this point, the API server will start logging at the debug level.
Next, you need to create a validatingwebhookconfigurations object with the webhooks.clientConfig.url field set to the target host and port that you want to test. The following example sets that parameter to 127.0.0.1:1337 so that when the API server receives a matching request (CREATE or UPDATE on pod objects), that's the URL it will try to contact:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: ssrf-demo-webhook
webhooks:
- name: ssrf-webhook.ssrf-attacker.example.com
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["pods"]
  clientConfig:
    url: "https://127.0.0.1:1337"
    caBundle: ""
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 5
  failurePolicy: Ignore
Next, you create an object that matches the rules in the validatingwebhookconfigurations definition. In this case, that object is a new pod.
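Any pod creation matches the CREATE rule in the webhook configuration, so a minimal pod is enough to trigger the SSRF request. A sketch of such a pod (the name and image here are illustrative; any pod will do):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssrf-trigger
spec:
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
```

Applying this manifest with kubectl apply causes the API server to call the webhook URL before admitting the pod.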
At this point, the API server will try to contact https://127.0.0.1:1337. The response to that request, if one is received, will be visible in the API server's log files. If the API server encounters an error when trying to reach that host and port, the error will appear in the pod events, enabling you to determine if the target service actually responded.
As with all of our unpatchable CVEs, there's no simple code fix for this vulnerability. However, there are some steps that you can take to stop the vulnerability from being exploited. First, you can set the --profiling flag on the API server to false to remove the ability to change the API server's log level. Second, you can configure your organization's network architecture in such a way that attackers gain no benefit from being able to make SSRF requests from the Kubernetes API server.
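On clusters where you control the API server manifest, the mitigation is a single flag. A sketch of the relevant excerpt, assuming a kubeadm-style static pod manifest (on kubeadm clusters this lives at /etc/kubernetes/manifests/kube-apiserver.yaml; the path and surrounding fields vary by distribution, and managed services may not expose this at all):

```yaml
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --profiling=false   # removes /debug/pprof and the log-level debug endpoint
    # ...other flags unchanged...
```

The kubelet picks up the change to the static pod manifest and restarts the API server with profiling disabled.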
Conclusion
This vulnerability provides some interesting insight into how Kubernetes objects can be misused to create opportunities for SSRF attacks and how enabling debugging features in production can present risks. In our next post, we'll take a look at CVE-2020-8562, which focuses on bypassing some Kubernetes security controls designed to mitigate more SSRF attacks.