In the previous post in this series, we took a look at admission control and its role in securing Kubernetes environments. Now, it's time to move on to another core Kubernetes security topic: network security.
This can be quite a complex topic, as Kubernetes networking can cover a wide range of areas that vary depending on the environment you're running, the configuration of your clusters, and how you use Kubernetes. For the purposes of this blog post, we'll keep things relatively simple by looking at a standard Kubernetes cluster with an overlay-based network, using kube-proxy for service access.
Even with all the variety possible in Kubernetes networking, one key point remains constant: in a Kubernetes cluster, all pods are able to communicate with all other pods by default. This is great for application management—it's this property, combined with service discovery, that allows applications to be deployed across large numbers of nodes while still being able to operate. However, the ability of all pods in a cluster to communicate with each other has some consequences for network security, as it means we generally start with a flat pod network that has no restrictions.
Network trust zones
When talking about different Kubernetes distributions, it's helpful to draw a distinction between managed and unmanaged options. With a managed Kubernetes distribution like AKS, EKS, or GKE, the cloud service provider manages the cluster control plane and provides various add-on services. This removes some complexity for the cluster operator but can limit flexibility when it comes to cluster configuration.
The other option is unmanaged Kubernetes distributions, where the cluster operator manages all the nodes in the cluster. These distributions can be installed either with on-premises hardware or in cloud-based virtual machines, such as EC2 instances.
From a networking perspective, there are effectively two network trust zones in an unmanaged cluster. The first is the node network, where the control plane and worker nodes reside. Typically, there will be network services here, and these nodes will have access to any cloud metadata services that they need to operate.
The other network zone is the pod network, where the workloads are located. Depending on how many applications are installed in the cluster, you might want to have multiple separate trust zones in your pod network (for example, for different teams' applications).
In managed Kubernetes, there's effectively a split in the node network. The control plane nodes are hosted by the cloud service provider, and while the API server will typically be available to cluster operators and workloads, the underlying network for the host running the API server should not be available. There are still security concerns with this setup, but in managed distributions, cluster operators simply don’t have any control over these configurations.
Introduction to CNI
Network plugins are another key piece of Kubernetes architecture when it comes to network security. Kubernetes does not provide a pod network implementation itself, so in order to have a functional cluster, operators use network plugins that are written to comply with an interface specification known as the Container Network Interface (CNI). There is a wide range of network plugins available, and the way they work can be quite varied. For some hands-on technical details on how a basic plugin is implemented, see this blog post.
For the purposes of network security, what's important to us is whether the plugin supports Kubernetes network policy enforcement and whether it provides additional extensions to network policy functionality. Some plugins, like kindnet and flannel, don't support network policies at all, whereas other popular network plugins, like Cilium and Calico, support the in-built Kubernetes network policies as well as their own extensions, which facilitate more complex network security configurations.
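If you're not sure which plugin a given cluster is running, one quick check (a rough sketch, assuming the cluster uses the standard CNI configuration directory) is to look at the CNI configuration files on a node.
# On a cluster node, installed plugins drop their configuration here
ls /etc/cni/net.d/
# e.g. 10-calico.conflist indicates Calico is in use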
Managing network access in Kubernetes
Network policies allow cluster operators to put traffic restrictions in place based on the Kubernetes entities in question or, where necessary, specific IP address ranges. Typically, we’ll use entities for intra-cluster traffic, since IP addresses within the cluster can change quite frequently, so we can't rely on them being fixed.
There are two different types of traffic restriction that we need to look at: ingress and egress. For each of these, a pod is unrestricted unless there is a network policy that applies to it. As soon as any ingress or egress policy selects a given pod, only the traffic explicitly allowed by a policy of that type is permitted.
To make this a bit clearer, let's use a simple demonstration environment (detailed in the appendix below) consisting of a three-node kind cluster with the Calico CNI deployed to enforce our network policies. Then, we'll deploy a basic server application, with two web server pods and a database pod in one Kubernetes namespace, and a client pod in another namespace. With this setup, we can test whether a workload in another namespace can connect to our deployed application.
To test connectivity to our application, we'll create a new pod in the default namespace.
kubectl run -n default -it client --image=raesene/alpine-containertools -- /bin/bash
Then, we’ll use curl to confirm we can access the pod via the service.
curl web-app.netpol-demo.svc.cluster.local
This command should respond with the default nginx web page. Once we've confirmed that the application is working, we can apply our first network policy.
# 7. Default deny all ingress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: netpol-demo
spec:
  podSelector: {}
  policyTypes:
  - Ingress
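With the policy saved to a file (the filename here is just an example), we can apply it in the usual way.
kubectl apply -f default-deny-ingress.yaml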
With this policy applied, trying the same curl command will time out, as it's no longer possible to connect from the default namespace, where our client pod is located, to the web-app deployment.
One question you might have is, "How did Calico stop that traffic?" How Kubernetes network plugins operate at a low level will differ, but in the case of Calico, we can use something like iptables-differ to inspect the iptables rules on the cluster nodes and see that it has added a new rule that refers to our network policy. There should be a rule that looks something like the one below.
+ -A cali-pi-_S8ASFUGmFkVQ9FwaawE -m comment --comment "cali:x_7bZcXsQE7OUSr9" -m comment --comment "Policy netpol-demo/knp.default.default-deny-ingress ingress"
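If you want to reproduce this on the demo cluster, one way (a rough sketch, relying on the container names kind gives the nodes of our multi-node cluster) is to dump the iptables rules on a worker node and search for the policy name.
# kind nodes are Docker containers, so we can run iptables-save inside one
docker exec multi-node-worker iptables-save | grep default-deny-ingress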
Obviously, in a real-world application we don't want to block all ingress traffic, so we'd need to add additional rules to allow other workloads in the cluster to reach it. As an example, the rule below allows pods in any namespace with a label of purpose: frontend to communicate with pods in the netpol-demo namespace that have a label of app: web on port 80/TCP.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-frontend
  namespace: netpol-demo
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          purpose: frontend
    ports:
    - protocol: TCP
      port: 80
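As a quick way to see this policy in action, labelling the default namespace, where our client pod is running, should let the earlier curl command succeed again.
kubectl label namespace default purpose=frontend
Note that the namespaceSelector in the policy matches labels on the namespace itself, not on the pods inside it, which is why the label is applied to the namespace.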
It's important to note that network policies also apply to traffic inside a given namespace, so we need to explicitly allow access in our policies. In our example application, we can add a policy that allows the web-app pods to access the db.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy
  namespace: netpol-demo
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web
    ports:
    - protocol: TCP
      port: 5432
Here, we're allowing access to pods with the label app: db from pods with the label app: web. Note that the database could be placed in a different Kubernetes namespace and a similar policy would still work, but in that case the from clause would need a namespaceSelector combined with the podSelector, since a podSelector on its own only matches pods in the policy's own namespace.
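A sketch of what that might look like, assuming the database moved to a hypothetical db-namespace and relying on the kubernetes.io/metadata.name label that Kubernetes sets on namespaces automatically:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy
  namespace: db-namespace # hypothetical namespace now holding the database
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    # namespaceSelector and podSelector in the same from entry are ANDed together
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: netpol-demo
      podSelector:
        matchLabels:
          app: web
    ports:
    - protocol: TCP
      port: 5432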
Securing the cluster network
With some idea of how network policies operate, the next question is naturally, "How do I use these to secure my network?" While the exact answer depends on how you use your cluster, there are some general principles to keep in mind.
It’s a good idea to start managing network access while you're developing or deploying your application. If you develop network policies alongside the application manifests, you can test both before they’re deployed to production, reducing the risk that a network policy causes problems once it's deployed.
Ideally, start from a "default deny" position for all network access and then work to allow what's needed. The reverse approach—allowing all access and then determining what to deny—is very difficult to manage and makes it hard to know definitively what access should be blocked.
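As a starting point, a minimal sketch of that default deny position, applied to every pod in our demo namespace (the policy name is just a placeholder), could look like this:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: netpol-demo
spec:
  podSelector: {} # an empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
  - Egress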
It's very important to consider both egress and ingress rules. Particularly if your Kubernetes cluster runs in a cloud-based environment, allowing unrestricted egress from pods could allow attackers to escalate their privileges from one compromised container to the entire cluster, or even to a cloud account supporting the cluster, by attacking the instance metadata service.
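As a sketch of an egress restriction along these lines (again, the policy name is just a placeholder), a NetworkPolicy ipBlock supports an except list, which can be used to block pod egress to the link-local metadata address used by the major cloud providers while leaving other egress open:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-metadata-egress
  namespace: netpol-demo
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32 # instance metadata service address on AWS, GCP, and Azure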
Another important consideration is the use of host networking in Kubernetes. A workload using host networking bypasses all standard Kubernetes network policies, as shown here. That’s why it's important to restrict access to host networking or to deploy additional tooling, such as Cilium's host firewall feature, which can apply restrictions at the host level.
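One built-in way to restrict host networking is Pod Security Admission, which is a separate mechanism from network policy: the baseline Pod Security Standard rejects pods that set hostNetwork: true. A minimal sketch, enforcing it on our demo namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: netpol-demo
  labels:
    # baseline rejects pods that request hostNetwork, hostPID, or hostIPC
    pod-security.kubernetes.io/enforce: baseline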
Conclusion
Network security is an important part of securing a Kubernetes environment, and one that doesn't have a strong set of defaults, meaning that cluster operators have to consider what controls they want to put in place to prevent unauthorised network access.
In the next installment of this series, we’ll delve into how Kubernetes architecture handles public key infrastructure (PKI) and where attackers might be able to abuse those facilities.
Appendix - Setting up a demonstration environment
For this demonstration, we're using kind, but we need to change some of its default configuration, as its default network plugin doesn't support network policies. First, we'll start a cluster with three nodes and the default CNI disabled.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true
nodes:
- role: control-plane
- role: worker
- role: worker
With this configuration saved as multi-node-kind.yaml, we can create the cluster with kind create cluster --name=multi-node --config=multi-node-kind.yaml. Once the cluster is up and running, we can set up the Calico CNI to provide network policy support, following their quickstart guide.
Once we have the basic cluster in place, we can deploy a sample workload. This consists of a web application based on nginx and a postgres database container.
# 1. Demo namespace
apiVersion: v1
kind: Namespace
metadata:
  name: netpol-demo
---
# 2. Example web application deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: netpol-demo
spec:
  selector:
    matchLabels:
      app: web
  replicas: 2
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
---
# 3. Web application service
apiVersion: v1
kind: Service
metadata:
  name: web-app
  namespace: netpol-demo
spec:
  selector:
    app: web
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP
---
# 4. Database secrets
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secrets
  namespace: netpol-demo
type: Opaque
data:
  postgres-password: cG9zdGdyZXNwYXNzMTIzNA== # base64 encoded 'postgrespass1234'
---
# 5. Database deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database
  namespace: netpol-demo
spec:
  selector:
    matchLabels:
      app: db
  replicas: 1
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: postgres
        image: postgres:13
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_DB
          value: "testdb"
        - name: POSTGRES_USER
          value: "postgres"
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: postgres-password
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: postgres-storage
        emptyDir: {}
---
# 6. Database service
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: netpol-demo
spec:
  selector:
    app: db
  ports:
  - protocol: TCP
    port: 5432
    targetPort: 5432
  type: ClusterIP
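With the manifest saved to a file (the filename below is just an example), the demo application can be deployed and checked with:
kubectl apply -f netpol-demo-app.yaml
kubectl get pods -n netpol-demo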