Kubernetes AppOps Security Part 5: Pod Security Policies (1/2) – Good Practices
+++ Important: "Pod Security Policies" are deprecated in Kubernetes version 1.21 and will be completely removed in Kubernetes version 1.25. They are succeeded by "Pod Security Standards". +++
The previous two articles in this series (Security Context Part 1, Security Context Part 2) explain why it makes sense to change the default values of certain settings in Kubernetes in order to improve security. The articles show you how these settings in Kubernetes can be optimized using the security context. This security context is generally specified for each container or pod. It can certainly make sense in practice to use a security context: In smaller teams, it is easy to determine which options should be set and then roll these options out successively to all applications. In addition, with the security context, it is easy to test whether applications still work with more restrictive security. However, there are also use cases in which it makes sense to change settings across the cluster: For example, in a new cluster you may want all future applications to start with the least privileges possible. In larger organizations or in situations where a larger group of staff have access to the cluster, this ensures that everyone adheres to the specified security guidelines. Both can be implemented in Kubernetes using Pod Security Policies (PSPs).
The following examples show how PSPs can be used. If you want to try this yourself in a defined environment, you will find complete examples with instructions in the “cloudogu/k8s-security-demos” repository on GitHub.
Overview of how PSPs work
PSPs can be more complex to use than the security context. We therefore start with a brief overview and then consider the individual points in more detail in the following sections.
- The allowed and default values are declared in the “PodSecurityPolicy” Kubernetes resource.
- Assignments between pods and PSPs are made using Role Based Access Control (RBAC). Pods are authorized to use PSPs as follows:
  - A “role” allows you to use a PSP.
  - A role is bound to service accounts via a “RoleBinding”.
  - A service account is assigned to each pod. The pod is executed with the service account’s privileges.
- PSPs are enforced using a special admission controller: Every write request to the Kubernetes API server is passed through the admission controller. It checks the pods found in the request against the PSPs assigned to them. Depending on the setting, values are overwritten or the pod is prevented from starting.
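As a rough sketch of this chain, the following manifests wire a single service account to a PSP within one namespace. The namespace `demo`, the service account `my-app`, the role and binding names and the image are assumptions made purely for illustration. The PSP name `restricted` matches Listing 1, and the `policy` API group assumes Kubernetes 1.16 or later.

```yaml
# Hypothetical service account that the pod runs as
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: demo
---
# Role that allows using the PSP named "restricted"
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: psp:restricted
  namespace: demo
rules:
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["restricted"]
    verbs: ["use"]
---
# RoleBinding that grants the role to the service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: psp:restricted:my-app
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: psp:restricted
subjects:
  - kind: ServiceAccount
    name: my-app
    namespace: demo
---
# Pod that is checked against the PSPs its service account may use
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: demo
spec:
  serviceAccountName: my-app
  containers:
    - name: app
      image: example.org/my-app:1.0 # placeholder image
```

Binding per service account like this is the most fine-grained variant. The cluster-wide binding shown later in this article trades that granularity for less maintenance effort.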
PSP Admission Controller
The admission controller that evaluates the “PodSecurityPolicy” resource must be enabled on the Kubernetes API server. With managed Kubernetes clusters, there is usually an option to enable PSPs. The admission controller is where we find the first pitfalls:
- As the name “admission controller” suggests, its job is to control access to the API server. It is important to note that pods that are already running are not checked again if the admission controller is enabled later or a PSP is changed.
- If the PSP Admission Controller is not enabled, “PodSecurityPolicy” resources in the cluster are not enforced. This can lead to a false sense of security (Kubernetes Security Audit).
- If the PSP Admission Controller is enabled but no PSP is assigned to the pod's service account, the pod will not start.
The latter is the reason why the PSP Admission Controller is disabled by default. Before enabling it on a running cluster, a PSP should therefore be imported and authorized for all service accounts. Otherwise, there is a risk that no further pods can be started. The following sections describe how to do this.
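On a self-managed cluster, the admission controller is switched on through the API server flag `--enable-admission-plugins`. The following is a minimal sketch, assuming the control plane is set up with kubeadm and its `v1beta2` configuration format (neither is stated in this article). Managed offerings usually expose their own switch instead.

```yaml
# kubeadm configuration fragment (assumption: kubeadm-managed control plane)
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Adds PodSecurityPolicy to the enabled admission plugins.
    # NodeRestriction is what kubeadm enables by default and is kept here.
    enable-admission-plugins: NodeRestriction,PodSecurityPolicy
```

As noted above, a PSP and its RBAC authorization should already exist in the cluster before this flag takes effect.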
The “PodSecurityPolicy” Resource
There are a large number of settings that you can restrict or set in the PSP. The official documentation and the API description can help you with any specific requirements for specifying cluster-wide settings. The documents include some examples to help you get started with the topic. Unfortunately, it is not always clear what a setting's default value is, or whether a setting checks values in the pod (rejecting the pod if necessary) or sets or overwrites them. We will discuss some specific examples of the various behaviors later in this section. Experience shows that pods are rejected if they apply explicit settings in the security context that do not correspond to the PSP. If no values are specified in the security context, the values from the PSP are set. The user ID specified by the image may also be overwritten as a result. This behavior effectively realizes cluster-wide default values. The greatest challenge remains selecting which values should be set. As always, you have to strike a compromise between usability and security.
This article shows how the good practices that we have elaborated in the previous articles based on the security context for each container can be transformed into a cluster-wide PSP. Listing 1 shows this PSP (slightly shortened from "cloudogu/k8s-security-demos"). The first line of the listing shows that PSPs in the Kubernetes API are still in the beta stage. We discuss this topic in more detail in a separate section.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default
    # Same for apparmor.security…
spec:
  runAsUser:
    rule: MustRunAs
    ranges:
      - min: 100000
        max: 999999
  runAsGroup: # same as runAsUser
  supplementalGroups: # same as runAsUser
  fsGroup: # same as runAsUser
  defaultAllowPrivilegeEscalation: false
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  requiredDropCapabilities: [ ALL ]
  allowedCapabilities: []
  privileged: false
  hostIPC: false
  hostPID: false
  hostNetwork: false
  hostPorts: []
  allowedHostPaths: []
  volumes:
    - configMap
    - emptyDir
    - projected
    - secret
    - downwardAPI
    - persistentVolumeClaim
  seLinux:
    rule: RunAsAny
Listing 1: PSP that defines cluster-wide settings from previous articles (and others)
- “defaultProfileName”: The default Seccomp profile of the container runtime (for example, Docker) is enabled in the PSP via annotation. Without this annotation, this profile is explicitly disabled in Kubernetes. The profile must also be allowed (“allowedProfileNames”), or otherwise no more pods can be started with this PSP. Note: Before Kubernetes 1.19, Seccomp and AppArmor had not made it into the official API. Therefore, in the PSP, just like in the security context (Security Context part 1), they are defined via an annotation. This is the usual practice for “alpha” features before they are allocated to a particular Kubernetes API object.
- “runAsUser”, “runAsGroup”: The PSP ensures that containers are executed with high user and group IDs. With high IDs, it is less likely that they are assigned to users on the host system. Existing users could have privileges (for example, to files) on the system that the container would otherwise also have. The settings ensure that pods with no user ID specified in their security context are started with user ID 100000. If a user ID is explicitly specified there in the permissible range (between 100000 and 999999 in Listing 1), this will be adopted unchanged. If a user ID is selected from outside this range, it will lead to the pod being rejected by the admission controller. At the same time, this rule prevents containers from being executed as the “root” (user ID 0) user or group. If you only want to ensure this and do not want to make any further specifications regarding the user ID, you can alternatively use the rule “MustRunAsNonRoot”. This does not change the user ID, it only checks it. If the user ID is 0, the pod is rejected.
- “defaultAllowPrivilegeEscalation”: In order to prevent “root” privileges from being obtained at runtime, the default value for allowing privilege escalation is changed. In addition, “allowPrivilegeEscalation” ensures that this value cannot be overwritten in pods.
- “readOnlyRootFilesystem”: To ensure that application code cannot be changed during runtime, writing to the container’s file system is prevented by default. In contrast to privilege escalation, there is only a single setting here: it sets the default value and, at the same time, prevents individual pods from changing it.
- “requiredDropCapabilities”: In order to further reduce the attack surface, by default the Linux capabilities that are assigned by container runtimes are removed. It also prevents individual pods from being given certain capabilities (“allowedCapabilities”).
In addition, the PSP ensures that settings whose default values make sense in terms of security cannot be changed in the security context of the pods.
- “apparmor.security…”: Setting these annotations means that the AppArmor profile can no longer be disabled.
- “privileged”: Containers can no longer be started as “privileged”. This would largely defeat the isolation of the container and grant many privileges on the host.
- “hostIPC”, “hostNetwork”, “hostPID”: Containers can no longer be started in the Linux namespace of the host. This would also largely defeat the isolation of the container.
- “hostPorts”: Ensures that a container does not bind directly to the host's ports. This is only possible in the host network namespace in any case, so it is actually redundant here. Redundancy is a positive thing when it comes to security, since it provides yet another layer of defense.
- “volumes”: The permitted volume types are restricted. Because “hostPath” is not among them, directories cannot be mounted directly from the host. Note that the empty “allowedHostPaths” list adds no restriction of its own, since an empty list there means that no host path is excluded. Leaving out “hostPath” prevents access to the file system of the host and thus also to the Docker socket. If this is not restricted, the socket can be mounted into a container, which gives the container “root” privileges on the cluster node.
- “seLinux”: The settings for SELinux are not restricted in the example, since the use of AppArmor is assumed here. Since the option is a mandatory field, it must be set in any case. Anyone running Linux distributions that work with SELinux instead of AppArmor (for example, Red Hat distributions) should restrict this setting accordingly. In contrast to AppArmor, SELinux is already included in the Kubernetes API, so it does not have to be addressed via an annotation.
Additional information concerning individual points can be found in the Kubernetes API documentation. This documentation also includes additional settings, for which, however, there is no single value that improves security in general. Here it is worthwhile to check whether additional settings may make sense for your own use case.
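To illustrate the defaulting and rejecting behavior described for Listing 1, consider the following two hypothetical pods. The pod names and the image are placeholders. The expected outcomes are taken from the description above and assume that the PSP from Listing 1 is the only one the pod's service account may use, and that “runAsGroup” uses the same range as “runAsUser”.

```yaml
# Expected to be admitted: explicit values lie within what the PSP permits
apiVersion: v1
kind: Pod
metadata:
  name: compliant
spec:
  containers:
    - name: app
      image: example.org/my-app:1.0 # placeholder image
      securityContext:
        runAsUser: 100001 # inside the range 100000-999999
        runAsGroup: 100001
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp # writable scratch space despite the read-only root
  volumes:
    - name: tmp
      emptyDir: {} # emptyDir is one of the permitted volume types
---
# Expected to be rejected: user ID 0 is outside the permitted range
apiVersion: v1
kind: Pod
metadata:
  name: rejected
spec:
  containers:
    - name: app
      image: example.org/my-app:1.0 # placeholder image
      securityContext:
        runAsUser: 0
```

A third pod without any security context would, according to the description above, be admitted and started with user ID 100000 and the other defaults from the PSP.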
Limitations of PSPs
Not all settings that may be set for each pod can also be specified in a PSP. These have to be set for each pod. Alternatively, you can implement your own admission controller webhooks that change the values in the pods.
- There is no way to disable the inclusion of a token in pods for authentication on the API server (“automountServiceAccountToken” in the pod) by default. Many applications do not need this, and attackers are able to exploit it as the first step in accessing the Kubernetes API.
- In addition, the insertion of service names and ports in the environment variables of the container cannot be disabled by default (“enableServiceLinks” in the pod). This deprecated mechanism for service discovery (also known as “Docker links”) is hardly used anymore. It makes it simple for attackers to choose additional targets.
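Since a PSP cannot provide these two defaults, they have to be set in each pod specification. A minimal sketch, with the pod name and image as placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  automountServiceAccountToken: false # no API server token is mounted into the pod
  enableServiceLinks: false # no service names and ports injected as environment variables
  containers:
    - name: app
      image: example.org/my-app:1.0 # placeholder image
```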
Enabling via RBAC
After the PSP is defined, it must be enabled. This is done by authorizing a service account for the use of a PSP via RBAC. If a pod is started with this service account, the admission controller determines the associated PSPs and enforces them. The behavior with several PSPs per service account is described in the next article.
The assignment is defined in RBAC either per namespace (using “Role” and “RoleBinding”) or cluster-wide (“ClusterRole” and “ClusterRoleBinding”). Here, too, it is possible and useful to create a specification in YAML, since this can be placed under source code management. However, experience has shown that it is faster to generate the YAML than to write it by hand. Listing 2 shows how the PSP described in Listing 1 can be enabled cluster-wide by authorizing all service accounts to use the PSP via a group that is available by default in Kubernetes. As long as no other, less strict PSPs are permitted in the cluster, the values of the PSP are used for each pod in the cluster. Listing 3 shows the YAML representation:
kubectl create clusterrole psp:restricted \
  --verb=use \
  --resource=podsecuritypolicy \
  --resource-name=restricted \
  --dry-run -o yaml \
  > role-psp-restricted.yml
# Separate the two YAML documents in the generated file
echo '---' >> role-psp-restricted.yml
kubectl create clusterrolebinding psp:all-serviceaccounts \
  --clusterrole=psp:restricted \
  --group system:serviceaccounts \
  --dry-run -o yaml \
  >> role-psp-restricted.yml
Listing 2: Script to generate YAML that enables a PSP cluster-wide via RBAC (Kubernetes 1.15)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp:restricted
rules:
  - resources:
      - podsecuritypolicies
    resourceNames:
      - restricted
    verbs:
      - use
    apiGroups:
      - policy
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp:all-serviceaccounts
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:restricted
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:serviceaccounts
Listing 3: ClusterRole and ClusterRoleBinding that enable a PSP cluster-wide via RBAC (Kubernetes 1.16+)
The Kubernetes version must be observed for both listings. The next section describes what this is all about.
The status of PSP in the Kubernetes API
As you can see in Listing 1, the PSPs in the Kubernetes API are still in the beta stage. For many products, beta stage implies they should not be used in a production system. With Kubernetes, however, all this means is that the API can still change. Historically (before Kubernetes 1.19), many Kubernetes resources (such as deployments, for example) were in the beta stage for a long time, though they were still widely used in production systems. PSPs were introduced in Kubernetes version 1.8 in the “extensions” API group. Since Kubernetes 1.10, they have been in the “policy” group (kubernetes/enhancements. Issue #5). As of Kubernetes 1.16, the “extensions” group, which is now deprecated, has been disabled by default (Kubernetes Issue #70672).
In practice, unfortunately, it turns out that enabling via RBAC in versions up to and including Kubernetes 1.15 only works with the “extensions” group. In subsequent versions, only the “policy” group works (see “ClusterRole” in Listing 3). To make matters worse, generation (Listing 2) only works in versions up to and including Kubernetes 1.15. In subsequent versions, an error occurs. However, it is fixed in Kubernetes 1.18 (kubernetes Issue #85314). Therefore, you need to keep the Kubernetes version in mind when enabling PSPs. In addition, if you enable PSPs on Kubernetes versions earlier than 1.16, you must remember to adapt the (cluster) role bindings when migrating to the next version. We will share some tips for finding errors with you in the next article.
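For clusters running Kubernetes 1.15 or earlier, the role from Listing 3 would therefore have to reference the “extensions” group instead. The following variant is a sketch derived from the behavior described above, not a listing from the demo repository:

```yaml
# Variant of the ClusterRole for Kubernetes up to and including 1.15
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp:restricted
rules:
  - apiGroups:
      - extensions # from Kubernetes 1.16 on, "policy" is needed instead
    resources:
      - podsecuritypolicies
    resourceNames:
      - restricted
    verbs:
      - use
```

The ClusterRoleBinding itself does not mention the API group of the PSP. When migrating, it is this rule in the role that needs to be adapted.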
On the eve of the release of Kubernetes 1.18, it doesn't look like PSPs will be exiting beta anytime soon. Generally speaking, it has been shown that PSPs are complex to use, which has prompted discussions of various alternatives (kubernetes/enhancements. Issue #5). For the time being, PSPs remain the easiest way to make default settings more secure across the cluster. After all, the mechanism is a built-in Kubernetes tool and does not require any additional infrastructure. A future alternative to PSP could be the “Gatekeeper” policy controller of the Open Policy Agent project. It is a sibling project of Kubernetes within the Cloud Native Computing Foundation (CNCF). It remains to be seen when all PSP features will be able to be implemented with Gatekeeper and whether it will be easier than using PSPs.
Conclusion
This article shows how the good practices for security settings in Kubernetes that we described in previous articles can be enforced cluster-wide using PSPs. In addition, it also demonstrates additional useful settings that can be set with PSP. This is all shown using the general functionality of PSPs: First the PSP resource is defined, then it is enabled cluster-wide via RBAC, which leads to the evaluation of the PSPs by the admission controller. It is important to note that the admission controller must be enabled on the API server or on the managed cluster. Another pitfall that we covered is the group of PSPs in the Kubernetes API. Here there will be differences depending on whether the system is running Kubernetes 1.16 or earlier. In general, PSPs in the Kubernetes API are still in the beta stage. Nevertheless, there is currently no easier way to make Kubernetes default settings more secure on a cluster-wide basis.
The article is limited to the simplest use case: There is only one PSP, and all of the pods can be started with it. In general, this is recommended, because it means that all pods will run with the optimized security settings. However, at runtime, the challenges that we described in the previous articles also apply to PSPs. For many images, minor adjustments need to be made so that containers can be started from these images with the security-optimized settings enforced by the PSP. In practice, you may also encounter cases where, for example, third-party applications cannot be executed with these settings at all, because they can only be run with “root” user privileges. The next article in this series will show how individual pods can be started with a less restrictive PSP.