Important OpenShift changes to Pod Security Standards
In Kubernetes 1.25, the PodSecurityPolicy, deprecated since v1.21, will no longer be served. The PodSecurityPolicy is replaced by a new admission controller (KEP-2579: Pod Security Admission Control), allowing cluster admins to enforce the pod security standards with namespace labels. OpenShift security context constraints (SCC) APIs are also changing to address these needs. The Red Hat OpenShift product management team wants to ensure our customers and partners know how to move forward.
Read the Deprecated API Migration Guide on the Kubernetes website for more information about which APIs are deprecated and will no longer be served in each release version. Also, check out our latest blog post on how to update your Kubernetes operators for API changes. Lastly, review the official Pod Security Admission in OpenShift 4.11 blog post.
With the introduction of a new built-in admission controller that enforces the Pod Security Standards, namespaces and pods can be defined with three different policies: Privileged, Baseline and Restricted. Therefore, pods not configured according to the enforced security standards defined globally, or on the namespace level, will not be admitted and will not run.
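As a minimal sketch of how a namespace selects one of these policies, the manifest below uses the upstream Pod Security Admission namespace labels. The namespace name `my-app` is hypothetical, and the `warn` and `audit` labels are optional additions that the text above does not discuss.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app  # hypothetical namespace name
  labels:
    # Reject pods that do not meet the Restricted standard.
    pod-security.kubernetes.io/enforce: restricted
    # Optional: also report violations as client warnings and audit annotations.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Swapping `restricted` for `baseline` or `privileged` selects one of the other two policies.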
- The new SCC policies [restricted-v2, nonroot-v2, and hostnetwork-v2] are introduced with new criteria to admit workloads according to the Pod Security Standards. Permissions to use the restricted-v2 SCC are granted to all users.
- The new (v2) policy versions drop ALL capabilities, while the previous versions [v1] only drop a subset.
v1 vs. v2 SCC policies
- V2 does not permit allowPrivilegeEscalation=true.
  - Leaving it empty or setting it to false is compatible with the v1 SCC and therefore works on OCP versions < 4.11.
- V2 requires you to leave the dropped capabilities empty, set them to ALL, or add only NET_BIND_SERVICE.
  - A workload accepted under a v2 SCC will always have ALL capabilities dropped. V1 only dropped the KILL, MKNOD, SETUID, and SETGID capabilities.
  - V2 still allows explicitly adding the NET_BIND_SERVICE capability.
- V2 requires you to either leave SeccompProfile empty or set it to runtime/default.
  - Empty is compatible with v1 and works on OCP versions < 4.11.
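The differences listed above map onto a handful of SCC fields. The fragment below is an abridged, illustrative sketch of how those fields might look in a v2 policy. It is not the full shipped definition, so inspect the actual object on your cluster (for example with `oc get scc restricted-v2 -o yaml`) before relying on it.

```yaml
# Abridged sketch only: many required SCC fields are omitted here.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: restricted-v2
allowPrivilegeEscalation: false   # v2 does not permit true
requiredDropCapabilities:
  - ALL                           # v1 dropped only KILL, MKNOD, SETUID, SETGID
allowedCapabilities:
  - NET_BIND_SERVICE              # may still be added explicitly
seccompProfiles:
  - runtime/default               # workloads leave it empty or use runtime/default
```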
TL;DR: Refer to the OCP 4.11 documentation for Security Context Constraints (SCC), as well as the release notes related to this subject.
What is impacted in 4.11?
Workloads that qualified for the restricted SCC previously (OCP <= v4.10), but that now do not qualify for restricted-v2 (4.11+).
Access to the restricted SCC policy is no longer granted to all users by default in new 4.11 clusters. This means that workloads previously using the restricted SCC may not have access to it in v4.11. If the workload does not qualify for another SCC (such as the restricted-v2 policy), the workload will not be admitted onto the cluster.
This scenario includes Pod(s) that use spec.containers[*].securityContext.allowPrivilegeEscalation = true AND DO NOT use other security context configurations that would disqualify the workload from running as restricted in OpenShift 4.10 or earlier versions (e.g. runAsUser).
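For illustration, a pod along the following lines would fall into this scenario. This is a sketch under assumptions: the pod name and image reference are hypothetical placeholders, and all unrelated fields are omitted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod  # hypothetical
spec:
  containers:
    - name: my-container
      image: registry.example.com/my-image:latest  # hypothetical image
      securityContext:
        # Accepted by the restricted (v1) SCC on OCP <= 4.10,
        # but not permitted by restricted-v2 on 4.11+.
        allowPrivilegeEscalation: true
```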
If your workload is categorized into the restricted-v2, nonroot-v2 or hostnetwork-v2 SCC it might fail, or be prohibited from functioning properly, because of attempting to use a capability that is now dropped.
Workloads can be categorized into restricted-v2, nonroot-v2 or hostnetwork-v2 regardless of what capabilities they explicitly drop. Therefore, any workload that otherwise qualifies as restricted-v2, nonroot-v2 or hostnetwork-v2 will automatically have ALL capabilities dropped.
This scenario includes Pod(s) that require capabilities other than KILL, MKNOD, SETUID, SETGID, NET_BIND_SERVICE to run and DO NOT describe those capabilities within spec.containers[*].securityContext.capabilities.
These other capabilities are now automatically dropped from OCP 4.11+, when qualified to the restricted-v2 policies, but were not dropped from previous versions (OCP <= 4.10) when the workloads were qualified to restricted (v1) policies. Therefore, the workload will not have the required capabilities to function.
IMPORTANT: To ensure your product works properly, we always recommend you test and validate [and certify when appropriate] on each OpenShift version that you claim to support. For the specific changes discussed in this article, please ensure you test and validate specifically against NEW OpenShift installations. If you do not do this, you may not see the full scope of failures because your workloads may still qualify to the restricted (v1) SCC policies.
How to verify that the product is impacted
Validate whether or not your product falls into one of the following scenarios:
1) Your product has one or more workloads using spec.containers[*].securityContext.allowPrivilegeEscalation = true, and the workloads are NOT using other security context configurations that would disqualify them from running as restricted in OpenShift 4.10 and previous versions (e.g. runAsUser).
2) Your product has one or more workloads requiring capabilities other than KILL, MKNOD, SETUID, SETGID, NET_BIND_SERVICE which are not described in spec.containers[*].securityContext.capabilities. These other capabilities will be automatically dropped in 4.11 (v2 SCC) but were not dropped in 4.10 (v1 SCC).
If the first scenario above is TRUE, your product might run as restricted in OpenShift versions up to v4.10, but in OCP 4.11+ it will not qualify for a v2 SCC, so OCP/OLM users might be unable to run these workloads on their cluster. If the second scenario is TRUE, your workload will not have the capabilities it requires to operate properly.
If your product is impacted by one or both of the above scenarios
Where possible, re-develop your product so that the pod(s) and containers it creates do not require escalated privileges. This is the best approach unless your workload genuinely requires escalated permissions in order to function properly.
# Do not use SeccompProfile if your project must work on
# old k8s versions < 1.19 and Openshift < 4.11
- name: my-container
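The fragment above shows only part of a manifest. A fuller version might look like the following sketch, which assumes a plain Pod and uses a hypothetical pod name and image. Note that in a pod's security context the runtime/default seccomp profile is written as `type: RuntimeDefault`.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod  # hypothetical
spec:
  securityContext:
    runAsNonRoot: true
    # Do not set seccompProfile if your project must work on
    # Kubernetes < 1.19 or OpenShift < 4.11.
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: my-container
      image: registry.example.com/my-image:latest  # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
```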
Ensure that the ServiceAccount managing the workloads has the correct privileges to use the SCCs that allow it to run the workloads it is responsible for.
Note: If your product is an Operator integrated with OLM, you may define this in the operator metadata bundle via the RBAC specified in the CSV, which is applied to the ServiceAccount managing the workloads.
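One common way to grant a ServiceAccount access to an SCC is an RBAC rule with the `use` verb on the `securitycontextconstraints` resource. The sketch below assumes the workload needs nonroot-v2; the Role, namespace, and ServiceAccount names are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: use-nonroot-v2  # hypothetical
  namespace: my-app     # hypothetical
rules:
  - apiGroups: ["security.openshift.io"]
    resources: ["securitycontextconstraints"]
    resourceNames: ["nonroot-v2"]
    verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: use-nonroot-v2
  namespace: my-app
subjects:
  - kind: ServiceAccount
    name: my-service-account  # hypothetical
    namespace: my-app
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: use-nonroot-v2
```

For an OLM-managed Operator, an equivalent rule would go into the permissions declared in the CSV instead of a standalone Role.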
Therefore, if your application fails in SCENARIO 2 (i.e. it requires escalated privileges because it needs specific capabilities to run and perform operations), make sure to explicitly list the required capabilities in the security context of your workload.
- name: my-container
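As a sketch of what listing a capability explicitly can look like, the container fragment below drops everything and adds back a single capability. CHOWN is only a hypothetical example of a capability that the v1 policy left in place; substitute whatever your workload actually needs.

```yaml
containers:
  - name: my-container
    securityContext:
      capabilities:
        drop:
          - ALL
        add:
          - CHOWN  # hypothetical example capability
```

Because the v2 policies only allow NET_BIND_SERVICE to be added, a workload declared this way will not qualify for restricted-v2, and its ServiceAccount needs access to an SCC that permits the listed capability.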
If your workloads should run as restricted-v2 (i.e. they are not configured to escalate permissions and have no reason to do so): test your solutions against a NEW OpenShift 4.11 cluster and ensure that all workloads are admitted via the restricted-v2 SCC.
The most straightforward way to ensure your workloads can work in a restricted namespace (coming in 4.12) is to label the namespaces where they should run so that the restricted policy is enforced, and then verify that the workloads are admitted and run successfully. You may also check whether they are assigned the restricted-v2 SCC in 4.11. We recommend validating the desired behavior with end-to-end functional testing.
Tip: If you need to check or verify the SCC a workload is using, you can find the value in the annotation openshift.io/scc:
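For illustration, in the YAML of an admitted pod (for example, the output of `oc get pod <pod-name> -o yaml`) the annotation appears in the metadata roughly as follows. The pod name is hypothetical and all other fields are omitted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod  # hypothetical
  annotations:
    # Name of the SCC under which the pod was admitted.
    openshift.io/scc: restricted-v2
```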
What is planned for OpenShift 4.12? [coming December 2022]
1) By the global Pod Security Admission configuration, all namespaces will run in the “restricted” enforcement mode. This Pod Security Admission (PSA) level aligns with the restricted-v2 SCC, so if your workload is categorized into restricted-v2, it will be able to run in a namespace that enforces the PSA as restricted.
2) In cases where the workload has access to higher SCC privileges (via its Service Account) OpenShift will label namespaces, via the new label synchronization component, to synchronize the PSA enforcement level in order to keep workloads running.
- The automatic enforcement labeling logic will NOT be applied to namespaces that are managed as part of the OpenShift system, or to those that are prefixed with "openshift-*" and do not have an OLM Operator (CSV) installed.
- Cluster admins can enforce the PSA level for a namespace to be restricted.
What is impacted in 4.12?
The changes planned in 4.12 are primarily to enforce the restricted policy, by default, on namespaces prefixed with "openshift-". In addition, the changes are meant to prohibit workloads that cannot be qualified to the restricted-v2 SCC from being admitted and run on those namespaces.
Ideally, on OpenShift, no one distributing solutions integrated and managed by Operator Lifecycle Manager will be affected. However, this will not be the case if your workload was created in a namespace that has no CSV installed OR within a namespace that is a part of the OpenShift system.
Your product may be affected in the following scenarios:
Your product is distributed for use on vanilla Kubernetes or other Kubernetes distributions outside of OpenShift:
- Label synchronization is an OpenShift feature and will not be present within other Kubernetes distributions. You may need to explicitly state Pod Security admission requirements for your workloads. If the workload does not meet requirements, it will fail to run.
Your product creates workloads which cannot be accepted as restricted-v2. These workloads will fail to run in OCP payload namespaces [prefixed with "openshift-*"] that do not have an OLM operator (with CSV) installed.
- If your product is not an operator integrated with and managed by OLM, be aware that ANY workload created by the product must qualify to run with restricted-v2. The same applies when OCP users install your solution in namespaces prefixed with "openshift-*". Otherwise, your workloads will not be admitted and consequently will fail to run. In this case, follow the best practices recommendations: “My project requires escalated permissions!”
Your project has workloads that can run as restricted-v2 but are not configured accordingly, or workloads that require escalated permissions:
- Your workload’s namespace might be labeled as "Privileged" or "Baseline" by the label synchronization controller. This will work, but cluster admins may prefer reduced permissions and change the workload definition to allow the restricted-v2 SCC and thus allow the namespace to be labeled as restricted.
Keep an eye out for more information around the 4.12 release as the GA date approaches [currently targeted for the beginning of December 2022.]
Best Practices Moving Forward
The primary recommendation is to ensure every workload has its security context properly set. The best path is to ensure the workloads can run with restricted permissions. When this is done, your product will run in restricted namespaces on vanilla Kubernetes and will have the best chance of running properly in restricted namespaces on any other vendor's Kubernetes distribution. The application will also have access to the restricted-v2 SCC on OpenShift and will not be rejected by cluster administrators trying to enforce restrictions.
“My project requires escalated permissions!”
The best practice is to ensure the namespace containing your project is labeled accordingly. This way you will not be reliant on the label synchronization component.
You may either update your project to manage the namespace labels, or include the namespace labeling as a part of the manual installation instructions.
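As a sketch of such labeling, a namespace that hosts workloads needing escalated permissions could carry the upstream Pod Security Admission labels directly. The namespace name is hypothetical, and the `privileged` level is an assumption; choose `baseline` instead if that is sufficient for your workloads.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-privileged-app  # hypothetical namespace name
  labels:
    # Assumption: the workloads need more than Baseline allows.
    pod-security.kubernetes.io/enforce: privileged
```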
You can find code examples, tips and tools to check your operator solutions in the Operator SDK documentation. Even if you do not have an operator, this reference still provides helpful pointers. In addition, cluster admins will want to understand why your product requires raised permissions. Please ensure that you properly describe the reasons. You can add this information and the prerequisites in the description of your Operator within the CSV file of your metadata bundle.
- The Kubernetes restricted policy (a Pod Security Standard, not an SCC) requires workloads to set seccompProfile to runtime/default. However, in OCP 4.10 and earlier, setting the seccompProfile explicitly disqualified the workload from the restricted SCC. To be compatible with both 4.10 and 4.11, the seccompProfile value must be left unset (the SCC itself will default it to runtime/default, so it is fine to leave it empty).
- In certain configurations, your workloads may comply with the restricted Kubernetes definition but will not be accepted under the restricted-v2 SCC in OCP. In Kubernetes you can specify the runAsUser and get the Pod/container running in a restricted namespace. However, for OpenShift’s restricted/restricted-v2 SCC you MUST leave the runAsUser field empty, or provide a value that falls within the specific user range for the namespace. Otherwise, you will only be able to run the Pod if it has access to the nonroot-v2 or anyuid SCC. If the image used requires a user, the best option is to ensure that the user ID is properly defined in the image and not via the security context.
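Taken together, the two points above suggest a security context along the following lines for a workload that must be admitted on both 4.10 and 4.11. This is a sketch under those assumptions, showing only the relevant part of a pod spec.

```yaml
spec:
  securityContext:
    runAsNonRoot: true
    # runAsUser is intentionally left unset so that OpenShift can assign
    # a UID from the namespace's range.
    # seccompProfile is intentionally left unset for compatibility with
    # OCP 4.10 and earlier.
  containers:
    - name: my-container
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
```

On vanilla Kubernetes a restricted namespace would still expect the seccomp profile to be set, so a project targeting both platforms may need separate manifests or a templated value.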
If you have any questions related to the guidance in this article, please reach out to email@example.com to connect with our Engineering Partner Management team.