Enforcing Security in Confidential Containers with Kata Agent Policies

How to control Pod execution inside a TEE

Feb 27, 2025

The CNCF Confidential Containers (CoCo) project enables deploying a Kubernetes pod inside a Trusted Execution Environment (TEE). This approach makes confidential computing easier to adopt, but it also comes with a few challenges, especially regarding the pod’s dynamic nature. Containers in the pod can be created, deleted, and updated through Remote Procedure Calls (RPCs) to the container runtime. So, how can we ensure that nothing unintended is executed in the pod that would undermine confidentiality?

Magnus’s blog, “Policing the Sandbox”, discusses this topic in detail. The recommended solution is to utilise the Kata agent policy feature. In this blog, I’ll take a deep dive into the agent policy feature.

Let’s look at the high-level architecture of a typical CoCo deployment in Kubernetes.

[Figure: high-level architecture of a typical CoCo deployment in Kubernetes. Author created.]

The diagram shows that a CoCo pod uses a Linux Confidential Virtual Machine (CVM) executing inside a TEE (the secure enclave) as the foundation. Using a CVM enables the lift-and-shift of existing container workloads.

The container images related to the pod are downloaded and kept inside the CVM and can be signed and/or encrypted.

The components within the CVM — namely image-rs, kata-agent, the Confidential Data Hub (CDH), and the Attestation Agent (AA) — are collectively referred to as the enclave software stack.

An external entity, the “Relying Party Services,” is responsible for attesting the CVM and releasing secret resources. For detailed information related to attestation, refer to the following documentation.

Our focus for this article is the communication between the kata runtime and the kata-agent.

The kata runtime makes remote procedure calls (RPCs) to the kata-agent inside the CVM to create containers and manage their lifecycle. The kata-agent, in turn, lets you define an allowlist of kata-agent RPC API requests. This is the agent policy feature: a set of rules that define which RPC API requests can execute. The policies are written in the Rego language, and the kata-agent uses the regorus library to evaluate them.

The diagram below illustrates the key components, with some omitted for brevity.

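[Figure: key components involved in the communication between the kata runtime and the kata-agent. Author created.]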

Structure of an Agent Policy file

An agent policy document is typically structured into three main components:

package agent_policy

# 1. defaults

# 2. rules

# 3. policy data

  1. Defaults: These define the default allow or deny values for different RPC API requests.
  2. Rules: These are Rego rules that can override the defaults for individual requests.
  3. Policy Data: This section provides configuration values for the rules.

Defaults

Here is an example default list. Note that CreateContainerRequest and ExecProcessRequest are disabled (set to false).

default CreateContainerRequest := false
default ExecProcessRequest := false
default CopyFileRequest := true

Rules

Rules enable overriding the default allow or deny values for the RPC requests. The rules typically compare the input parameters of an RPC API request with values from the policy data. Based on this comparison, a rule can either allow or deny the request by returning true or false. Policy rules are optional.
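The interplay between defaults and rules can be modelled outside Rego. The Python sketch below only illustrates the fallback behaviour for rules of the form `Name if { ... }`; it is not how the kata-agent or regorus is implemented. The names `DEFAULTS`, `RULES`, and `is_allowed` are hypothetical, as is the choice to deny unknown request names.

```python
from typing import Any, Callable, Dict

# Default decisions, mirroring the `default ... := ...` lines of a policy.
DEFAULTS: Dict[str, bool] = {
    "CreateContainerRequest": False,
    "ExecProcessRequest": False,
    "CopyFileRequest": True,
}

# Optional rules. A rule returns True when its conditions hold for the input.
# The condition used here (Args equal to ["ps"]) is purely illustrative.
RULES: Dict[str, Callable[[Dict[str, Any]], bool]] = {
    "ExecProcessRequest": lambda req: req.get("process", {}).get("Args") == ["ps"],
}


def is_allowed(request_name: str, request: Dict[str, Any]) -> bool:
    """Return True if the rule matches, otherwise fall back to the default.

    Unknown request names are denied; that is an assumption of this sketch.
    """
    rule = RULES.get(request_name)
    if rule is not None and rule(request):
        return True
    return DEFAULTS.get(request_name, False)


if __name__ == "__main__":
    print(is_allowed("CopyFileRequest", {}))
    print(is_allowed("ExecProcessRequest", {"process": {"Args": ["ps"]}}))
    print(is_allowed("ExecProcessRequest", {"process": {"Args": ["sh"]}}))
```

In this model a request with no rule simply takes its default, which is why policy rules can be omitted entirely.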

Examples of rules corresponding to the kata-agent’s CreateContainer and ExecProcess requests:

import future.keywords.in
import future.keywords.if
import future.keywords.every

# Allow container creation only if every storage source is an allowed image.
CreateContainerRequest if {
    every storage in input.storages {
        some allowed_image in policy_data.allowed_images
        storage.source == allowed_image
    }
}

# Allow exec only if the space-joined command line matches an allowed command.
ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

The input data in the rules is provided by the kata-agent to the regorus library as a JSON representation of the API request parameters. The input data is compared against the reference values provided by the policy data (policy_data).

Policy data

Policy data contains the reference values that the policy rules compare with the input parameters of the RPC API request. Policy data is optional.

Example of Policy data:

policy_data := {
    "allowed_commands": [
        "ps"
    ],
    "allowed_images": [
        "pause",
        "quay.io/fedora/fedora:latest"
    ]
}
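To make the `every` and `some` keywords in the CreateContainerRequest rule concrete, here is a rough Python equivalent evaluated against the policy data above. It is a sketch, not the agent's implementation: the function name `create_container_allowed` is hypothetical, and the request shape (a `storages` list of objects with a `source` field) is inferred only from the fields the rule reads.

```python
from typing import Any, Dict

# Reference values copied from the policy data example above.
POLICY_DATA: Dict[str, Any] = {
    "allowed_commands": ["ps"],
    "allowed_images": ["pause", "quay.io/fedora/fedora:latest"],
}


def create_container_allowed(request: Dict[str, Any]) -> bool:
    """Model of the CreateContainerRequest rule.

    `every storage in input.storages` maps to all(), and
    `some allowed_image in ...` plus the equality test maps to a
    membership check.
    """
    storages = request.get("storages")
    if storages is None:
        # A missing input field leaves the rule body undefined in Rego,
        # so the default (false) applies.
        return False
    return all(s.get("source") in POLICY_DATA["allowed_images"] for s in storages)


if __name__ == "__main__":
    ok = {"storages": [{"source": "pause"}, {"source": "quay.io/fedora/fedora:latest"}]}
    bad = {"storages": [{"source": "pause"}, {"source": "docker.io/library/nginx:latest"}]}
    print(create_container_allowed(ok))
    print(create_container_allowed(bad))
```

One consequence worth noting: like `all()` over an empty list, `every` over an empty collection holds, so in this model a request with an empty storages list is allowed. Whether that matters depends on the requests the agent actually receives in your environment.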

A complete Rego policy document looks like this:

package agent_policy

import future.keywords.in
import future.keywords.if
import future.keywords.every

default AddARPNeighborsRequest := true
default AddSwapRequest := true
default CloseStdinRequest := true
default CopyFileRequest := true
default CreateSandboxRequest := true
default DestroySandboxRequest := true
default GetMetricsRequest := true
default GetOOMEventRequest := true
default GuestDetailsRequest := true
default ListInterfacesRequest := true
default ListRoutesRequest := true
default MemHotplugByProbeRequest := true
default OnlineCPUMemRequest := true
default PauseContainerRequest := true
default PullImageRequest := true
default ReadStreamRequest := true
default RemoveContainerRequest := true
default RemoveStaleVirtiofsShareMountsRequest := true
default ReseedRandomDevRequest := true
default ResumeContainerRequest := true
default SetGuestDateTimeRequest := true
default SetPolicyRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StartTracingRequest := true
default StatsContainerRequest := true
default StopTracingRequest := true
default TtyWinResizeRequest := true
default UpdateContainerRequest := true
default UpdateEphemeralMountsRequest := true
default UpdateInterfaceRequest := true
default UpdateRoutesRequest := true
default WaitProcessRequest := true
default WriteStreamRequest := true

default CreateContainerRequest := false
default ExecProcessRequest := false


CreateContainerRequest if {
    every storage in input.storages {
        some allowed_image in policy_data.allowed_images
        storage.source == allowed_image
    }
}

ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
    "allowed_commands": [
        "/bin/bash -c ls"
    ],
    "allowed_images": [
        "pause",
        "quay.io/fedora/fedora@sha256:4d29104e4d6f0fb6fad0792e1cab6c44f574f2f3d6ff9e0de7737ab9c86b9d94"
    ]
}

You can use a Rego linter such as Regal to check for errors in the policy file.

You can also use the genpolicy tool to autogenerate a policy based on the pod manifest file. You’ll need to adapt the generated policy file to suit your environment.

Using agent policy file with pods

You can embed the agent policy file in the VM root filesystem or provide it during pod creation via an annotation in the pod manifest.

When using an annotation, you need to base64 encode the policy. Two annotations are available for specifying the policy:

  1. io.katacontainers.config.runtime.cc_init_data
  2. io.katacontainers.config.agent.policy

The following doc describes the format of the cc_init_data annotation. The preferred way is to provide the policy file as part of the cc_init_data annotation since this is measured via remote attestation.

The policy file, when provided as part of the cc_init_data annotation, looks like this:

algorithm = "sha384"
version = "0.1.0"

[data]
"policy.rego" = '''
package agent_policy

import future.keywords.in
import future.keywords.if
import future.keywords.every

default AddARPNeighborsRequest := true
default AddSwapRequest := true
default CloseStdinRequest := true
default CopyFileRequest := true
default CreateSandboxRequest := true
default DestroySandboxRequest := true
default GetMetricsRequest := true
default GetOOMEventRequest := true
default GuestDetailsRequest := true
default ListInterfacesRequest := true
default ListRoutesRequest := true
default MemHotplugByProbeRequest := true
default OnlineCPUMemRequest := true
default PauseContainerRequest := true
default PullImageRequest := true
default ReadStreamRequest := true
default RemoveContainerRequest := true
default RemoveStaleVirtiofsShareMountsRequest := true
default ReseedRandomDevRequest := true
default ResumeContainerRequest := true
default SetGuestDateTimeRequest := true
default SetPolicyRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StartTracingRequest := true
default StatsContainerRequest := true
default StopTracingRequest := true
default TtyWinResizeRequest := true
default UpdateContainerRequest := true
default UpdateEphemeralMountsRequest := true
default UpdateInterfaceRequest := true
default UpdateRoutesRequest := true
default WaitProcessRequest := true
default WriteStreamRequest := true

default CreateContainerRequest := false
default ExecProcessRequest := false


CreateContainerRequest if {
    every storage in input.storages {
        some allowed_image in policy_data.allowed_images
        storage.source == allowed_image
    }
}

ExecProcessRequest if {
    input_command = concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
    "allowed_commands": [
        "/bin/bash -c ls"
    ],
    "allowed_images": [
        "pause",
        "quay.io/fedora/fedora@sha256:4d29104e4d6f0fb6fad0792e1cab6c44f574f2f3d6ff9e0de7737ab9c86b9d94"
    ]
}

'''
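Turning this TOML document into an annotation value is a plain base64 encoding step. The Python sketch below shows that step on a shortened stand-in document, assuming plain base64 as described in this article; check the format documentation for your release. The helper names `encode_annotation` and `initdata_digest` are hypothetical. The digest helper only illustrates the `algorithm = "sha384"` field, and it assumes the digest is taken over the raw TOML bytes, which you should verify before relying on it.

```python
import base64
import hashlib

# Shortened stand-in for the TOML document above. A real document would carry
# the complete policy between the ''' delimiters (assumption of this sketch).
INITDATA_TOML = """algorithm = "sha384"
version = "0.1.0"

[data]
"policy.rego" = '''
package agent_policy

default ExecProcessRequest := false
'''
"""


def encode_annotation(initdata: str) -> str:
    """Return the base64 text to place in the cc_init_data annotation."""
    return base64.b64encode(initdata.encode("utf-8")).decode("ascii")


def initdata_digest(initdata: str) -> str:
    """Return the hex SHA-384 digest of the TOML text.

    Assumption: the digest named by `algorithm` is computed over the raw
    TOML bytes. How it is bound to attestation evidence is not covered here.
    """
    return hashlib.sha384(initdata.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(encode_annotation(INITDATA_TOML))
    print(initdata_digest(INITDATA_TOML))
```

Because any change to the policy text changes both the encoded value and the digest, the annotation has to be regenerated whenever the policy is edited.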

Example pod yaml:

apiVersion: v1
kind: Pod
metadata:
  name: test
  labels:
    app: test
  annotations:
    io.katacontainers.config.runtime.cc_init_data: YWxnb3JpdGhtID0gInNoYTM4NCIKdmVyc2lvbiA9ICIwLjEuMCIKCltkYXRhXQoicG9saWN5LnJlZ28iID0gJycnCnBhY2thZ2UgYWdlbnRfcG9saWN5CgppbXBvcnQgZnV0dXJlLmtleXdvcmRzLmluCmltcG9ydCBmdXR1cmUua2V5d29yZHMuaWYKaW1wb3J0IGZ1dHVyZS5rZXl3b3Jkcy5ldmVyeQoKZGVmYXVsdCBBZGRBUlBOZWlnaGJvcnNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBBZGRTd2FwUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ2xvc2VTdGRpblJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IENvcHlGaWxlUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlU2FuZGJveFJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IERlc3Ryb3lTYW5kYm94UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR2V0TWV0cmljc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IEdldE9PTUV2ZW50UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR3Vlc3REZXRhaWxzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgTGlzdEludGVyZmFjZXNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBMaXN0Um91dGVzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgTWVtSG90cGx1Z0J5UHJvYmVSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBPbmxpbmVDUFVNZW1SZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBQYXVzZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFB1bGxJbWFnZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlYWRTdHJlYW1SZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZW1vdmVDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZW1vdmVTdGFsZVZpcnRpb2ZzU2hhcmVNb3VudHNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZXNlZWRSYW5kb21EZXZSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZXN1bWVDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTZXRHdWVzdERhdGVUaW1lUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU2V0UG9saWN5UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU2lnbmFsUHJvY2Vzc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFN0YXJ0Q29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhcnRUcmFjaW5nUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhdHNDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTdG9wVHJhY2luZ1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFR0eVdpblJlc2l6ZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUVwaGVtZXJhbE1vdW50c1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUludGVyZmFjZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZVJvdXRlc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFdhaXRQcm9jZXNzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgV3JpdGVTdHJlYW1SZXF1ZXN0IDo9IHRydWUKCmRlZmF1bHQgQ3JlYXRlQ29udGFpbmVyUmVxdWVzdCA6PSBmYWxzZQpkZWZhdWx0IEV4ZWNQcm9jZXNzUmVxdWVzdCA6PSBmYWxzZQoKCkNyZWF0ZUNvbnRhaW5lclJlcXVlc3QgaWYgewoJZXZlcnkgc3RvcmFnZSBpbiBpbnB1dC5zdG9yYWdlcyB7CiAgICAgICAgc29tZSBhbGxvd2VkX2ltYWdlIGluIHBvbGljeV9kYXRhLmFsbG93ZWRfaW1hZ2VzCiAgICAgICAgc3RvcmFnZS5zb3VyY2UgPT0gYWxsb3dlZF9pbWFnZQogICAgfQp9CgpFeGVjUHJvY2Vzc1JlcXVlc3QgaWYgewogICAgaW5wdXRfY29tbWFuZCA9IGNvbmNhdCgiICIsIGlucHV0LnByb2Nlc3MuQXJncykKCXNvbWUgYWxsb3dlZF9jb21tYW5kIGluIHBvbGljeV9kYXRhLmFsbG93ZWRfY29tbWFuZHMKCWlucHV0X2NvbW1hbmQgPT0gYWxsb3dlZF9jb21tYW5kCn0KCnBvbGljeV9kYXRhIDo9IHsgIAogICJhbGxvd2VkX2NvbW1hbmRzIjogWyAgIAoJIi9iaW4vYmFzaCAtYyBscyIKICBdLAogICJhbGxvd2VkX2ltYWdlcyI6IFsKICAgICJwYXVzZSIsCgkicXVheS5pby9mZWRvcmEvZmVkb3JhQHNoYTI1Njo0ZDI5MTA0ZTRkNmYwZmI2ZmFkMDc5MmUxY2FiNmM0NGY1NzRmMmYzZDZmZjllMGRlNzczN2FiOWM4NmI5ZDk0IgogIF0KfQoKJycnCg==
spec:
  runtimeClassName: kata-remote
  containers:
    - name: test
      image: quay.io/fedora/fedora@sha256:4d29104e4d6f0fb6fad0792e1cab6c44f574f2f3d6ff9e0de7737ab9c86b9d94
      command:
        - sleep
        - "36000"
      securityContext:
        privileged: false
        seccompProfile:
          type: RuntimeDefault

Shown here are two exec invocations, where one is blocked due to the policy.

# Successful exec
$ kubectl exec -it test -- /bin/bash -c ls
afs boot etc lib media opt root sbin sys usr
bin dev home lib64 mnt proc run srv tmp var

# Blocked exec
$ kubectl exec -it test -- /bin/bash -c ps
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "f76adbce6154b5bafe38daac7a1019bc3dc6c6af75f0f24d867caeac2ff7b21b": cannot enter container 2a36831a32e51b9e87bce5efb5759b9a1c87f0e3a1dcbea1fb66bb93389d3c46, with err rpc error: code = PermissionDenied desc = "ExecProcessRequest is blocked by policy: ": unknown
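The two outcomes follow directly from the string comparison in the ExecProcessRequest rule. The Python sketch below models that comparison; it assumes the agent receives the command as the argument list `["/bin/bash", "-c", "ls"]`, and the names `ALLOWED_COMMANDS` and `exec_allowed` are hypothetical.

```python
from typing import List

# Mirrors policy_data.allowed_commands from the complete policy.
ALLOWED_COMMANDS: List[str] = ["/bin/bash -c ls"]


def exec_allowed(args: List[str]) -> bool:
    """Model of the ExecProcessRequest rule.

    concat(" ", input.process.Args) in Rego joins the arguments with single
    spaces; the result must equal one of the allowed command strings.
    """
    return " ".join(args) in ALLOWED_COMMANDS


if __name__ == "__main__":
    print(exec_allowed(["/bin/bash", "-c", "ls"]))  # the successful exec
    print(exec_allowed(["/bin/bash", "-c", "ps"]))  # the blocked exec
```

The match is exact, so `/bin/bash -c "ls -l"` would also be blocked under this policy. A side effect of joining on spaces is that different argument lists can produce the same string (for example `["/bin/bash", "-c ls"]`), which may be worth keeping in mind when writing stricter rules.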

Likewise, you can try other scenarios. For example, you can change the container image and verify that container creation fails with the new image.

Conclusion

The kata-agent policy feature provides a flexible mechanism to enforce security for the Kata API and is an essential building block for confidential containers. There is a bit of a learning curve involved when working with the policies, but it’s a powerful mechanism to secure the communication channel between the kata runtime (untrusted) and the kata-agent (trusted). If you have suggestions on usability, new features, etc., please do not hesitate to reach out.

Written by Pradipta Banerjee

Writes about technology | Product Development | Startup advisor & mentor. www.linkedin.com/in/bpradipt
