Built-in Helm Chart¶
The k8s_sandbox package includes a built-in Helm chart named agent-env in
resources/helm/.
About the chart¶
The built-in Helm chart takes a values.yaml (or helm-values.yaml) file which is structured much like a Docker compose.yaml file.
This hides much of the complexity of raw Kubernetes manifests from users who are not familiar with them.
Tip
Automatic translation from basic compose.yaml files is
supported.
In addition to the info below, see agent-env/README.md and agent-env/values.yaml for
a full list of configurable options.
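For example, a minimal values.yaml defining a single service looks like this (the same compose-like structure used in the examples throughout this page):

services:
  default:
    image: ubuntu:24.04
    command: ["tail", "-f", "/dev/null"]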
Container runtime class (gVisor)¶
The default container runtime class name for every service is gvisor which, depending
on your cluster (see the remote cluster setup page),
should map to the runsc runtime handler. You can override this per service if required.
services:
default:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
runtimeClassName: runc
The above example assumes you have a RuntimeClass named runc deployed to your
cluster which maps to the runc runtime handler.
See the gVisor page for considerations and
limitations on using gVisor versus runc.
Use the cluster's default runtime class
You can cause the runtime class in the Pod spec to not be set at all by setting
runtimeClassName to the CLUSTER_DEFAULT magic string in the relevant services.
services:
default:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
runtimeClassName: CLUSTER_DEFAULT
This has the effect of using your cluster's default runtime class.
This approach is not recommended as it makes your evals cluster-dependent and
therefore less portable. It is preferable to explicitly state which runtime class
you require if it is not gVisor. See the remote cluster
setup page for more information on installing
runtimes and deploying RuntimeClass objects which map a name to a runtime handler.
View your cluster's runtime classes
You can view the runtime classes available in your cluster by running the following command:
kubectl get runtimeclass
NAME     HANDLER   AGE
gvisor   runsc     42d
runc     runc      42d
Aren't containerd or CRI-O the container runtimes?
There are multiple "levels" of container runtime in Kubernetes. containerd or CRI-O
are the "high level" CRI implementations which Kubernetes uses to manage containers.
The discussion in this section, concerning the runtimeClassName field on the Pod spec, is
about the "lower level" OCI runtimes (like runc or runsc) which actually run the
container processes.
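For reference, a RuntimeClass is a small cluster-scoped object which maps a name (the value you put in runtimeClassName) to an OCI runtime handler. A minimal sketch of one mapping gvisor to runsc might look like the following; the exact manifests to deploy are covered on the remote cluster setup page.

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc  # the OCI runtime configured in containerd/CRI-O on your nodes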
Internet access¶
By default, containers cannot access the internet, which is an important security measure when running untrusted LLM-generated code. If you wish to allow limited internet access, there are three methods, each of which influences the generated Cilium Network Policy.
- Populate the allowDomains list in your values.yaml with one or more Fully Qualified Domain Names. The following example list allows agents to install packages from a variety of sources:

    services:
      default:
        image: ubuntu:24.04
        command: ["tail", "-f", "/dev/null"]
        allowDomains:
          - "pypi.org"
          - "files.pythonhosted.org"
          - "bitbucket.org"
          - "github.com"
          - "raw.githubusercontent.com"
          - "*.debian.org"
          - "*.kali.org"
          - "kali.download"
          - "archive.ubuntu.com"
          - "security.ubuntu.com"
          - "mirror.vinehost.net"
          - "*.rubygems.org"

  Note

  An entry of e.g. aisi.org won't allow access to the subdomain www.aisi.org. Either also include www.aisi.org, or if you want to provide access to all subdomains, use a wildcard: *.aisi.org.

- Populate the allowCIDR list with one or more CIDR ranges. These are translated to toCIDRs entries in the Cilium Network Policy:

    allowCIDR:
      - "8.8.8.8/32"

- Populate the allowEntities list with one or more entities. These are translated to toEntities entries in the Cilium Network Policy:

    allowEntities:
      - "world"
DNS¶
The built-in Helm chart is designed to allow services to communicate with each other
using their service names, e.g. curl nginx, much like you would in Docker Compose.
To make services discoverable by their service name, set the dnsRecord key to true.
Additionally, you can specify a list of domains that resolve to a given service, e.g.
curl example.com could resolve to your nginx service.
services:
default:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
nginx:
image: nginx:1.27.2
dnsRecord: true
additionalDnsRecords:
- example.com
To achieve this, whilst maintaining support for deploying multiple instances of the same chart in a given namespace, a CoreDNS "sidecar" container is deployed in each service Pod.
Why not deploy CoreDNS as its own Pod?
Because of the way that Cilium caches DNS responses to determine which IP addresses correspond to the FQDN allowlist, the CoreDNS service must be co-located on the same node as the container making the DNS request.
Why not use hostAliases to edit /etc/hosts?
Instead of using DNS, the /etc/hosts file could be modified using
HostAliases.
However, some tools which an agent might try to use (e.g. nslookup) do not respect
/etc/hosts and will use DNS instead. Therefore, we chose to use a DNS-based
approach.
So that the containers within your release use this sidecar rather than the default
Kubernetes DNS service, the /etc/resolv.conf of each container is modified to use
127.0.0.1 as the nameserver.
Note that the CoreDNS sidecar only binds to 127.0.0.1, so it won't be accessible from
outside the Pod.
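To confirm this from inside a running sandbox container, you can inspect the file directly (the Pod and container names below are illustrative):

kubectl exec <pod-name> -c <container-name> -- cat /etc/resolv.conf
# expect 127.0.0.1 to be listed as the nameserver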
CoreDNS is used over Dnsmasq for two reasons:
- CoreDNS is the default DNS server in Kubernetes.
- It allows you to map domains to known service names whilst delegating resolution of the actual IP addresses to the default Kubernetes DNS service.
Headless services¶
Any entry under the services key in values.yaml which sets dnsRecord: true or
additionalDnsRecords will have a
headless
Service created for it.
This creates a DNS record in the Kubernetes cluster which resolves to the Pod's IP
addresses directly. This allows agents to use tools like netcat or ping to explore
their environment. If the service were not headless (i.e. it had a ClusterIP), tools
like netcat would only be able to connect to the ports explicitly exposed by the
service.
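For illustration, the generated headless Service is roughly equivalent to the following sketch (the name and selector labels are simplified; the chart derives the real ones from the release and service name):

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  clusterIP: None  # headless: DNS resolves directly to the Pod IP
  selector:
    inspect/service: nginx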
Readiness probes¶
Avoid making assumptions about the order in which your containers will start, or the time it will take for them to become ready.
Kubernetes knows when a container has started, but it does not know when the container is ready to accept traffic. In the case of containers such as web servers, there may be a delay between the container starting and the web server being ready to accept traffic.
Use readiness probes as a test to determine if your container is ready to accept traffic. The eval will not begin until all readiness probes have passed.
services:
nginx:
image: nginx:1.27.2
readinessProbe:
httpGet:
path: /
port: 80
initialDelaySeconds: 5
periodSeconds: 5
You do not need to specify a readinessProbe on containers which have a trivial
entrypoint (e.g. sleep infinity or tail -f /dev/null).
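For services which do not expose an HTTP endpoint, a tcpSocket or exec probe can serve the same purpose. A sketch, assuming the chart passes the probe spec through to Kubernetes unchanged (as the httpGet example above suggests), using a hypothetical redis service:

services:
  redis:
    image: redis:7
    readinessProbe:
      tcpSocket:
        port: 6379
      initialDelaySeconds: 2
      periodSeconds: 5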
Resource requests and limits¶
Default resource limits are assigned for each service within the Helm chart. The
limits and requests are equal such that the Pods have a Guaranteed QoS
class.
resources:
limits:
memory: "2Gi"
cpu: "500m"
requests:
memory: "2Gi"
cpu: "500m"
These can optionally be overridden for each service.
services:
default:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
resources:
limits:
memory: "1Gi"
cpu: "250m"
requests:
memory: "1Gi"
cpu: "250m"
If you override them, consider the implications for the QoS class of the Pods and for cluster utilization.
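You can check the QoS class assigned to a running Pod with, for example (Pod name illustrative):

kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'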
Volumes¶
The built-in Helm chart aims to provide a simple way to define and mount volumes in your containers, much like you can in Docker Compose. The following example creates an empty volume which is mounted in both containers.
services:
default:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
volumes:
- "my-volume:/my-volume-mount-path"
worker:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
volumes:
- "my-volume:/my-volume-mount-path"
volumes:
my-volume:
Note that this does not allow you to mount directories from the client system (your machine) into the containers.
By default, this requires that your cluster has a StorageClass named nfs-csi which supports the
ReadWriteMany access mode for PersistentVolumeClaims. See Remote
Cluster.
You can override the spec of the PersistentVolumeClaim to suit your needs.
volumes:
my-volume:
spec:
storageClassName: azurefile
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
Networks¶
By default, all Pods in a Kubernetes cluster can communicate with each other. Network policies are used to restrict this communication. The built-in Helm chart restricts Pods to only communicate with other Pods in the same Helm release.
Additional restrictions can be put in place, for example, to simulate networks within your eval.
services:
default:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
networks:
- a
intermediate:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
dnsRecord: true
networks:
- a
- b
target:
image: ubuntu:24.04
command: ["tail", "-f", "/dev/null"]
dnsRecord: true
networks:
- b
networks:
a:
b:
In the example above, the default service can communicate with the intermediate
service, but not the target service. The intermediate service can communicate with
the target service.
When no networks are specified, all Pods within a Helm release can communicate with each other. If any networks are specified, any Pod which does not specify a network will not be able to communicate with any other Pod.
Ports¶
By default, all ports are open between services (provided they are on the same network).
For certain evaluations, you may want to open only specific ports, for example to test
whether a model can hack a particular service to gain access to others. You can do this by
specifying ports in the service definition.
services:
default:
image: alpine:3.20
networks:
- challenge-network
target:
image: python:3.12-slim
# Start a TCP listener on both ports so checks in the test can recognise an available service
command: ["sh","-c","python3 -m http.server \"8080\" --bind 0.0.0.0 & python3 -m http.server \"9090\" --bind 0.0.0.0 & wait"]
ports:
- protocol: TCP
port: 8080
networks:
- challenge-network
networks:
challenge-network:
driver: bridge
From the default service, port 8080 on the target service should be reachable but port 9090 should be inaccessible.
If a service does not specify any ports, ingress is allowed on all of its ports.
Specifying ports in Cilium means that all other L4 protocols are blocked. The built-in Helm chart explicitly allows ICMP echo requests so tools like ping still work. If you want to expose some other protocol, we recommend using additionalResources.
Additional resources¶
You can pass arbitrary Kubernetes resources to the Helm chart using the
additionalResources key in the values.yaml file.
additionalResources:
- apiVersion: v1
kind: Secret
metadata:
name: my-secret
type: Opaque
data:
password: my-password
You can also use Helm templating in additionalResources.
additionalResources:
- apiVersion: v1
kind: Secret
metadata:
name: '{{ template "agentEnv.fullname" $ }}-secret'
type: Opaque
data:
password: my-password
For more complex resources which are not valid standalone YAML, you can use a string block. The following example creates a Cilium Network Policy which allows ingress from all entities to the default service on port 2222.
additionalResources:
- |
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: {{ template "agentEnv.fullname" $ }}-sandbox-default-external-ingress
annotations:
{{- toYaml $.Values.annotations | nindent 6 }}
spec:
description: |
Allow external ingress from all entities to the default service on port 2222.
endpointSelector:
matchLabels:
io.kubernetes.pod.namespace: {{ $.Release.Namespace }}
{{- include "agentEnv.selectorLabels" $ | nindent 6 }}
inspect/service: default
ingress:
- fromEntities:
- all
toPorts:
- ports:
- port: "2222"
protocol: TCP
Annotations and labels¶
You can pass arbitrary annotations and labels to the Helm chart using the top-level
annotations and labels keys in the values.yaml file. These will be added as
annotations and labels to the Pods, PVCs and network policies.
annotations:
my-annotation: my-value
labels:
my-label: my-value
The k8s_sandbox package automatically includes the Inspect task name as an annotation.
This may be useful for determining which task a Pod belongs to.
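To see the annotations on a running Pod (Pod name illustrative):

kubectl get pod <pod-name> -o jsonpath='{.metadata.annotations}'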
Render chart without installing¶
helm template ./resources/helm/agent-env > scratch/rendered.yaml
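To render the chart against your own values file, pass it with -f:

helm template ./resources/helm/agent-env -f path/to/helm-values.yaml > scratch/rendered.yaml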
Install chart manually¶
Normally, the Helm chart is installed by the k8s_sandbox package. However, you can
install it manually which might be useful for debugging.
helm install my-release ./resources/helm/agent-env -f path/to/helm-values.yaml
my-release is the release name. Consider using your name or initials.
./resources/helm/agent-env is the path to the Helm chart. If you are outside the
k8s_sandbox repository, pass the path to the chart in the k8s_sandbox repository, or
the path to the chart within your virtual environment's k8s_sandbox/ package directory.
helm install my-release ~/repos/k8s_sandbox/resources/helm/agent-env -f ...
helm install my-release \
.venv/lib/python3.12/site-packages/k8s_sandbox/resources/helm/agent-env -f ...
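You can verify that the release deployed successfully by listing releases and Pods in your namespace, e.g.:

helm list
kubectl get pods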
Remember to uninstall the release when you are done.
helm uninstall my-release
Generate docs¶
You can regenerate docs using helm-docs after
changing the default values.yaml.
brew install norwoodj/tap/helm-docs
helm-docs ./resources/helm/agent-env
This is done automatically by a pre-commit hook.