Grab developed Palana, a Kubernetes-native platform for securely running autonomous AI agent workloads. The platform addresses the security risks inherent in non-deterministic, model-driven applications by providing isolated runtime environments, robust secrets management, and centralized egress control. It uses Kubernetes custom resources to manage agent lifecycles in a scalable and auditable way.
Read original on InfoQ Architecture
The rise of autonomous AI agents, capable of executing arbitrary tools and making API calls, introduces significant security challenges such as prompt injection and logic hijacking. Grab's platform engineering and cybersecurity teams tackled these by building Palana, a proprietary, Kubernetes-native secure execution platform. Palana's core purpose is to provide a secure, isolated runtime environment for these non-deterministic, model-driven applications.
Palana implements a zero-trust model, with isolation as its fundamental security principle. Each AI agent is assigned its own dedicated Kubernetes namespace. This namespace is configured with restrictive Role-Based Access Control (RBAC), custom network policies, and isolated service accounts, so that a compromised agent cannot move laterally or affect other workloads. Agents also receive persistent, localized storage so that state survives container restarts during long-running workflows.
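The per-agent isolation described above can be illustrated with a small sketch. This is not Palana's actual configuration, which the article does not show. It only builds three standard Kubernetes manifests as Python dictionaries. The naming scheme (agent-<id>), the label key, and the service account name are assumptions made for illustration.

```python
import json


def agent_isolation_manifests(agent_id: str) -> list[dict]:
    """Build per-agent isolation manifests (names and labels are hypothetical)."""
    namespace = f"agent-{agent_id}"
    return [
        # One dedicated namespace per agent.
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {
                "name": namespace,
                "labels": {"example.com/agent-id": agent_id},
            },
        },
        # A service account used only by this agent, with no auto-mounted API token.
        {
            "apiVersion": "v1",
            "kind": "ServiceAccount",
            "metadata": {"name": "agent-runtime", "namespace": namespace},
            "automountServiceAccountToken": False,
        },
        # Default-deny policy: selects every pod and lists no allow rules.
        {
            "apiVersion": "networking.k8s.io/v1",
            "kind": "NetworkPolicy",
            "metadata": {"name": "default-deny", "namespace": namespace},
            "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
        },
    ]


if __name__ == "__main__":
    print(json.dumps(agent_isolation_manifests("demo"), indent=2))
```

A real deployment would add a narrow egress rule that permits traffic to the egress proxy only, plus RBAC Role and RoleBinding objects scoped to the namespace.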
Traditional secrets management methods (e.g., environment variables) are too risky for autonomous agents. Palana decouples secrets into agent-readable credentials and proxy-only secrets. Sensitive credentials reside in HashiCorp Vault. Agents only receive abstract placeholder tokens. Outbound API calls are intercepted by a secure, intermediate proxy that validates the destination and dynamically replaces placeholders with real secrets, ensuring raw secrets never touch the agent container's environment or memory.
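The placeholder substitution step can be sketched as follows. The article does not describe Palana's token format or proxy code, so everything here is an assumption: the {{secret:NAME}} placeholder syntax, the in-memory SECRETS table standing in for a Vault lookup, and the host names.

```python
import re
from urllib.parse import urlsplit

# Hypothetical placeholder syntax; the real token format is not given in the source.
PLACEHOLDER = re.compile(r"\{\{secret:([A-Za-z0-9_]+)\}\}")

# Stand-in for a Vault lookup. Each secret is bound to the hosts it may be sent to.
SECRETS = {
    "PAYMENTS_API_KEY": {
        "value": "real-key-123",
        "hosts": {"api.payments.example"},
    },
}


def substitute(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return headers with placeholders replaced, or raise if the destination is not allowed."""
    host = urlsplit(url).hostname or ""

    def resolve(match: re.Match) -> str:
        name = match.group(1)
        entry = SECRETS.get(name)
        # Validate the destination before any real secret leaves the proxy.
        if entry is None or host not in entry["hosts"]:
            raise PermissionError(f"secret {name!r} is not allowed for host {host!r}")
        return entry["value"]

    return {key: PLACEHOLDER.sub(resolve, value) for key, value in headers.items()}
```

Because the lookup happens inside the proxy process, the agent container only ever holds the placeholder string. A request that carries the same placeholder to an unlisted host raises PermissionError instead of leaking the secret.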
Egress Control Mechanism
All outbound HTTP/HTTPS traffic is routed through an Envoy proxy and an external authorization service running Open Policy Agent (OPA) rules. This setup enables real-time traffic decryption (by terminating TLS with a man-in-the-middle CA), header evaluation, endpoint validation, and token substitution, all while generating detailed audit trails. It gives the platform a single, centralized control point for monitoring and securing agent communications.
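The kind of decision the external authorization service makes can be sketched in Python. Real OPA policies are written in Rego, and the article does not show Grab's rules, so this is a plain Python stand-in. The allowlist shape, the agent name, and the audit record fields are all hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Request:
    agent: str
    method: str
    host: str
    path: str


# Hypothetical per-agent allowlist: (method, host, path prefix).
ALLOW = {
    "agent-demo": [
        ("GET", "api.example.com", "/v1/"),
        ("POST", "api.example.com", "/v1/tickets"),
    ],
}


def authorize(req: Request) -> dict:
    """Evaluate one outbound request and return an audit record with the decision."""
    allowed = any(
        req.method == method and req.host == host and req.path.startswith(prefix)
        for method, host, prefix in ALLOW.get(req.agent, [])
    )
    # Every request yields a record, whether allowed or denied.
    return {**asdict(req), "decision": "allow" if allowed else "deny"}
```

Unknown agents have no allowlist entries, so they are denied by default. Emitting a record for denied requests as well as allowed ones is what makes the audit trail useful for investigating a hijacked agent.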
Palana keeps critical operational controls, such as network-level kill switches and idle shutdowns, entirely outside the agent's execution runtime. A compromised agent therefore cannot block its own termination, which reinforces the platform's security posture.
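As a rough illustration of control logic that lives outside the agent, the sketch below shows a decision function that an external controller might run. It is an assumption-laden example, not Palana's implementation: the AgentStatus fields, the idea that activity timestamps are observed by the controller rather than reported by the agent, and the reason strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentStatus:
    namespace: str
    last_activity: float  # seconds since epoch, observed outside the agent


def agents_to_stop(
    statuses: list[AgentStatus],
    now: float,
    idle_limit_s: float,
    kill_switch: set[str],
) -> list[tuple[str, str]]:
    """Return (namespace, reason) pairs for agents the controller should terminate."""
    stop = []
    for status in statuses:
        if status.namespace in kill_switch:
            # Operator-triggered kill switch takes priority over idle checks.
            stop.append((status.namespace, "kill-switch"))
        elif now - status.last_activity > idle_limit_s:
            stop.append((status.namespace, "idle"))
    return stop
```

The function takes no input from the agent itself, so nothing the agent does inside its container can change the outcome.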