Guardrails at the gateway: Securing AI inference on GKE with Model Armor | Endigest
Google Cloud | Security
Tags: AI & Machine Learning, Containers & Kubernetes, Security & Identity
This article explains how to secure AI inference workloads on GKE using Model Armor as a network-level guardrail against AI-specific attack vectors.
- Relying on an LLM's internal safety alone is insufficient: refusals are opaque, non-customizable, and appear as HTTP 200 responses in security logs
- Model Armor integrates via GKE Service Extensions, inspecting traffic before and after inference without application code changes
- Input scrutiny blocks prompt injection, jailbreak attempts, and malicious URLs before requests reach GPU/TPU nodes
- Output moderation filters hate speech and dangerous content, and scans for PII leakage via Google Cloud DLP
- Blocked requests return HTTP 400 with structured logs in Security Command Center, enabling full audit visibility into attack attempts
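The flow in the bullets above can be sketched as a minimal gateway-side guardrail: screen the prompt before it reaches the model, screen the model's output before it reaches the client, and return HTTP 400 with a structured audit record on a block. This is an illustrative simulation, not Model Armor's actual API; the verdict categories, function names, and the `flagged` set (which stands in for the classifier's analysis) are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative category names mirroring the filters described above
# (prompt injection / jailbreak / malicious URLs on input; harmful
# content and PII leakage on output). Not Model Armor's real taxonomy.
BLOCKED_INPUT_CATEGORIES = {"prompt_injection", "jailbreak", "malicious_url"}
BLOCKED_OUTPUT_CATEGORIES = {"hate_speech", "dangerous_content", "pii_leak"}

@dataclass
class GuardrailResponse:
    status: int      # HTTP status returned to the client
    body: str        # error detail or the model's output
    audit_log: dict  # structured entry for the security logging backend

def screen_input(prompt: str, flagged: set) -> Optional[GuardrailResponse]:
    """Runs before inference: a block here means the request never
    consumes GPU/TPU capacity on the inference nodes."""
    hits = flagged & BLOCKED_INPUT_CATEGORIES
    if hits:
        return GuardrailResponse(
            status=400,
            body="Request blocked by inference guardrail.",
            audit_log={"stage": "input", "categories": sorted(hits)},
        )
    return None  # clean: forward the prompt to the model backend

def screen_output(model_output: str, flagged: set) -> GuardrailResponse:
    """Runs after inference: filters harmful content and PII before
    the response is returned to the caller."""
    hits = flagged & BLOCKED_OUTPUT_CATEGORIES
    if hits:
        return GuardrailResponse(
            status=400,
            body="Response blocked by inference guardrail.",
            audit_log={"stage": "output", "categories": sorted(hits)},
        )
    return GuardrailResponse(
        status=200,
        body=model_output,
        audit_log={"stage": "output", "categories": []},
    )
```

The point of the explicit 400-plus-audit-log pair is the one the summary makes: unlike an in-model refusal (which looks like a successful 200 to every downstream system), a gateway block is distinguishable in security tooling and carries the categories that triggered it.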
This summary was automatically generated by AI based on the original article and may not be fully accurate.