Latest Engineering Articles
Explore real-world engineering experiences from top tech companies.
We are open-sourcing the initial version of RCCLX – an enhanced version of RCCL that we developed and tested on Meta's internal workloads. RCCLX is fully integrated with Torchcomms and aims to empower researchers and developers to accelerate innovation, regardless of their chosen backend. Communication patterns for AI models are constantly evolving, as are hardware [...] The post "RCCLX: Innovating GPU communications on AMD platforms" appeared first on Engineering at Meta. Read more
One engineer used AI to rebuild Next.js on Vite in a week. vinext builds up to 4x faster, produces 57% smaller bundles, and deploys to Cloudflare Workers with a single command.
AWS Elemental Inference is a fully managed AI service that automatically transforms live and on-demand video broadcasts into vertical formats optimized for mobile and social platforms in real time, enabling broadcasters to reach audiences on TikTok, Instagram Reels, and YouTube Shorts without manual editing or AI expertise.
Most agents today run generated code with full access to your secrets. As more agents adopt coding agent patterns, where they read filesystems, run shell commands, and generate code, they're becoming multi-component systems that each need a different level of trust. While most teams run all of these components in a single security context, because that's how the default tooling works, we recommend thinking about these security boundaries differently. Below we walk through the actors in agentic systems, where security boundaries should go between them, and an architecture for running the agent and generated code in separate contexts.

More agents are adopting the coding agent architecture. These agents read and write to a filesystem. They run bash, Python, or similar programs to explore their environment. And increasingly, agents generate code to solve particular problems. Even agents that aren't marketed as "coding agents" use code generation as their most flexible tool. A customer support agent that generates and runs SQL to look up account data is using the same pattern, just pointed at a database instead of a filesystem. An agent that can write and execute a script can solve a broader class of problems than one limited to a fixed set of tool calls.

Consider an agent debugging a production issue. The agent reads a log file containing a crafted prompt injection. The injection tells the agent to write a script that sends the contents of ~/.ssh and ~/.aws/credentials to an external server. The agent generates the script, executes it, and the credentials are gone.

This is the core risk of the coding agent pattern. Prompt injection gives attackers influence over the agent, and code execution turns that influence into arbitrary actions on your infrastructure. The agent can be tricked into exfiltrating data from the agent's own context, generating malicious software, or both. That malicious software can steal credentials, delete data, or compromise any service reachable from the machine the agent runs on. The attack works because the agent, the code the agent generates, and the infrastructure all share the same level of access.
To draw boundaries in the right places, you need to understand what these components are and what level of trust each one deserves. An agentic system has four distinct actors, each with a different trust level.

The agent is the LLM-driven runtime defined by its context, tools, and model. The agent runs inside an agent harness, which is the orchestration software, tools, and connections to external services that you build and deploy through a standard SDLC. You can trust the harness the same way you'd trust any backend service, but the agent itself is subject to prompt injection and unpredictable behavior. Information should be revealed on a need-to-know basis, i.e. an agent doesn't need to see database credentials to use a tool that executes SQL.

Agent secrets are the credentials the system needs to function, including API tokens, database credentials, and SSH keys. The harness manages these responsibly, but they become dangerous when other components can access them directly. The entire architecture discussion below comes down to which components have a path to these secrets.

The programs the agent creates and executes are the wildcard. Generated code can do anything the language runtime allows, which makes it the hardest actor to reason about. These programs may need credentials to talk to outside services, but giving generated code direct access to secrets means any prompt injection or model error can lead to credential theft.

The filesystem and broader environment are whatever the system runs on, whether a laptop, a VM, or a Kubernetes cluster. The environment can trust the harness, but it cannot trust the agent to have full access or run arbitrary programs without a security boundary.

These four actors exist in every agentic system. The question is whether you draw security boundaries between them or let them all run in the same trust domain.
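The four trust levels described above can be sketched as a small policy table. This is an illustrative model, not any real framework's API; the type and function names are my own:

```typescript
// Sketch of the four actors and their trust levels. All names are illustrative.
type Actor = "harness" | "agent" | "generatedCode" | "environment";

interface TrustPolicy {
  canReadSecrets: boolean;     // may hold the agent secrets directly
  canExecuteCode: boolean;     // may run arbitrary programs
  subjectToInjection: boolean; // can be steered by untrusted input
}

const policy: Record<Actor, TrustPolicy> = {
  // The harness is deployed through a normal SDLC: trusted like any backend service.
  harness: { canReadSecrets: true, canExecuteCode: true, subjectToInjection: false },
  // The agent is subject to prompt injection, so it never sees raw secrets.
  agent: { canReadSecrets: false, canExecuteCode: false, subjectToInjection: true },
  // Generated code is the wildcard: it runs, but must not reach secrets.
  generatedCode: { canReadSecrets: false, canExecuteCode: true, subjectToInjection: true },
  // The environment hosts everything but should not trust the agent or its code.
  environment: { canReadSecrets: false, canExecuteCode: false, subjectToInjection: false },
};

// Need-to-know check: only actors that are both trusted to read secrets and
// immune to untrusted steering should ever hold them.
function mayHoldSecrets(actor: Actor): boolean {
  const p = policy[actor];
  return p.canReadSecrets && !p.subjectToInjection;
}
```

Under this model, only the harness passes the `mayHoldSecrets` check; the agent and its generated code never do, which is the "need-to-know" principle stated above.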
A few design principles follow from these trust levels. The harness should never expose its own credentials to the agent directly. The agent should access capabilities through scoped tool invocations, and those tools should be as narrow as possible: an agent performing support duties for a specific customer should receive a tool scoped to that customer's data, not a tool that accepts a customer ID parameter, since that parameter is subject to prompt injection. Generated programs that need their own credentials are a separate concern, which the architectures below address.

With these actors and principles in mind, here are the architectures we see in practice, ordered from least to most secure.

Coding agents like Claude Code and Cursor ship with sandboxes, but these are often off by default. In practice, many developers run agents with no security boundaries. In this architecture, there are no boundaries between any of the four actors. The agent, the agent's secrets, the filesystem, and generated code execution all share a single security context. On a developer's laptop, that means the agent can read .env files and SSH keys. On a server, it means access to environment variables, database credentials, and API tokens. Generated code can steal any of these, delete data, and reach any service the environment can reach. The harness may prompt the user for confirmation before certain actions, but there is no enforced boundary once a tool runs.

A secret injection proxy sits outside the main security boundary and intercepts outbound network traffic, injecting credentials only as requests travel to their intended endpoint. The harness configures the proxy with the credentials and the domain rules, but the generated code never sees the raw secret values. The proxy prevents exfiltration: secrets can't be copied out of the execution context and reused elsewhere. But the proxy doesn't prevent misuse during active runtime. Generated software can still make unexpected API calls using the injected credentials while the system is running. Secret injection is a backward-compatible path from a zero-boundaries architecture. You can add the proxy without restructuring how components run. The tradeoff is that the agent and generated code still share the same security context for everything except the secrets themselves.

A natural instinct is to wrap the agent harness and the generated code in a shared VM or sandbox. A shared sandbox isolates both from the broader environment, and that's genuinely useful. Generated programs can't infiltrate the wider infrastructure. But in a shared sandbox, the agent and generated program still share the same security context. The generated code can still steal the harness's credentials or, if a secret injection proxy is in place, misuse credentials through the proxy. The sandbox protects the environment from the agent, but doesn't protect the agent from the agent's own generated code.

The missing piece is running the agent harness and the programs the agent generates on independent compute, in separate VMs or sandboxes with distinct security contexts. The harness and the harness's secrets live in one context. The filesystem and generated code execution live in another, with no access to the agent's secrets. Both Claude Code and Cursor offer sandboxed execution modes today, but adoption in desktop environments has been low because sandboxing can cause compatibility issues. In the cloud, this separation is more practical. You can give the generated code a VM tailored for the type of software the agent needs to run, which can actually improve compatibility. In practice, this separation is a straightforward change. Agents perform tool invocations through an abstraction layer, and that abstraction makes it natural to route code execution to a separate environment without rewriting the agent itself.

These two workloads have very different compute profiles, which means separating them lets you optimize each one independently. The agent harness spends most of its time waiting on LLM API responses. On Vercel, Fluid compute is a natural fit for this workload because billing pauses during I/O and only counts active CPU time, which keeps costs proportional to actual work rather than billing idle time. Generated code has the opposite profile. Agent-created programs are short-lived, unpredictable, and untrusted. Each execution needs a clean, isolated environment so that one program can't access secrets or state left behind by another. Sandbox products like Vercel Sandbox provide this through ephemeral Linux VMs that spin up per execution and are destroyed afterward. The VM boundary is what enforces the security context separation: generated code inside the sandbox has no network path to the harness's secrets and no access to the host environment.

The sandbox works in both directions. It shields the agent's secrets from generated code, and shields the broader environment from whatever the generated code does. The strongest architecture combines the application sandbox with secret injection. The combination gives you two properties that neither achieves alone: full isolation between the agent harness and generated programs, each running in its own security context, and no direct access to credentials for the generated code, which can use secrets through the injection proxy while running but can't read or exfiltrate them. Injected headers overwrite any headers the sandbox code sets with the same name, preventing credential substitution attacks.

For production agentic systems, we recommend combining both. The agent harness runs as trusted software on standard compute. Generated code runs in an isolated sandbox. Secrets are injected at the network level, never exposed where generated code could access them directly. This separation of agent compute from sandbox compute will become the standard architecture for agentic systems. Most teams haven't made this shift yet because the default tooling doesn't enforce it. The teams that draw these boundaries now will have a meaningful security advantage as agents take on more sensitive workloads. Safe secret injection is now available on Vercel Sandbox. Read more
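The claim that routing execution through an abstraction layer makes this separation straightforward can be sketched in code. This is a toy model, not Vercel's API: the executor interface, class names, and the in-memory "sandbox" stand-in are all my own, and a real `SandboxExecutor` would ship the code to an ephemeral VM rather than emulate isolation locally:

```typescript
// The harness routes generated code through one interface, so swapping
// in-process execution for a remote sandbox is a single change.
interface CodeExecutor {
  run(code: string): Promise<string>;
}

// The harness's own context holds the secrets (illustrative value).
const HARNESS_SECRETS = { DB_PASSWORD: "s3cret" };

// Zero-boundaries architecture: generated code runs in the harness process,
// where it can lexically reach HARNESS_SECRETS. Unsafe.
class InProcessExecutor implements CodeExecutor {
  async run(code: string): Promise<string> {
    // Direct eval sees the enclosing scope, secrets included.
    return String(eval(code));
  }
}

// Separated architecture: a stand-in for an ephemeral sandbox VM.
// Only the code crosses the boundary; secrets stay behind in the harness.
class SandboxExecutor implements CodeExecutor {
  async run(code: string): Promise<string> {
    // A real implementation would POST `code` to an isolated VM. Here we
    // emulate the boundary: the evaluated code gets no path to the secrets.
    const sandboxedEval = new Function("HARNESS_SECRETS", `return String(${code});`);
    return sandboxedEval(undefined);
  }
}
```

Running `"typeof HARNESS_SECRETS"` through each executor shows the difference: the in-process executor reveals the secrets object exists, while the sandboxed one sees nothing.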
GPT 5.3 Codex is now available on AI Gateway. GPT 5.3 Codex brings together the coding strengths of GPT-5.2-Codex and the reasoning depth of GPT-5.2 in a single model that's 25% faster and more token-efficient. Built for long-running agentic work, the model handles research, tool use, and multi-step execution across the full software lifecycle, from debugging and deployment to product documents and data analysis. Additionally, you can steer it mid-task without losing context. For web development, it better understands underspecified prompts and defaults to more functional, production-ready output. To use this model, set model to openai/gpt-5.3-codex in the AI SDK. AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in observability, Bring Your Own Key support, and intelligent provider routing with automatic retries. Learn more about AI Gateway, view the AI Gateway model leaderboard, or try it in our model playground. Read more
The bundle size limit for Vercel Functions using the Python runtime is now 500MB, increasing the maximum uncompressed deployment bundle size from 250MB. Learn more in the functions limitations documentation, or deploy FastAPI or Flask on Vercel to get started. Read more
The Slack Agent Skill is now available, enabling developers to build and deploy Slack agents in a single session with their coding agent of choice. The skill handles the complexity of OAuth configuration, webhook handlers, event subscriptions, and deployment so you can focus on what your agent should do rather than on infrastructure setup. The wizard walks through five stages:

1. Project setup: Choose your LLM provider and initialize from the Slack Agent Template
2. Slack app creation: Generate a customized app manifest and create the app in Slack's console
3. Environment configuration: Set up signing secrets, bot tokens, and API keys with validation
4. Local testing: Run locally with ngrok and verify the integration
5. Production deployment: Deploy to Vercel with environment variables configured automatically

Install the skill and run the wizard by invoking it in your coding agent (for example, /slack-agent new in Claude Code). Try the skill to make your custom agent or use the Slack Agent Template to deploy right away and customize later. Read more
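The "environment configuration" stage above sets up a Slack signing secret. What that secret is for: Slack signs every webhook request with an HMAC so your handler can reject forgeries. The sketch below follows Slack's documented signing scheme (`v0:{timestamp}:{body}` hashed with HMAC-SHA256); the function names are my own, not from the skill:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the signature Slack sends in the x-slack-signature header.
function slackSignature(signingSecret: string, timestamp: string, body: string): string {
  const baseString = `v0:${timestamp}:${body}`;
  return "v0=" + createHmac("sha256", signingSecret).update(baseString).digest("hex");
}

// Verify an incoming request against the signing secret.
function verifySlackRequest(
  signingSecret: string,
  timestamp: string,
  body: string,
  receivedSignature: string,
): boolean {
  const expected = Buffer.from(slackSignature(signingSecret, timestamp, body));
  const received = Buffer.from(receivedSignature);
  // timingSafeEqual throws on length mismatch, so check length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A production handler would also reject timestamps older than a few minutes to prevent replay attacks; the skill's generated webhook code is what normally takes care of all of this.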
Last week, my team met many developers at Developer Week in San Jose. My colleague Vinicius Senger delivered a great keynote about renascent software, a new way of building and evolving applications where humans and AI collaborate as co-developers using Kiro. Other colleagues spoke about building and deploying production-ready AI agents. Everyone stayed to ask and […]
Support for the legacy now.json config file will be officially removed on March 31st, 2026. Migrate existing now.json files by renaming them to vercel.json; no other content changes are required. For more advanced use cases, try vercel.ts for programmatic project configuration. Learn more about configuring projects with vercel.json in the documentation. Read more
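Since the migration is purely a rename, it can be scripted. A minimal sketch (the function name is my own; it assumes the project directory contains at most one of the two files):

```typescript
import { existsSync, renameSync } from "node:fs";
import { join } from "node:path";

// Rename now.json to vercel.json in a project directory.
// Returns true if a migration happened, false if there was nothing to do
// or a vercel.json already exists (in which case we don't overwrite it).
function migrateNowJson(projectDir: string): boolean {
  const oldPath = join(projectDir, "now.json");
  const newPath = join(projectDir, "vercel.json");
  if (!existsSync(oldPath) || existsSync(newPath)) return false;
  renameSync(oldPath, newPath); // contents are compatible, no edits needed
  return true;
}
```

The no-overwrite check matters: if both files exist, the right fix is manual reconciliation, not a silent replace.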
Vercel Sandbox can now automatically inject HTTP headers into outbound requests from sandboxed code. This keeps API keys and tokens safely outside the sandbox VM boundary, so apps running inside the sandbox can call authenticated services without ever accessing the credentials. Header injection is configured as part of the network policy using transform. When the sandbox makes an HTTPS request to a matching domain, the firewall adds or replaces the specified headers before forwarding the request.

This is designed for AI agent workflows where prompt injection is a real threat. Even if an agent is compromised, there's nothing to exfiltrate, as the credentials only exist in a layer outside the VM. Injection rules work with all egress network policy configurations, including open internet access, so you can allow general traffic while injecting credentials for specific services. Like all network policy settings, injection rules can be updated on a running sandbox without restarting it. This enables multi-phase workflows: inject credentials during setup, then remove them before running untrusted code. Available to all Pro and Enterprise customers. Learn more in the documentation.

Key highlights:
- Header overwrite: Injection applies to HTTP headers on outbound requests.
- Full replacement: Injected headers overwrite any existing headers with the same name set by sandbox code, preventing the sandbox from substituting its own credentials.
- Domain matching: Supports exact domains and wildcards (e.g., *.github.com). Injection only triggers when the outbound request matches.
- Works with all policies: Combine injection rules with allow-all or domain-specific allow lists.

Read more
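The matching and replacement semantics described in the highlights can be modeled as a pure function. This is an illustrative model of the documented behavior, not Vercel's actual API or rule format (the `InjectionRule` shape and wildcard handling are assumptions; real header matching would also need to be case-insensitive):

```typescript
// Illustrative model: a rule binds a domain pattern to headers to inject.
interface InjectionRule {
  domain: string; // exact ("api.github.com") or wildcard ("*.github.com")
  headers: Record<string, string>;
}

// Wildcard patterns match subdomains in this sketch; exact patterns match as-is.
function domainMatches(pattern: string, host: string): boolean {
  if (pattern.startsWith("*.")) {
    return host.endsWith(pattern.slice(1)); // "*.github.com" -> ".github.com" suffix
  }
  return host === pattern;
}

// Apply injection on the way out: matching rules overwrite same-named headers,
// so sandbox code can never substitute its own credentials.
function applyInjection(
  host: string,
  requestHeaders: Record<string, string>,
  rules: InjectionRule[],
): Record<string, string> {
  const out = { ...requestHeaders };
  for (const rule of rules) {
    if (!domainMatches(rule.domain, host)) continue;
    for (const [name, value] of Object.entries(rule.headers)) {
      out[name] = value; // full replacement, never a merge
    }
  }
  return out;
}
```

The key property is visible in the last loop: even if the sandboxed code sets its own `authorization` header, the injected value wins for matching domains, while requests to non-matching hosts pass through untouched.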
