Arm SME2 and Google AI Edge enable efficient on-device AI inference by combining hardware acceleration with optimized software tools.
Genkit introduces a middleware system for intercepting and customizing AI generation calls in agentic applications.
This tutorial demonstrates how to build long-running AI agents using Google's Agent Development Kit that maintain state across weeks without losing context.
This article presents DFlash, a diffusion-style speculative decoding method that achieves 3x speedups for LLM inference on Google TPUs by generating entire token blocks in a single forward pass instead of sequentially.
Gemini Embedding 2 is a multimodal embedding model that maps text, images, video, audio, and documents into a single semantic space.
Google Cloud announces Rapid Bucket, integrating Colossus storage with PyTorch for accelerated AI/ML training.
LiteRT is a cross-platform on-device AI framework that leverages Neural Processing Units (NPUs) to enable fast, efficient AI features on mobile, desktop, and IoT devices.
Google Cloud introduces Agents CLI in Agent Platform, a unified tool designed to streamline AI agent development from local environment to production deployment.
This article covers five key lessons for building production-ready AI agents, using a refactored sales research agent as an example.
A2UI v0.9 is a framework-agnostic standard for generative UI that enables AI agents to generate UI components in real-time using a client's existing design system and component catalog.
MaxText introduces Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) capabilities now available on single-host TPU configurations for post-training large language models.
Google Pay API introduces new enhancements for merchant initiated transactions, supporting recurring subscriptions, deferred payments, and automatic reloads with improved flexibility and control.