This post describes a 24-hour speedrun for training a text-to-image diffusion model using 32 H200 GPUs and a ~$1500 compute budget.
This post explains MoE architecture and how transformers v5 added first-class MoE support.
Hugging Face announces that GGML, the team behind llama.cpp, is joining Hugging Face to support the long-term growth of local AI inference.
This post explains how to fine-tune small LLMs for free using Unsloth and Hugging Face Jobs, with support for coding agents like Claude Code and Codex.
IBM Research and UC Berkeley applied the MAST (Multi-Agent System Failure Taxonomy) framework to diagnose why LLM agents fail in enterprise IT automation, analyzing 310 ITBench SRE traces across three models.
Gradio 6's gr.HTML now supports custom templates, scoped CSS, and JavaScript interactivity, enabling full-stack web app development in a single Python file.
This post introduces an agent skill that enables coding agents (Claude Code and Codex) to write production-ready CUDA kernels for Hugging Face's diffusers and transformers libraries.
This post introduces OpenEnv, an open-source framework for evaluating AI agents in real-world environments, using a calendar management benchmark called the Calendar Gym.
Transformers.js v4 preview is now available on NPM, bringing a new WebGPU runtime, a build-system overhaul, and expanded model support.