Receive daily AI-curated summaries of engineering articles from top tech companies worldwide.
Endigest AI Core Summary
H Company releases Holotron-12B, a multimodal computer-use agent model post-trained from NVIDIA's Nemotron-Nano-2 VL, optimized for high-throughput production inference.
• Built on a hybrid State-Space Model (SSM) and attention architecture, avoiding the quadratic cost of full attention and reducing the memory footprint to a constant state per layer
• Achieved over 2x higher throughput than Holo2-8B on a single H100 GPU using vLLM v0.14.1, reaching 8.9k tokens/s at a concurrency of 100
• Post-trained in two stages, beginning with supervised fine-tuning on proprietary localization and navigation data (~14B tokens) focused on screen understanding, grounding, and UI interactions
• WebVoyager benchmark score improved from 35.1% (base Nemotron) to 80.5%, surpassing Holo2-8B
• Also shows improvements on localization benchmarks including OS-World-G, GroundUI, and WebClick
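For context, the aggregate throughput figure can be translated into a rough per-request rate. This is a back-of-the-envelope sketch; only the 8.9k tokens/s and concurrency-100 numbers come from the article, and real per-request rates vary with batching and sequence lengths:

```python
# Rough per-stream decode rate implied by the reported numbers:
# 8.9k aggregate tokens/s shared across 100 concurrent requests.
aggregate_tokens_per_s = 8_900
concurrency = 100

per_stream = aggregate_tokens_per_s / concurrency
print(per_stream)  # approximate tokens/s seen by each concurrent request
```

At these numbers each request still decodes at roughly 89 tokens/s, which is why constant-memory SSM layers matter: the KV-cache growth of full attention is what normally limits concurrency on a single GPU.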
This summary was automatically generated by AI based on the original article and may not be fully accurate.