Endigest AI Core Summary
Google DeepMind launched Gemma 4, a family of open models released under the Apache 2.0 license and designed for on-device agentic AI experiences across a wide range of hardware.
• Gemma 4 supports multi-step planning, autonomous action, offline code generation, and audio-visual processing without specialized fine-tuning, and it covers 140+ languages.
• Agent Skills, a new feature in Google AI Edge Gallery (iOS/Android), enables fully on-device, multi-step agentic workflows such as querying Wikipedia, generating visualizations, and integrating with other models.
• LiteRT-LM runs Gemma 4 E2B in under 1.5 GB of memory using 2-bit/4-bit weights, supports constrained decoding and dynamic context up to 128K tokens, and processes 4,000 tokens in under 3 seconds.
• Reported throughput is 133 tokens/s prefill and 7.6 tokens/s decode on a Raspberry Pi 5 CPU, and 3,700 tokens/s prefill and 31 tokens/s decode on a Qualcomm Dragonwing IQ8 NPU.
• A new Python package and CLI tool (litert-lm) is available on Linux, macOS, and Raspberry Pi for
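
To put the throughput figures above in perspective, the following is a minimal back-of-the-envelope sketch in plain Python. It does not use the litert-lm package or any Gemma API. The helper `estimate_latency_s`, the 4,000-token prompt, and the 100-token reply are illustrative assumptions, and the calculation assumes throughput stays constant regardless of context length, which real devices may not achieve.

```python
# Throughput figures quoted in the summary, in tokens per second.
BENCHMARKS = {
    "Raspberry Pi 5 CPU": {"prefill": 133.0, "decode": 7.6},
    "Qualcomm Dragonwing IQ8 NPU": {"prefill": 3700.0, "decode": 31.0},
}


def estimate_latency_s(prefill_tps, decode_tps, prompt_tokens, output_tokens):
    """Hypothetical helper: return (prefill_seconds, decode_seconds).

    Assumes constant throughput, so each phase is simply tokens / rate.
    """
    prefill_s = prompt_tokens / prefill_tps
    decode_s = output_tokens / decode_tps
    return prefill_s, decode_s


if __name__ == "__main__":
    # Assumed workload: a 4,000-token prompt followed by a 100-token reply.
    for device, tps in BENCHMARKS.items():
        prefill_s, decode_s = estimate_latency_s(
            tps["prefill"], tps["decode"], prompt_tokens=4000, output_tokens=100
        )
        print(f"{device}: prefill {prefill_s:.1f} s, decode {decode_s:.1f} s")
```

Under these assumptions, prefilling a 4,000-token prompt would take roughly 30 seconds on the Raspberry Pi 5 CPU and roughly 1.1 seconds on the Dragonwing NPU. The summary does not say which hardware the "4,000 tokens in under 3 seconds" figure was measured on, so this comparison is indicative only.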
This summary was automatically generated by AI based on the original article and may not be fully accurate.