Receive daily AI-curated summaries of engineering articles from top tech companies worldwide.
Endigest AI Core Summary
BMW Group and Google Cloud built an automated workflow for fine-tuning, optimizing, and evaluating small language models (SLMs) for in-vehicle deployment.
• Cloud-based LLMs are impractical for in-vehicle use due to network dependency, making on-device SLMs a better fit for automotive edge deployment
• Compression techniques explored include quantization (32-bit down to 8- or 4-bit), pruning, and knowledge distillation to reduce model size
• Post-compression quality is recovered via LoRA fine-tuning and reinforcement learning methods such as PPO, DPO, and GRPO
• Model quality is evaluated using point-wise metrics (ROUGE, BLEU) and pair-wise methods (LLM-as-a-judge or human feedback)
• The automated pipeline is built on Vertex AI Pipelines, enabling reproducible experimentation across the full configuration space with versioned artifacts
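To make the quantization bullet concrete, here is a minimal sketch of symmetric 8-bit post-training quantization, the kind of size reduction described above. This is an illustrative toy, not BMW's or Google Cloud's implementation; the function names and values are made up.

```python
def quantize_int8(weights):
    """Map float weights to int8 codes plus a per-tensor scale (symmetric)."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid scale 0 for all-zero tensors
    codes = [round(w / scale) for w in weights]        # each code fits in [-127, 127]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]

weights = [0.42, -1.27, 0.03, 0.88]
codes, scale = quantize_int8(weights)
approx = dequantize(codes, scale)
# Each recovered weight is within one quantization step (scale) of the original,
# while storage per weight drops from 32 bits to 8.
```

The same idea extends to 4-bit by clamping codes to [-7, 7]; real toolchains also use per-channel scales and calibration data.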
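The LoRA recovery step can be pictured as training a low-rank correction instead of the full weight matrix: keep the pretrained W frozen and learn small matrices B (d×r) and A (r×k), applying W + B·A. A toy sketch with made-up dimensions and values, assuming nothing about the article's actual model:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

d, k, r = 3, 3, 1          # full shape d x k; rank r << min(d, k)
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]      # frozen pretrained weights
B = [[0.5], [0.0], [0.0]]  # d x r, trainable adapter factor
A = [[0.0, 1.0, 0.0]]      # r x k, trainable adapter factor

delta = matmul(B, A)        # rank-1 update: only d*r + r*k trainable parameters
W_eff = [[w + dw for w, dw in zip(wr, dr)] for wr, dr in zip(W, delta)]
# The adapter shifts a single entry of the frozen matrix (W_eff[0][1] becomes 0.5)
# without touching the d*k original weights.
```

The parameter saving is the point: for a 4096×4096 layer at r=8, the adapter trains ~65K values instead of ~16.8M.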
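For the point-wise evaluation bullet, a stripped-down metric in the spirit of ROUGE-1 recall (unigram overlap between a candidate and a reference) can be written in a few lines. This simplified version skips stemming, stopwords, and the F-measure that full ROUGE implementations compute:

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)  # clipped counts
    return overlap / max(sum(ref.values()), 1)

score = rouge1_recall("the model runs on device",
                      "the model runs on the device")
# 5 of the 6 reference tokens are matched by the candidate.
```

Pair-wise methods like LLM-as-a-judge complement this by comparing two model outputs directly instead of scoring each against a fixed reference.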
This summary was automatically generated by AI based on the original article and may not be fully accurate.