Endigest AI Core Summary
A.T.L.A.S (Adaptive Test-time Learning and Autonomous Specialization) achieves 74.6% pass@1 on LiveCodeBench with a frozen 14B model on a single $500 consumer GPU, outperforming Claude 4.5 Sonnet (71.4%) at a cost of ~$0.004/task versus ~$0.066/task for the API model.
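The cost figures above imply a rough ratio and a rough energy budget per task. The sketch below only redoes that arithmetic from the numbers quoted in this summary (the two per-task costs and the $0.12/kWh electricity rate). All inputs are approximate in the source, so the derived values are approximate too, and the variable names are illustrative.

```python
# Figures quoted in the summary (all approximate in the source).
LOCAL_COST_USD_PER_TASK = 0.004   # local electricity cost per task
API_COST_USD_PER_TASK = 0.066     # API cost per task used for comparison
ELECTRICITY_USD_PER_KWH = 0.12    # electricity rate assumed by the summary

# How many times cheaper the local run is per task.
cost_ratio = API_COST_USD_PER_TASK / LOCAL_COST_USD_PER_TASK

# Energy per task implied by the local cost and the electricity rate.
energy_wh_per_task = LOCAL_COST_USD_PER_TASK / ELECTRICITY_USD_PER_KWH * 1000

print(f"cost ratio: about {cost_ratio:.1f}x")
print(f"implied energy: about {energy_wh_per_task:.1f} Wh per task")
```

Under these figures the local run is roughly 16.5 times cheaper per task, and each task corresponds to roughly 33 Wh of electricity.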
• Uses a three-phase pipeline: PlanSearch + BudgetForcing for diverse candidate generation, Geometric Lens energy scoring for candidate selection, and PR-CoT self-verified iterative repair
• Runs a frozen Qwen3-14B-Q4_K_M model on an RTX 5060 Ti 16GB via a patched llama-server on K3s with speculative decoding (~100 tok/s) and 5120-dim self-embeddings
• Phase 3 PR-CoT repair rescued 36/42 failed tasks (85.7%) using model-generated test cases, contributing +7.3pp to the final score
• Cost is local electricity only (~$0.004/task at $0.12/kWh), with no API calls, no data leaving the machine, and no fine-tuning required
• Known limitations include LCB-only optimization, an undertrained Geometric Lens C(x) (only ~60 training samples), and single-threaded task pr
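The three-phase flow described in the bullets (generate candidates, select one by score, repair on failure) can be sketched as a small control loop. This is a minimal illustration, not the project's implementation: `solve`, the callback signatures, the "lower energy is better" convention, the choice to test only the top-ranked candidate, and the toy stubs are all assumptions made for the example. In the real system the callbacks would wrap model calls and model-generated tests.

```python
from typing import Callable, List, Optional, Tuple

Task = List[Tuple[Tuple[int, int], int]]  # toy task: ((a, b), expected) pairs


def solve(
    task: Task,
    generate: Callable[[Task, int], str],   # phase 1: candidate generation
    score: Callable[[str], float],          # phase 2: energy-style score
    run_tests: Callable[[Task, str], bool], # verification against test cases
    repair: Callable[[Task, str], str],     # phase 3: iterative repair
    k: int = 3,
    max_repairs: int = 2,
) -> Optional[str]:
    # Phase 1: sample k diverse candidates.
    candidates = [generate(task, i) for i in range(k)]

    # Phase 2: pick the candidate with the lowest score
    # (assumption: lower energy means more promising).
    current = min(candidates, key=score)
    if run_tests(task, current):
        return current

    # Phase 3: repair the selected candidate until tests pass or budget ends.
    for _ in range(max_repairs):
        current = repair(task, current)
        if run_tests(task, current):
            return current
    return None


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_generate(task: Task, i: int) -> str:
    return ["a - b", "a * b", "a + b + 1"][i % 3]  # deliberately buggy


def toy_score(code: str) -> float:
    return float(len(code))  # placeholder for a learned scorer


def toy_run_tests(task: Task, code: str) -> bool:
    return all(eval(code, {}, {"a": a, "b": b}) == want for (a, b), want in task)


def toy_repair(task: Task, code: str) -> str:
    return "a + b"  # placeholder for a model-proposed fix


toy_task: Task = [((1, 2), 3), ((2, 2), 4)]
print(solve(toy_task, toy_generate, toy_score, toy_run_tests, toy_repair))
```

In this toy run every generated candidate fails the tests, so the result comes from the repair phase, mirroring how the summary attributes part of the final score to Phase 3.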
This summary was automatically generated by AI based on the original article and may not be fully accurate.