Anthropic introduces Instructed-Retriever-1 to accelerate Knowledge Assistant search through parallel test-time operations.
- Search latency drops 3x and answer generation latency 2x via parallel query generation and reranking
- A single model handles both query generation and evidence ranking in parallel, keeping latency low
- Trained on synthetic enterprise environments, matching Claude Sonnet 4.5 quality on KARLBench
- Uses a Mixture-of-Experts architecture with FP8 quantization and speculative decoding for efficient serving
- Achieves 81.0 nDCG@10 on real workloads with end-to-end latency under 10 seconds
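
For readers unfamiliar with the nDCG@10 metric cited in the last bullet, here is a minimal sketch of how it is commonly computed. It is not taken from the article: the function names are hypothetical, it uses the linear-gain variant of DCG, and it assumes the ideal ranking can be derived from the graded relevance labels of the retrieved list alone (a full evaluation would use all judged documents for the query). A reported figure like 81.0 would presumably correspond to a value of 0.81 on this 0 to 1 scale.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # Rank 0 gets discount log2(2) = 1, rank 1 gets log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """Normalize DCG by the DCG of the best possible ordering of the same labels."""
    # Assumption: the ideal ordering is built from the retrieved labels only.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels for one query, in the order the system ranked them.
ranked_labels = [3, 2, 3, 0, 1, 2]
score = ndcg_at_k(ranked_labels, k=10)
```

A perfectly ordered list scores 1.0, and placing relevant results lower in the ranking reduces the score because of the logarithmic discount.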