Amazon SageMaker Inference now offers generally available (GA) support for deploying custom Amazon Nova models for production-grade inference.
- Covers Nova Micro, Nova Lite, and Nova 2 Lite models customized with continued pre-training, supervised fine-tuning (SFT), or reinforcement fine-tuning
- Uses EC2 G5/G6 instances (more cost-efficient than P5), with auto-scaling driven by 5-minute usage patterns
- Configurable parameters include context length, concurrency, batch size, temperature, top_p, and reasoning effort
- Deploy via the SageMaker Studio UI or the SDK using the create_model, create_endpoint_config, and create_endpoint APIs
- Available in us-east-1 and us-west-2 with per-hour billing and no minimum commitment
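The SDK path above can be sketched as follows. The three API names (create_model, create_endpoint_config, create_endpoint) are the real SageMaker APIs the summary mentions; the model name, container image URI, S3 artifact path, and IAM role ARN are hypothetical placeholders, and the exact payload fields for a customized Nova deployment may differ from this generic SageMaker example:

```python
def build_deploy_requests(model_name, image_uri, model_data_url,
                          role_arn, instance_type="ml.g5.12xlarge"):
    """Assemble request payloads for the three SageMaker deployment calls.

    instance_type defaults to a G5 instance per the announcement's
    G5/G6 guidance; adjust to your capacity and cost needs.
    """
    model_req = {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # inference container image (placeholder)
            "ModelDataUrl": model_data_url,  # customized model artifacts in S3 (placeholder)
        },
        "ExecutionRoleArn": role_arn,
    }
    config_req = {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    }
    endpoint_req = {
        "EndpointName": f"{model_name}-endpoint",
        "EndpointConfigName": config_req["EndpointConfigName"],
    }
    return model_req, config_req, endpoint_req


if __name__ == "__main__":
    model_req, config_req, endpoint_req = build_deploy_requests(
        model_name="nova-lite-custom",
        image_uri="<account>.dkr.ecr.us-east-1.amazonaws.com/<image>:latest",
        model_data_url="s3://<bucket>/nova-custom/model.tar.gz",
        role_arn="arn:aws:iam::<account>:role/<sagemaker-role>",
    )
    # With boto3 installed and AWS credentials configured, the deployment
    # would be issued roughly like this (left commented so the sketch runs
    # without AWS access):
    # import boto3
    # sm = boto3.client("sagemaker", region_name="us-east-1")
    # sm.create_model(**model_req)
    # sm.create_endpoint_config(**config_req)
    # sm.create_endpoint(**endpoint_req)
    print(endpoint_req["EndpointName"])
```

Endpoint creation is asynchronous; in practice you would poll describe_endpoint (or use a waiter) until the endpoint status reaches InService before sending inference requests.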
This summary was automatically generated by AI based on the original article and may not be fully accurate.