Deploy SageMaker AI inference endpoints with set GPU capacity using training plans

By Kareem Syed-Mohammed / March 24, 2026

In this post, we walk through how to search for available p-family GPU capacity, create a training plan reservation for inference, and deploy a SageMaker AI inference endpoint on that reserved capacity. We follow a data scientist's journey as they reserve capacity for model evaluation and manage the endpoint throughout the reservation lifecycle.
