If you're bringing your own model, you don't need to use SageMaker. I've got mixtral-8x7b.Q5_K_M running on a g5.2xlarge (24 GB of VRAM, 32 GB of RAM), and it's $1.212 per hour in us-east-1. Just be sure to pick the Ubuntu deep learning AMI, as it has the required GPU drivers. Be aware that you're paying for the server whenever it's running - not just when you're interacting with the model. If you forget to shut it off and leave it for a month, that's about $870...
Answer from kingtheseus on reddit.com
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker › pricing
SageMaker pricing - AWS
1 week ago - SageMaker Catalog follows a pay-as-you-go pricing model with no upfront commitments or minimum fees. Pricing is based on four key dimensions: requests, metadata storage, compute, and AI recommendations for generating business descriptions. Each account receives some free usage per billing month, including a set amount of metadata storage, API requests, and compute units.
Reddit
reddit.com › r/aws › how to do cost estimation for amazon sagemaker
r/aws on Reddit: How to do cost estimation for Amazon Sagemaker
March 15, 2024 -

Hey guys, I am trying to do some cost estimation in SageMaker for an AI chatbot I am building. The chatbot will use the quantized Mixtral-8x7b-Instruct model downloaded from Hugging Face, and I will be using SageMaker endpoints for inference.

When I went to the AWS Pricing Calculator website (https://calculator.aws/#/) and selected SageMaker, I was presented with different options to choose from, like SageMaker Studio Notebooks, RStudio on SageMaker, SageMaker On-Demand Notebook Instances, etc. (see the link below).

https://imgur.com/a/KVSHINe

For my chatbot that I had described above, how would I know which of these options to select to do my pricing estimate?

Would really appreciate any help with this. Many thanks!

AWS re:Post
repost.aws › questions › QUTgC3561MQKaWwyazLh_YXQ › sagemaker-costs-for-ai-model
SageMaker costs for AI model | AWS re:Post
March 18, 2025 - Regarding monthly costs, if you keep the endpoint running 24/7 for a month, you could expect to pay around $873 (30 days * 24 hours * $1.212/hour), plus any additional charges for data transfer or storage.
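The 24/7 arithmetic in this snippet is easy to sanity-check. Below is a minimal Python sketch, assuming the $1.212/hour rate quoted above and a flat hourly charge with no storage or data-transfer fees. The function name `always_on_cost` is only illustrative.

```python
def always_on_cost(hourly_rate_usd: float, days: int = 30) -> float:
    """Cost in USD of running an instance or endpoint 24/7 for `days` days."""
    return round(hourly_rate_usd * 24 * days, 2)


RATE = 1.212  # USD per hour, the figure quoted in the snippets above

print(f"30-day month: ${always_on_cost(RATE):.2f}")
print(f"31-day month: ${always_on_cost(RATE, days=31):.2f}")
```

At that rate a 30-day month works out to $872.64 before any storage or data-transfer charges.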
CloudZero
cloudzero.com › home › blog › amazon sagemaker pricing guide: 2025 costs (and savings)
Amazon SageMaker Pricing Guide: 2025 Costs (And Savings)
August 15, 2025 - Although these options increase ... cost visibility and optimization efforts (complexity). Besides, SageMaker has some endpoints and service quotas you need to know about. Also, it’s challenging to choose the right SageMaker ML instance for your specific workload because instances vary in performance and ...
Reddit
reddit.com › r/aws › sagemaker rough costs when deploying object detection model api
r/aws on Reddit: SageMaker rough costs when deploying object detection model API
August 13, 2023 -

Hi all,

Use case - I have a mobile app where, for roughly 30-45 seconds in total, the user sends an image every 2-3 seconds to Amazon API Gateway; the image is then passed to a pre-trained model, which returns results. The initial idea was to retrieve real-time results, but for phase 1 we decided to simply let the user focus on the object, send one image, and retrieve the result. In the UI the user is prompted with a simple dialog (is it object A or not), so no real-time processing is needed for now.

The bill surprised me (SageMaker Inference using Endpoint):

$0.00 for Host:ml.m5.xlarge per hour under monthly free tier (125hrs)

$0.245 per Hosting ml.m5.xlarge hour in EU (Stockholm) (52.573 Hrs) = USD 12.88. If a little more than 2 days is costing me around 13 USD, I must be doing something awfully wrong. The app is currently being tested by only 2-3 people, who have made around 300-400 endpoint calls in total.

I mean, yes, I could switch to ml.m6g.large, which would cost me around $0.09 per hour, but how many people would that handle for my use case? Even so, switching to a smaller machine still seems costly as hell. Any suggestions? Maybe for my use case I need to use another service?
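The bill in this post is driven by the hours the endpoint stayed up, not by the number of calls. Here is a rough Python sketch of that arithmetic, using the figures from the post (125 free hours, 52.573 billed hours at $0.245). It assumes both lines belong to the same billing month, and the $0.09 rate for ml.m6g.large is the poster's own estimate, not a verified price.

```python
FREE_TIER_HOURS = 125.0   # hours billed at $0.00 in the post
BILLED_HOURS = 52.573     # hours billed at the on-demand rate in the post
M5_XLARGE_RATE = 0.245    # USD/hour, from the bill
M6G_LARGE_RATE = 0.09     # USD/hour, the poster's estimate


def hosting_charge(billed_hours: float, hourly_rate: float) -> float:
    """Hourly hosting charge in USD for an always-on real-time endpoint."""
    return round(billed_hours * hourly_rate, 2)


# Assumption: free and billed hours were accrued in the same month.
total_hours = FREE_TIER_HOURS + BILLED_HOURS
print(f"Endpoint uptime: {total_hours:.1f} h (~{total_hours / 24:.1f} days)")
print(f"ml.m5.xlarge: ${hosting_charge(BILLED_HOURS, M5_XLARGE_RATE):.2f}")
print(f"ml.m6g.large: ${hosting_charge(BILLED_HOURS, M6G_LARGE_RATE):.2f}")
```

Because the charge scales with uptime, 300-400 calls cost the same as zero calls while the endpoint is running.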

AWS
aws.amazon.com › blogs › machine-learning › part-5-analyze-amazon-sagemaker-spend-and-determine-cost-optimization-opportunities-based-on-usage-part-5-hosting
Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 5: Hosting | Artificial Intelligence
May 30, 2023 - The cost of SageMaker real-time ... the endpoint is running, the cost of GB-month of provisioned storage (EBS volume), as well as the GB data processed in and out of the endpoint instance, as outlined in Amazon SageMaker ...
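The components named in this snippet (instance hours, provisioned storage in GB-months, and GB of data processed in and out) can be combined into a simple estimate. The Python sketch below assumes all three are billed at flat rates. Every rate in the example is a placeholder rather than an AWS price, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EndpointUsage:
    instance_hours: float      # hours the endpoint instance was running
    storage_gb_months: float   # provisioned EBS storage, in GB-months
    data_in_gb: float          # data processed into the endpoint
    data_out_gb: float         # data processed out of the endpoint


@dataclass
class EndpointRates:
    per_instance_hour: float     # USD per instance hour
    per_storage_gb_month: float  # USD per GB-month
    per_gb_processed: float      # USD per GB in or out


def realtime_endpoint_cost(usage: EndpointUsage, rates: EndpointRates) -> dict:
    """Break a real-time endpoint bill into its three components."""
    compute = usage.instance_hours * rates.per_instance_hour
    storage = usage.storage_gb_months * rates.per_storage_gb_month
    data = (usage.data_in_gb + usage.data_out_gb) * rates.per_gb_processed
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "data": round(data, 2),
        "total": round(compute + storage + data, 2),
    }


# Placeholder rates for illustration only; look up real ones on the pricing page.
example = realtime_endpoint_cost(
    EndpointUsage(instance_hours=720, storage_gb_months=10, data_in_gb=5, data_out_gb=5),
    EndpointRates(per_instance_hour=1.0, per_storage_gb_month=0.1, per_gb_processed=0.02),
)
print(example)
```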
Cloudchipr
cloudchipr.com › blog › amazon-sagemaker-pricing
Amazon SageMaker AI Pricing: Detailed Breakdown and Ultimate Guide
SageMaker Serverless Inference pricing is based on the compute capacity used (billed per millisecond) and the amount of data processed. Costs depend on the selected memory configuration, with an option to add Provisioned Concurrency for predictable performance. ... Provisioned Concurrency ensures predictable performance by keeping endpoints warm for concurrent requests.
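As a rough illustration of the model described in this snippet, the Python sketch below estimates a serverless bill from per-millisecond compute plus data processed. The per-millisecond prices keyed by memory size and the per-GB rate are invented placeholders, not AWS prices. Provisioned Concurrency is not modeled, and the names are hypothetical.

```python
# Placeholder USD-per-millisecond prices by memory configuration (MB).
# These are NOT real AWS rates; substitute values from the pricing page.
PRICE_PER_MS_BY_MEMORY_MB = {
    1024: 0.00000002,
    2048: 0.00000004,
    4096: 0.00000008,
}


def serverless_inference_cost(
    requests: int,
    avg_duration_ms: float,
    memory_mb: int,
    data_gb: float,
    price_per_gb: float,
) -> float:
    """Estimate cost as (billed milliseconds * memory-based rate) + data processed."""
    compute = requests * avg_duration_ms * PRICE_PER_MS_BY_MEMORY_MB[memory_mb]
    data = data_gb * price_per_gb
    return round(compute + data, 2)


# One million requests at 100 ms each on the 2048 MB configuration.
print(serverless_inference_cost(1_000_000, 100, 2048, data_gb=10, price_per_gb=0.02))
```

Unlike an always-on endpoint, the compute term here drops to zero when no requests arrive.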
Holori
holori.com › accueil › blog › ultimate aws sagemaker pricing guide
Holori - Ultimate AWS Sagemaker pricing guide
October 23, 2024 - Endpoints: Real-time inference can be costly if the endpoint is always running, even during low traffic periods. Network Costs: Moving data between AWS services (e.g., S3 to SageMaker) incurs egress fees. AWS SageMaker offers a variety of pricing models to accommodate different usage needs and budget constraints: SageMaker is available to try for free as part of the AWS Free Tier, offering new users two months ...
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker ai › pricing
SageMaker Pricing
1 week ago - The total charges for this example would be $305.881 per month. Note, for built-in rules with ml.m5.xlarge instance, you get up to 30 hours of monitoring aggregated across all endpoints each month, at no charge. The model in example #5 is then deployed to production to two (2) ml.c5.xlarge ...
JFrog ML
qwak.com › post › the-hidden-cost-of-sagemaker
Uncovering The Hidden Costs Behind Sagemaker's Pricing | JFrog ML
On top of breaking down the different ... and the cost structure entailed, we'll take a look at the complexity that comes with this offering and how SageMaker's pricing can easily skyrocket accordingly. In the following use case, we'll construct a machine learning model capable of real-time operation. This model will undergo periodic training and subsequently be deployed as a real-time endpoint...
Cloudforecast
cloudforecast.io › home › aws pricing & cost optimization › aws sagemaker pricing guide – cost breakdown & optimization tips
AWS SageMaker Pricing Guide - Cost Breakdown & Optimization Tips | CloudForecast
July 28, 2025 - Confused by AWS SageMaker pricing? This guide breaks down costs, real-world examples, and expert tips to forecast and optimize your SageMaker bill.
TrustRadius
trustradius.com › home › ai development platforms › amazon sagemaker › pricing
Amazon SageMaker Pricing 2025
We recommend going to Amazon SageMaker’s pricing page to look at their associated costs for different instances, memory, and vCPU. The cost is determined by the type of instance, memory, and vCPU you use per hour, billed monthly. Below is an image example of the pricing list for Sagemaker ...
AWS re:Post
repost.aws › questions › QUaxWgMnwrSaq9IYZ5L0G1SA › sagemaker-real-time-inference-pricing-clarification
Sagemaker Real Time Inference pricing clarification | AWS re:Post
December 19, 2022 - In your case, you would be charged based on the per-hour pricing of the ml.c7g.16xlarge instance, plus (2 MB + 2 MB) * 2 million for data in/out per month. Link to pricing examples can be found here.
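The data in/out term in this answer can be turned into a monthly volume with one line of arithmetic. The Python sketch below assumes 2 MB per request and 2 MB per response across 2 million requests, as in the snippet. Whether a GB is counted as 1024 MB or 1000 MB is an assumption exposed as a parameter, and the function name is hypothetical.

```python
def monthly_data_processed_gb(
    request_mb: float,
    response_mb: float,
    requests_per_month: int,
    mb_per_gb: int = 1024,  # assumption: binary GB; pass 1000 for decimal GB
) -> float:
    """Total data in plus data out per month, in GB."""
    return (request_mb + response_mb) * requests_per_month / mb_per_gb


volume_gb = monthly_data_processed_gb(2, 2, 2_000_000)
print(f"Data processed per month: {volume_gb:,.1f} GB")
```

Multiplying that volume by the per-GB data processing rate on the pricing page gives the data charge, which is added to the instance's hourly charge.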
Concurrencylabs
concurrencylabs.com › blog › sagemaker-ai-cost-savings
How To Keep SageMaker AI Cost Under Control and Avoid Bad Billing Surprises when doing Machine Learning in AWS - Concurrency Labs
This can significantly increase performance and lower cost for inference tasks. Configure SageMaker Studio instance automatic shut down after a pre-determined idle period of time. This is particularly important for organizations with a significant number of team members using SageMaker Studio. For inference endpoints, configuring Auto Scaling based on a schedule or usage metrics can optimize compute infrastructure cost.
Amazon Web Services
amazonaws.cn › en › sagemaker › pricing
SageMaker Pricing
1 week ago - Model Monitor provides a fully managed experience for running both built-in and custom rules as Amazon SageMaker Processing jobs. For built-in rules with ml.m5.xlarge instance in China (Ningxia) Region, you get up to 30 hours of monitoring aggregated across all endpoints each month, at no charge.
PeerSpot
peerspot.com › questions › what-is-your-experience-regarding-pricing-and-costs-for-amazon-sagemaker
What is your experience regarding pricing and costs for Amazon SageMaker?
The pricing is comparable. It is not very cheap. I rate the pricing an eight out of ten. The main reason why we're using it is because of its cost. We are aiming at keeping the costs at $100 per month.