If you're bringing your own model, you don't need to use SageMaker. I've got mixtral-8x7b.Q5_K_M running on a g5.2xlarge (24 GB of VRAM, 32 GB RAM), and it's $1.212 per hour in us-east-1. Just be sure to pick the Ubuntu deep learning AMI, as it has the required GPU drivers. Be aware that you're paying for the server whenever it's running - not just when you're interacting with the model. If you forget to shut it off and leave it for a month, that's roughly $870... Answer from kingtheseus on reddit.com
AWS re:Post
repost.aws › questions › QUTgC3561MQKaWwyazLh_YXQ › sagemaker-costs-for-ai-model
SageMaker costs for AI model | AWS re:Post
March 18, 2025 - Regarding monthly costs, if you keep the endpoint running 24/7 for a month, you could expect to pay around $873 (30 days * 24 hours * $1.212/hour = $872.64), plus any additional charges for data transfer or storage.
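The always-on estimate is just the hourly rate multiplied by the hours the instance runs. A minimal Python sketch of that arithmetic, using the $1.212/hour rate quoted above; the function name and the "8 hours a day, 22 working days" schedule are illustrative assumptions, not figures from the source:

```python
def monthly_instance_cost(hourly_rate: float, hours_per_day: float = 24.0, days: int = 30) -> float:
    """Cost of keeping one instance running `hours_per_day` hours on each of `days` days."""
    return hourly_rate * hours_per_day * days


# $1.212/hour is the rate quoted in the answers above.
always_on = monthly_instance_cost(1.212)

# Hypothetical schedule (assumption): the instance only runs 8 hours a day on 22 working days.
working_hours_only = monthly_instance_cost(1.212, hours_per_day=8, days=22)

print(f"always on:          ${always_on:.2f}")
print(f"working hours only: ${working_hours_only:.2f}")
```

The figure excludes data transfer and storage, which the answer notes are billed on top.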
Reddit
reddit.com › r/aws › how to do cost estimation for amazon sagemaker
r/aws on Reddit: How to do cost estimation for Amazon Sagemaker
March 15, 2024 -

Hey guys, I am trying to do some cost estimation in SageMaker for an AI chatbot I am building. The chatbot will use the quantized Mixtral-8x7b-Instruct model downloaded from Hugging Face, and I will be using SageMaker endpoints for inference.

When I went to the AWS Pricing Calculator website (https://calculator.aws/#/) and selected SageMaker, I was presented with different options to choose from, like SageMaker Studio Notebooks, RStudio on SageMaker, SageMaker On-Demand Notebook Instances, etc. (see the link below).

https://imgur.com/a/KVSHINe

For the chatbot I described above, how would I know which of these options to select for my pricing estimate?

Would really appreciate any help with this. Many thanks!

Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker › pricing
SageMaker pricing - AWS
1 week ago - SageMaker Catalog follows a pay-as-you-go pricing model with no upfront commitments or minimum fees. Pricing is based on four key dimensions: requests, metadata storage, compute, and AI recommendations for generating business descriptions. Each account receives some free usage per billing month, including a set amount of metadata storage, API requests, and compute units.
CloudZero
cloudzero.com › home › blog › amazon sagemaker pricing guide: 2025 costs (and savings)
Amazon SageMaker Pricing Guide: 2025 Costs (And Savings)
August 15, 2025 - Although these options increase ... cost visibility and optimization efforts (complexity). Besides, SageMaker has some endpoints and service quotas you need to know about. Also, it’s challenging to choose the right SageMaker ML instance for your specific workload because instances vary in performance and ...
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker ai › pricing
SageMaker Pricing
1 week ago - The total charges for this example would be $305.881 per month. Note, for built-in rules with ml.m5.xlarge instance, you get up to 30 hours of monitoring aggregated across all endpoints each month, at no charge. The model in example #5 is then deployed to production to two (2) ml.c5.xlarge ...
Reddit
reddit.com › r/aws › sagemaker rough costs when deploying object detection model api
r/aws on Reddit: SageMaker rough costs when deploying object detection model API
August 14, 2023 -

Hi all,

Use case - I have a mobile app where, for roughly 30-45 seconds in total, the user sends an image every 2-3 seconds to Amazon API Gateway; the image is then passed to a pre-trained model, which returns results. The initial idea was to retrieve real-time results, but for phase 1 we decided to simply let the user focus on the object, then send an image and retrieve the result - in the UI the user is prompted with a simple dialog (is it object A or not), so no real-time processing is needed for now.

The bill surprised me (SageMaker Inference using Endpoint):

$0.00 for Host:ml.m5.xlarge per hour under monthly free tier (125hrs)

$0.245 per Hosting ml.m5.xlarge hour in EU (Stockholm) (52.573 Hrs) = USD 12.88. If a little bit more than 2 days is costing me around 13 USD I have to be doing something awfully wrong. The app currently is being tested only by 2-3 people that have made in total around 300-400 endpoint calls.

I mean, yes, I could switch to ml.m6g.large, which would cost me around $0.09 per hour - how many people would that handle for my use case? Nonetheless, even switching to a smaller machine seems to be costly as hell. Any suggestions? Maybe for my use case I need to use another service?
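The bill quoted in this post can be reproduced from uptime alone, which is the point of confusion: a real-time endpoint is billed for the hours it is running, not for the 300-400 calls made against it. A rough sketch, where the function name is hypothetical and the "full 30-day month with no free tier left" projection is an assumption added for illustration:

```python
def billed_hosting_cost(total_hours: float, free_tier_hours: float, hourly_rate: float) -> float:
    """Charge for endpoint uptime after subtracting free-tier hours."""
    billable_hours = max(0.0, total_hours - free_tier_hours)
    return billable_hours * hourly_rate


# Figures from the bill: 125 free hours, then 52.573 hours at $0.245/hour (ml.m5.xlarge).
bill = billed_hosting_cost(total_hours=125 + 52.573, free_tier_hours=125, hourly_rate=0.245)

# Assumption: the same endpoint left running for a 30-day month with no free tier remaining.
full_month = billed_hosting_cost(total_hours=30 * 24, free_tier_hours=0, hourly_rate=0.245)

print(f"bill as quoted:   ${bill:.2f}")
print(f"full month, 24/7: ${full_month:.2f}")
```

The number of requests never enters the calculation, so reducing test traffic would not lower this bill; only reducing uptime or the hourly rate would.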

Cloudforecast
cloudforecast.io › home › aws pricing & cost optimization › aws sagemaker pricing guide – cost breakdown & optimization tips
AWS SageMaker Pricing Guide - Cost Breakdown & Optimization Tips | CloudForecast
July 28, 2025 - Confused by AWS SageMaker pricing? This guide breaks down costs, real-world examples, and expert tips to forecast and optimize your SageMaker bill.
AWS
aws.amazon.com › blogs › machine-learning › part-5-analyze-amazon-sagemaker-spend-and-determine-cost-optimization-opportunities-based-on-usage-part-5-hosting
Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 5: Hosting | Artificial Intelligence
May 30, 2023 - The cost of SageMaker real-time ... the endpoint is running, the cost of GB-month of provisioned storage (EBS volume), as well as the GB data processed in and out of the endpoint instance, as outlined in Amazon SageMaker ...
Cloudchipr
cloudchipr.com › blog › amazon-sagemaker-pricing
Amazon SageMaker AI Pricing: Detailed Breakdown and Ultimate Guide
SageMaker Serverless Inference pricing is based on the compute capacity used (billed per millisecond) and the amount of data processed. Costs depend on the selected memory configuration, with an option to add Provisioned Concurrency for predictable performance. ... Provisioned Concurrency ensures predictable performance by keeping endpoints warm for concurrent requests.
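As a rough illustration of the serverless model described here, the estimate below multiplies total compute time by a per-second rate and adds a per-GB data charge. Every rate and workload figure is a hypothetical placeholder (the real per-second price depends on the memory configuration and region), and Provisioned Concurrency is not modeled:

```python
def serverless_inference_cost(
    requests: int,
    avg_duration_ms: float,
    price_per_second: float,
    data_gb: float,
    price_per_gb: float,
) -> float:
    """Estimate serverless cost: compute time billed by duration, plus data processed."""
    compute_seconds = requests * avg_duration_ms / 1000
    return compute_seconds * price_per_second + data_gb * price_per_gb


# All values below are made-up placeholders, not AWS prices.
estimate = serverless_inference_cost(
    requests=1_000_000,
    avg_duration_ms=100,
    price_per_second=0.0001,  # hypothetical rate for the chosen memory size
    data_gb=10,
    price_per_gb=0.016,       # hypothetical data processing rate
)
print(f"estimated monthly cost: ${estimate:.2f}")
```

Unlike an always-on endpoint, this estimate drops to zero when there are no requests, which is why low-traffic workloads are often pointed at this option.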
TrustRadius
trustradius.com › home › ai development platforms › amazon sagemaker › pricing
Amazon SageMaker Pricing 2025
We recommend going to Amazon SageMaker’s pricing page to look at their associated costs for different instances, memory, and vCPU. The cost is determined by the type of instance, memory, and vCPU you use per hour, billed monthly. Below is an image example of the pricing list for Sagemaker ...
Concurrencylabs
concurrencylabs.com › blog › sagemaker-ai-cost-savings
How To Keep SageMaker AI Cost Under Control and Avoid Bad Billing Surprises when doing Machine Learning in AWS - Concurrency Labs
This can significantly increase performance and lower cost for inference tasks. Configure SageMaker Studio instance automatic shut down after a pre-determined idle period of time. This is particularly important for organizations with a significant number of team members using SageMaker Studio. For inference endpoints, configuring Auto Scaling based on a schedule or usage metrics can optimize compute infrastructure cost.
Holori
holori.com › accueil › blog › ultimate aws sagemaker pricing guide
Holori - Ultimate AWS Sagemaker pricing guide
October 23, 2024 - Endpoints: Real-time inference can be costly if the endpoint is always running, even during low traffic periods. Network Costs: Moving data between AWS services (e.g., S3 to SageMaker) incurs egress fees. AWS SageMaker offers a variety of pricing models to accommodate different usage needs and budget constraints: SageMaker is available to try for free as part of the AWS Free Tier, offering new users two months ...
JFrog ML
qwak.com › post › the-hidden-cost-of-sagemaker
Uncovering The Hidden Costs Behind Sagemaker's Pricing | JFrog ML
On top of breaking down the different ... and the cost structure entailed, we'll take a look at the complexity that comes with this offering and how SageMaker's pricing can easily skyrocket accordingly. In the following use case, we'll construct a machine learning model capable of real-time operation. This model will undergo periodic training and subsequently be deployed as a real-time endpoint...
AWS re:Post
repost.aws › questions › QUaxWgMnwrSaq9IYZ5L0G1SA › sagemaker-real-time-inference-pricing-clarification
Sagemaker Real Time Inference pricing clarification | AWS re:Post
December 19, 2022 - In your case, you would be charged based on the per-hour price of the ml.c7g.16xlarge instance, plus (2 MB + 2 MB) * 2 million for data in/out per month. Link to pricing examples can be found here.
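The two components in that answer (instance hours plus data in/out) can be sketched as below. The data volume follows the answer's (2 MB + 2 MB) * 2 million figure; the hourly rate and per-GB price are hypothetical placeholders, the decimal MB-to-GB conversion is an assumption, and a single per-GB price is used for both directions for simplicity:

```python
MB_PER_GB = 1000  # assumption: decimal units


def monthly_data_gb(requests: int, mb_in: float, mb_out: float) -> float:
    """Total data processed in and out of the endpoint, in GB."""
    return requests * (mb_in + mb_out) / MB_PER_GB


def realtime_endpoint_cost(
    instance_hourly_rate: float, hours: float, data_gb: float, price_per_gb: float
) -> float:
    """Instance uptime charge plus data processing charge."""
    return instance_hourly_rate * hours + data_gb * price_per_gb


# Data volume from the answer: 2 MB in + 2 MB out, 2 million times a month.
data_gb = monthly_data_gb(requests=2_000_000, mb_in=2, mb_out=2)

# Hypothetical rates (placeholders, not the real ml.c7g.16xlarge or data prices).
example = realtime_endpoint_cost(
    instance_hourly_rate=1.00, hours=30 * 24, data_gb=data_gb, price_per_gb=0.01
)

print(f"data processed: {data_gb:.0f} GB")
print(f"example total:  ${example:.2f}")
```

Substituting the current rates from the SageMaker pricing page for the two placeholders gives the actual estimate.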
nOps
nops.io › blog › sagemaker-pricing-the-essential-guide
SageMaker Pricing: The Essential Guide | nOps
November 18, 2025 - Amazon SageMaker is one of the most popular managed machine learning services, enabling teams to build, train, and deploy models at scale. But while SageMaker simplifies the ML workflow, understanding its pricing can be tricky. Costs vary depending on instance types, training jobs, endpoints, and storage, making it essential to know how SageMaker pricing works […]
TutorialsPoint
tutorialspoint.com › sagemaker › sagemaker-pricing.htm
Amazon SageMaker - Pricing
Billing is calculated on an hourly basis for each endpoint. Amazon SageMaker depends on Amazon S3 for storing datasets. You will be charged for data storage in S3, as well as any data transfers between S3 and Amazon SageMaker. These costs depend on the size of the data being used and the amount of data transferred in and out of the cloud. Here are some ways to manage and reduce costs when using Amazon SageMaker −