🌐
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker ai › pricing
SageMaker Pricing
1 week ago - As a data scientist, you spend 20 days using JupyterLab for quick experimentation on notebooks, code, and data for 6 hours per day on an ml.g4dn.xlarge instance. You create and then run a JupyterLab space to access the JupyterLab IDE. The compute is only charged for the instance used when the JupyterLab space is running.
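The scenario above is straightforward to cost out. A quick sketch (the ~$0.7364/hr on-demand rate for ml.g4dn.xlarge is the figure quoted elsewhere on this page; actual rates vary by region):

```python
# Estimated compute cost for the JupyterLab scenario above:
# 20 days of experimentation at 6 hours/day on ml.g4dn.xlarge.
HOURLY_RATE = 0.7364  # USD/hr, ml.g4dn.xlarge on-demand (region-dependent)

days, hours_per_day = 20, 6
billable_hours = days * hours_per_day       # 120 hours
total_cost = billable_hours * HOURLY_RATE   # ~$88.37

print(f"{billable_hours} hours -> ${total_cost:.2f}")
```

The key point from the snippet: compute is billed only while the JupyterLab space is running, so the 120 hours, not the 20 calendar days, drive the bill.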
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.xlarge
g4dn.xlarge pricing and specs - Vantage
The g4dn.xlarge instance is in the GPU instance family with 4 vCPUs, 16 GiB of memory and up to 25 Gbps of bandwidth starting at $0.526 per hour.
Discussions

amazon web services - Which is lower cost, Sagemaker or EC2? - Stack Overflow
For example, ml.p2.8xlarge for training job at ap-northeast on Sagemaker takes 16.408 USD / hour, but p2.8xlarge for on-demand at ap-northeast on Ec2 takes 12.336 USD/hour. Is it cheap to just trai... More on stackoverflow.com
🌐 stackoverflow.com
Suggest instance type NVIDIA GPU has the best cost
My system is using the EC2 g4dn.xlarge instance type to run stable diffusion or train lora. Currently I want to find an NVIDIA GPU type version that costs less than g4dn.xlarge. Lower configuration... More on repost.aws
🌐 repost.aws
August 9, 2024
Benchmarking Inexpensive AWS Instances
Here's the data formatted in a table for easier viewing:

AWS Instances

| Instance Type | Cores | RAM | GPU | Prompt Eval Rate (tokens/s) | Eval Rate (tokens/s) | Price ($/hr) | Price ($/mo) |
|---|---|---|---|---|---|---|---|
| c7g.8xlarge | 32 | 64GB | - | 38.38 | 25.07 | 1.27 | 941.16 |
| r6g.4xlarge | 16 | 128GB | - | 10.15 | 8.29 | 0.88 | 657.10 |
| g4dn.xlarge | 4 | 16GB | 16GB | 222.23 | 41.71 | 0.58 | 434.50 |
| g4dn.2xlarge | 8 | 32GB | 16GB | 214.25 | 41.74 | 0.84 | 621.24 |
| g5.xlarge | 4 | 16GB | 24GB | 624.29 | 68.08 | 1.12 | 831.05 |
| g5.2xlarge | 8 | 32GB | 24GB | 624.48 | 66.67 | 1.35 | 1,000.96 |

Local Machines

| Machine Type | Cores | RAM | GPU | Prompt Eval Rate (tokens/s) | Eval Rate (tokens/s) |
|---|---|---|---|---|---|
| M2 Mac Mini | - | 8GB | <8GB | 66.38 | 18.33 |
| M1 MacBook Air | - | 16GB | <16GB | 71.58 | 11.46 |
| Home PC w/RTX 3080 | - | 64GB | 10GB | 185.67 | 83.79 |

More on reddit.com
🌐 r/LocalLLaMA
June 10, 2024
Sagemaker pricing and spot instances
I ask this as it seems surprising there's a ~40% markup in cost for using this managed service versus EC2, and despite what the AWS report on TCO says, I can't quite see it saving me that amount of money versus setting up an EKS/ECS solution for this problem. Yes, but I am still using SageMaker async inference because it has benefits if you are mostly serverless: no need to deal with VPC, AMI, OS, and so on. AWS manages and builds the async queue for you and distributes the work. You get autoscaling for free and can even scale to zero. My BE is serverless (Lambda, S3, ...), and thanks to SageMaker async inference it stays "serverless" because I don't have to deal with EC2/EKS/ECS + Autoscaling + SQS + VPC. More on reddit.com
🌐 r/aws
April 8, 2024
🌐
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker › pricing
SageMaker pricing - AWS
1 week ago - The key pricing dimensions for SageMaker AI include instance usage (compute resources used in training, hosting, and notebook instances), storage (Amazon SageMaker notebooks, Amazon Elastic Block Store (Amazon EBS) volumes, and Amazon S3), data processing jobs, model deployment, and MLOps (Amazon SageMaker Pipelines and Model Monitor).
🌐
Cloudzero
advisor.cloudzero.com › aws › sagemaker › ml.g4dn.xlarge
ml.g4dn.xlarge SageMaker ML Instance Specs And Pricing
October 9, 2025 - CloudZero's intelligent platform helps you optimize cloud costs and improve infrastructure efficiency.
🌐
Saturn Cloud
saturncloud.io › sagemaker-pricing
Amazon SageMaker Pricing | Saturn Cloud
A Comparison of Saturn Cloud's pricing VS Amazon SageMaker's Pricing
🌐
EC2 Pricing Calculator
costcalc.cloudoptimo.com › aws-pricing-calculator › ec2 › g4dn.xlarge
g4dn.xlarge Pricing and Specs: AWS EC2
The g4dn.xlarge instance is part of the g4dn series of GPU instances, featuring 4 vCPUs, 16 GiB of memory, and up to 25 Gigabit of network bandwidth. It is available at a rate of $0.5260/hour.
🌐
AWS re:Post
repost.aws › articles › AR864QSGIIRlCmxnaNRzeouw › being-mindful-of-your-spend-while-experimenting-with-ml
Being mindful of your spend while experimenting with ML. | AWS re:Post
February 7, 2024 - This instance type costs only $0.05/hour, however ml.g4dn.xlarge is $0.7364/hour or about $120 a week. The unnecessary charges can be simply avoided by stopping the instance when you’re done with your work.
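The "$120 a week" figure in the snippet is just the hourly rate left running around the clock, which is worth verifying:

```python
# ml.g4dn.xlarge left running 24/7 at the quoted on-demand rate.
rate = 0.7364                 # USD/hr (from the snippet above; region-dependent)
weekly = rate * 24 * 7        # ~$123.72/week if the instance is never stopped
print(f"${weekly:.2f}/week")
```

Stopping the instance outside working hours eliminates nearly all of that charge, which is the snippet's point.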
🌐
CloudPrice
cloudprice.net › amazon web services › ec2 › g4dn.xlarge
g4dn.xlarge specs and pricing | AWS | CloudPrice
October 24, 2025 - Amazon EC2 instance g4dn.xlarge with 4 vCPUs, 16 GiB RAM and 1 x NVIDIA T4 16 GiB. Available in 23 regions starting from $383.98 per month.
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.2xlarge
g4dn.2xlarge pricing and specs - Vantage
The g4dn.2xlarge instance is in the GPU instance family with 8 vCPUs, 32 GiB of memory and up to 25 Gbps of bandwidth starting at $0.752 per hour.
🌐
Amazon Web Services
amazonaws.cn › en › sagemaker › pricing
SageMaker Pricing
1 week ago - For built-in rules, there is no charge and Amazon SageMaker Debugger will automatically select an instance type. For custom rules, you will need to choose an instance (e.g. ml.m5.xlarge) and you will be charged for the duration for which the instance is in use for the Amazon SageMaker Processing ...
🌐
Cloudchipr
cloudchipr.com › blog › amazon-sagemaker-pricing
Amazon SageMaker AI Pricing: Detailed Breakdown and Ultimate Guide
For example, if you start with a ml.c5.xlarge CPU instance in US East (Ohio) and later switch to a ml.inf1 GPU instance in US West (Oregon) for inference tasks, your Savings Plans will continue to apply the discounted rate automatically.
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.8xlarge
g4dn.8xlarge pricing and specs - Vantage
The g4dn.8xlarge instance is in the GPU instance family with 32 vCPUs, 128 GiB of memory and 50 Gbps of bandwidth starting at $2.176 per hour.
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.12xlarge
g4dn.12xlarge pricing and specs - Vantage
The g4dn.12xlarge instance is in the GPU instance family with 48 vCPUs, 192 GiB of memory and 50 Gbps of bandwidth starting at $3.912 per hour.
🌐
CloudZero
cloudzero.com › home › blog › amazon sagemaker pricing guide: 2025 costs (and savings)
Amazon SageMaker Pricing Guide: 2025 Costs (And Savings)
August 15, 2025 - SageMaker RStudio on SageMaker – 250 hours usage on ml.t3.medium instances for the RSession app, plus free usage of ml.t3.medium instance for the RStudioServerPro app. SageMaker Real-time inference – 125 hours usage on m4.xlarge or m5.xlarge instances.
Top answer (1 of 5) · 41 upvotes

I've listed some pros and cons from experience...

..., as opposed to marketing materials. If I were to guess, I'd say you have a much higher chance to experience all the drawbacks of SageMaker, than any one of the benefits.

Drawbacks

  • Cloud vendor lock-in: future free improvements in the open-source projects, and better prices from competing vendors, become difficult to take advantage of. AWS invests few developers in JupyterLab and has done limited work in open source generally. People have reported good results from companies that deliberately use as few AWS services as possible.
  • SageMaker instances are currently ~40% more expensive than their EC2 equivalents.
  • Slow startup: it will break your workflow if the machine takes ~5 minutes to start every time. SageMaker Studio apparently speeds this up, but not without other issues. This is completely unacceptable when you are trying to code or run applications.
  • SageMaker Studio is the first thing they show you when you enter SageMaker console. It should really be the last thing you consider.
    • SageMaker Studio is more limited than SageMaker notebook instances. For example, you cannot mount an EFS drive. I spoke to an AWS solutions architect, and he confirmed this was impossible (after searching for an answer all over the internet). It is also very new, so there is almost no support for it, even from AWS developers.
  • Worsens the disorganised-notebooks problem. Notebooks in a file system can be much easier to organise than in JupyterLab. With SageMaker Studio, a new volume gets created and your notebooks live in there. What happens when you have more than 1...
  • Awful/limited terminal experience, coupled with tedious configuration (via Lifecycle Configuration scripts, which require the notebook instance to be stopped just to edit them). Additionally, you cannot set any lifecycle configurations for Studio notebooks.
  • SageMaker endpoints are limited compared to running your own server in an EC2 instance.
  • It may seem like it allows you to skip certain challenges, but in fact it presents you with more obscure challenges that no one has solved. Good luck solving them. The rigidity of SageMaker and the lack of documentation mean lots of workarounds and pain. This is very expensive.

Benefits

These revolve around the SageMaker SDK and the SageMaker console (please comment or edit if you have found any more benefits):

  • Built-in algorithms (which you can just as easily import in your machine learning framework of choice): I would say these are worse than the open-source alternatives.
  • Training many models easily during hyperparameter search (there is a YouTube video by AWS on this): a fast way to spend money.
  • Easily create machine-learning-related AWS Mechanical Turk tasks. However, MTurk is very limited within SageMaker, so you're better off going to MTurk directly.

My suggestion

If you're thinking about ML on the cloud, don't use SageMaker. Spin up a VM with a prebuilt image that has PyTorch/TensorFlow and JupyterLab, and get the work done.

Answer 2 of 5 · 17 upvotes

You are correct that EC2 is cheaper than Sagemaker. However, you have to understand their differences.

  • EC2 provides you raw computing power
  • Sagemaker tries to provide a fully configured environment and computing power, with a seamless deployment model, so you can start training your model on day one

If you look at Sagemaker's overview page, it comes with Jupyter notebooks, pre-installed machine learning algorithms, optimized performance, seamless rollout to production etc.

Note that this is the same trade-off as self-hosting a MySQL server on EC2 versus using AWS-managed RDS MySQL. Managed services always appear more expensive, but if you factor in the time you have to spend maintaining servers, updating packages, and so on, the extra 30% cost may be worth it.

So in conclusion: if you'd rather save some money and have the time to set up your own server or environment, go for EC2. If you do not want to be bothered with this work and want to start training as soon as possible, use Sagemaker.
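The trade-off in this answer can be made concrete with a quick break-even calculation. A sketch, using the g4dn.xlarge rates quoted in this thread; the engineer's hourly cost is purely an illustrative assumption:

```python
# Break-even: how many hours of ops/maintenance work per month must the
# managed service save before its markup pays for itself?
ec2_rate = 0.526       # USD/hr, g4dn.xlarge on-demand (from this thread)
sm_rate = 0.7364       # USD/hr, ml.g4dn.xlarge on-demand (from this thread)
engineer_rate = 75.0   # USD/hr, assumed fully loaded cost (illustrative only)

hours_per_month = 730  # instance running 24/7
extra_cost = (sm_rate - ec2_rate) * hours_per_month  # ~$153.59/month markup
breakeven_hours = extra_cost / engineer_rate         # ~2 hours/month

print(f"markup ${extra_cost:.2f}/mo -> break-even at "
      f"{breakeven_hours:.1f} engineer-hours/mo saved")
```

Under these assumptions, the managed service only needs to save a couple of hours of setup and maintenance per month per 24/7 instance to come out ahead, which is why reasonable people land on both sides of this question.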

🌐
Concurrencylabs
concurrencylabs.com › blog › sagemaker-ai-cost-savings
How To Keep SageMaker AI Cost Under Control and Avoid Bad Billing Surprises when doing Machine Learning in AWS - Concurrency Labs
December 4, 2024 - Depending on the number of instances and allocated storage, this price dimension can result in significant cost; especially considering that ML processes typically access large amounts of data that often need to be available in local storage for performance reasons.
🌐
Reddit
reddit.com › r/localllama › benchmarking inexpensive aws instances
r/LocalLLaMA on Reddit: Benchmarking Inexpensive AWS Instances
June 10, 2024 -

I recently did some testing using Dolphin-Llama3 across various (inexpensive-ish) AWS instances to compare performance. The results are in line with what one might expect.

Testing was done using default settings with Ollama. I spun up a new instance on Ubuntu, installed Ollama, and ran Dolphin-Llama3 with the --verbose flag.

Key Takeaways:

-Fastest Prompt Eval Rate: AWS g5 (fastest AWS instance tested)
-Fastest Eval Rate: Home PC w/RTX 3080
-Best Cost-Performance Balance: AWS g4dn.xlarge offers a good balance of performance and cost, at $0.58/hr.
-GPU speed is the key differentiator. Within the same instance family, such as g4dn and g5, the evaluation rates remain consistent. If the model fits in GPU memory, there is no need for more cores/memory.
-I did notice that the more system memory available, the greater the number of tokens used in the output.
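The "fits in GPU memory" point can be estimated before renting anything. A rough sketch; the 4-bit quantization matches common Ollama defaults, and the 20% overhead factor is an assumption, not a measured value:

```python
# Rough check: will an N-billion-parameter model fit in a given GPU's VRAM?
def model_vram_gb(params_billion: float, bits_per_param: float = 4,
                  overhead: float = 1.2) -> float:
    """Approximate VRAM needed: quantized weights plus ~20% assumed
    overhead for KV cache and runtime buffers."""
    weight_gb = params_billion * bits_per_param / 8  # GB for weights alone
    return weight_gb * overhead

# An 8B model at 4-bit: ~4.8 GB -> comfortably inside a 16 GB T4 (g4dn.xlarge)
print(model_vram_gb(8))
# A 70B model at 4-bit: ~42 GB -> does not fit a single T4 (16 GB) or A10G (24 GB)
print(model_vram_gb(70))
```

This is consistent with the benchmark: once the model fits, g4dn.xlarge and g4dn.2xlarge produce nearly identical eval rates despite the 2xlarge's extra cores and RAM.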

Test Results

AWS Instances

c7g.8xlarge (Compute Instance)
•32 cores, 64GB RAM
•Prompt Eval Rate: 38.38 tokens/s
•Eval Rate: 25.07 tokens/s
•Price: $1.27/hr, $941.16/mo
r6g.4xlarge (Memory Instance)
•16 cores, 128GB RAM
•Prompt Eval Rate: 10.15 tokens/s
•Eval Rate: 8.29 tokens/s
•Price: $0.88/hr, $657.10/mo
g4dn.xlarge (GPU Instance)
•4 cores, 16GB RAM, 16GB GPU
•Prompt Eval Rate: 222.23 tokens/s
•Eval Rate: 41.71 tokens/s
•Price: $0.58/hr, $434.50/mo
g4dn.2xlarge (GPU Instance)
•8 cores, 32GB RAM, 16GB GPU (same single T4 as the xlarge)
•Prompt Eval Rate: 214.25 tokens/s
•Eval Rate: 41.74 tokens/s
•Price: $0.84/hr, $621.24/mo
g5.xlarge (GPU Instance)
•4 cores, 16GB RAM, 24GB GPU
•Prompt Eval Rate: 624.29 tokens/s
•Eval Rate: 68.08 tokens/s
•Price: $1.12/hr, $831.05/mo
g5.2xlarge (GPU Instance)
•8 cores, 32GB RAM, 24GB GPU
•Prompt Eval Rate: 624.48 tokens/s
•Eval Rate: 66.67 tokens/s
•Price: $1.35/hr, $1,000.96/mo

Local Machines

M2 MacMini
•M2, 8GB RAM, <8GB GPU
•Prompt Eval Rate: 66.38 tokens/s
•Eval Rate: 18.33 tokens/s
M1 MacBook Air
•M1, 16GB RAM, <16GB GPU
•Prompt Eval Rate: 71.58 tokens/s
•Eval Rate: 11.46 tokens/s
Home PC w/RTX 3080
•Intel i5, 64GB RAM, 10GB GPU
•Prompt Eval Rate: 185.67 tokens/s
•Eval Rate: 83.79 tokens/s

Oracle Ampere

Ampere 16 Core, 32GB RAM
•Prompt Eval Rate: 11.96 tokens/s (Duration: 1m34.955180835s)
•Eval Rate: 9.01 tokens/s (Duration: 1m28.461256s)
•Price: $0.1276/hr, $95/mo
Ampere 32 Core, 32GB RAM
•Prompt Eval Rate: 22.54 tokens/s (Duration: 47.93207936s)
•Eval Rate: 14.11 tokens/s (Duration: 44.423782s)
•Price: $0.2796/hr, $208/mo

Here's the data formatted in a table for easier viewing, courtesy of u/sergeant113: https://www.reddit.com/r/LocalLLaMA/comments/1dclmwt/comment/l7zrgzm/

🌐
Reddit
reddit.com › r/aws › sagemaker pricing and spot instances
r/aws on Reddit: Sagemaker pricing and spot instances
April 8, 2024 -

I've been looking into whether it would be sensible to use AWS Sagemaker for my GPU inference workload on a T4 GPU. It can tolerate multiple minutes of downtime and delays, and has large binary multimedia files as input to the model. Autoscaling is a requirement based on load.

AWS Sagemaker Asynchronous endpoints look very attractive in that they manage the container orchestration, autoscaling, queueing of requests and provide various convenience utilities for maintaining, monitoring and upgrading models in production.

In us-east-1, I calculate the following:

| Service | Instance Type | Hourly Price | Monthly Price |
|---|---|---|---|
| EC2 On-Demand | g4dn.xlarge | $0.526 | $378 |
| EC2 Annual Reservation | g4dn.xlarge | $0.309 (upfront) | $229 |
| EC2 Spot | g4dn.xlarge | ~$0.22 | ~$156 |
| Sagemaker On-Demand | ml.g4dn.xlarge | $0.7364 | $530 |
| Sagemaker On-Demand with Savings Plan | ml.g4dn.xlarge | $0.4984 | $358 |
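The relative discounts implied by these rates are easy to check (a sketch using only the prices quoted in the post; spot prices fluctuate):

```python
# Relative costs of the T4 options priced above (us-east-1 rates from the post).
ec2_ondemand = 0.526
ec2_spot = 0.22
sm_ondemand = 0.7364
sm_savings_plan = 0.4984

markup = sm_ondemand / ec2_ondemand - 1            # ~0.40 -> the "~40% markup"
spot_discount = 1 - ec2_spot / ec2_ondemand        # ~58% off EC2 on-demand
plan_discount = 1 - sm_savings_plan / sm_ondemand  # ~32% off SageMaker on-demand

print(f"SageMaker markup over EC2: {markup:.0%}")
print(f"Spot discount vs EC2 on-demand: {spot_discount:.0%}")
print(f"Savings Plan discount vs SageMaker on-demand: {plan_discount:.0%}")
```

Notably, even SageMaker with a Savings Plan ($0.4984/hr) remains well above EC2 spot (~$0.22/hr), which frames the question that follows.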

From what I can see however, it would come with a significant cost (likely prohibitively high) to use Sagemaker versus using EKS/ECS to host the model and SQS to provide the queueing of requests. I appreciate that's the price one pays for a managed service, but I wanted to confirm a few things with the community to make sure I'm not missing anything in my cost estimations:

  • Is it correct that Sagemaker does not support spot instances for inference at all? (I appreciate they support it for training)

  • Is it correct that one can apply a savings plan to inference endpoints, and that they would be classified as the "Hosting" service on this page https://aws.amazon.com/savingsplans/ml-pricing/ ? It's confusing, as "Hosting Service" is not a term the developer docs use to describe inference endpoints per se.

  • Is it correct that one cannot reserve instances for a year with Sagemaker, as one can with EC2, to cut costs, and that the above Savings Plan is therefore the cheapest you can get a T4 GPU?

I ask this as it seems surprising that there's a ~40% markup in cost for using this managed service versus EC2, and despite what the AWS report on TCO says, I can't quite see it saving me that amount of money versus setting up an EKS/ECS solution for this problem. I can see, however, that the TCO report also largely considers training infrastructure, which likely does bring a lot of value but is not relevant here.
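For context on what the question is weighing against EKS/ECS: the asynchronous behavior (S3-backed results, request queueing, scale-to-zero-friendly autoscaling) is enabled by a single extra block in the endpoint configuration. A minimal sketch; the bucket, model, and config names are placeholders, and the exact AsyncInferenceConfig fields should be checked against the current boto3 docs:

```python
# Sketch: request body for an asynchronous SageMaker endpoint configuration.
# All names are placeholders; no AWS call is made here.
def build_async_endpoint_config(config_name: str, model_name: str,
                                bucket: str) -> dict:
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": "ml.g4dn.xlarge",
            "InitialInstanceCount": 1,
        }],
        # This block is what makes the endpoint asynchronous: requests are
        # queued and results are written to S3 instead of returned inline.
        "AsyncInferenceConfig": {
            "OutputConfig": {"S3OutputPath": f"s3://{bucket}/async-results/"},
            "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 4},
        },
    }

cfg = build_async_endpoint_config("my-async-config", "my-model", "my-bucket")
# Would be passed to boto3, e.g.: sagemaker_client.create_endpoint_config(**cfg)
```

Replicating this with EKS/ECS means owning the queue (SQS), the autoscaling policies, and the result storage yourself, which is the managed-service value the ~40% markup is paying for.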