The g5.xlarge instance is in the GPU instance family with 4 vCPUs, 16 GiB of memory and up to 10 Gibps of bandwidth starting at $1.006 per hour.

aws.amazon.com › amazon ec2 › instance types › g5 instances

Amazon EC2 G5 Instances | Amazon Web Services

1 week ago - G5 instances deliver up to 3x higher performance and up to 40% better price performance for machine learning inference compared to G4dn instances. They are a highly performant and cost-efficient solution for customers who want to use NVIDIA libraries such as TensorRT, CUDA, and cuDNN to run their ML applications.

Discussions

g5.xlarge instances not available on eu-central-1a all of the sudden, any cause?

GPU instances are in high demand, so try again in a while. Also heed the warning: do not specify an AZ. That should allow your ASG to select any AZ where there is availability. Can you also use a different GPU, or do you specifically need this one?

r/aws

June 2, 2023

Nvidia driver for g5.xlarge

Please see the driver installation for G5DN on Amazon’s site here. Gaming drivers (I assume you needed GRID because you are doing ML) are available further down on the page.

r/aws

February 24, 2024

Amazon Web Services

aws.amazon.com › machine learning › amazon sagemaker ai › pricing

SageMaker Pricing

1 week ago - The sub-total for 3,100 MB of data processed In and 310 MB of data processed Out for Hosting per month = $0.06. The total charges for this example would be $305.887 per month. Let's say you wanted to provision a cluster of 4 ml.g5.24xlarge for 1 month (30 days) with an additional 100 GB of ...

Cloudzero

advisor.cloudzero.com › aws › sagemaker › ml.g5.xlarge

ml.g5.xlarge SageMaker ML Instance Specs And Pricing

October 11, 2025 - CloudZero's intelligent platform helps you optimize cloud costs and improve infrastructure efficiency.

Vantage

instances.vantage.sh › aws › ec2 › g5.2xlarge

g5.2xlarge pricing and specs - Vantage

The g5.2xlarge instance is in the GPU instance family with 8 vCPUs, 32 GiB of memory and up to 10 Gibps of bandwidth starting at $1.212 per hour.

Amazon Web Services

amazonaws.cn › en › sagemaker › pricing

SageMaker Pricing

1 week ago - For built-in rules, there is no charge and Amazon SageMaker Debugger will automatically select an instance type. For custom rules, you will need to choose an instance (e.g. ml.m5.xlarge) and you will be charged for the duration for which the instance is in use for the Amazon SageMaker Processing ...

Cloudzero

advisor.cloudzero.com › aws › sagemaker › ml.g5.48xlarge

ml.g5.48xlarge SageMaker ML Instance Specs And Pricing

October 11, 2025 - CloudZero's intelligent platform helps you optimize cloud costs and improve infrastructure efficiency.

Economize

economize.cloud › resources › aws › pricing › ec2 › g5.xlarge

g5.xlarge pricing: $734.38 monthly - AWS EC2

1 week ago - The g5.xlarge instance is in the g5 family with 4 vCPUs and 16 GiB of memory, priced from $1.01/hr or $734.38/mo.
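The monthly figure quoted here appears consistent with multiplying the hourly rate by 730 hours per month. A minimal Python sketch of that conversion follows; the 730-hour month and the `monthly_cost` helper are assumptions for illustration, not something these listings state.

```python
# Assumption: a billing month of 730 hours (8,760 hours per year / 12).
HOURS_PER_MONTH = 730


def monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the cost of running an instance continuously for `hours`."""
    return round(hourly_rate * hours, 2)


# Hourly on-demand rates as quoted in the listings on this page.
for name, rate in [("g5.xlarge", 1.006), ("g5.2xlarge", 1.212), ("g5.4xlarge", 1.624)]:
    print(f"{name}: ${monthly_cost(rate):,.2f}/mo")
```

Under that assumption, the results line up with the $734.38, $884.76, and $1185.52 monthly prices shown in these results.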

Vantage

instances.vantage.sh › aws › ec2 › g5.4xlarge

g5.4xlarge pricing and specs - Vantage

The g5.4xlarge instance is in the GPU instance family with 16 vCPUs, 64 GiB of memory and up to 25 Gibps of bandwidth starting at $1.624 per hour.

Cloudforecast

cloudforecast.io › home › aws pricing & cost optimization › aws sagemaker pricing guide – cost breakdown & optimization tips

AWS SageMaker Pricing Guide - Cost Breakdown & Optimization Tips | CloudForecast

October 29, 2025 - For example, if you initially pick a ml.p3.2xlarge ($3.825/hour) for training, but see that GPU utilization is consistently below 30%, you might decide to switch to the ml.g5.2xlarge ($1.515/hour) instead.
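To put the two quoted rates side by side, here is a rough Python sketch of the saving from such a switch. It assumes the job takes the same number of hours on both instance types, which may not hold in practice; the `switch_savings` helper and the 10-hour job are hypothetical.

```python
def switch_savings(old_rate: float, new_rate: float, hours: float) -> tuple[float, float]:
    """Return (dollars saved, percent saved) for a job of `hours` hours.

    Assumption: the job runs for the same duration on both instance types.
    """
    saved = (old_rate - new_rate) * hours
    percent = (old_rate - new_rate) / old_rate * 100
    return saved, percent


# Rates from the snippet: ml.p3.2xlarge at $3.825/hour, ml.g5.2xlarge at $1.515/hour.
saved, percent = switch_savings(3.825, 1.515, hours=10)  # hypothetical 10-hour job
print(f"Saved ${saved:.2f} ({percent:.1f}% lower hourly cost)")
```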

AWS

aws.amazon.com › amazon ec2 › pricing › on-demand pricing

EC2 On-Demand Instance Pricing

1 week ago - On-Demand Instances let you pay for compute capacity by the hour or second (minimum of 60 seconds) with no long-term commitments. This frees you from the costs and complexities of planning, purchasing, and maintaining hardware and transforms what are commonly large fixed costs into much smaller ...
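The per-second billing rule with a 60-second minimum can be sketched in Python as follows. This is an illustration only: it assumes per-second billing applies to the instance in question, and the `on_demand_cost` helper is hypothetical.

```python
MIN_BILLED_SECONDS = 60  # minimum charge stated in the snippet


def on_demand_cost(hourly_rate: float, runtime_seconds: float) -> float:
    """Cost of one run under per-second billing with a 60-second minimum.

    Assumption: per-second billing applies to this instance and OS.
    """
    billed_seconds = max(runtime_seconds, MIN_BILLED_SECONDS)
    return hourly_rate / 3600 * billed_seconds


# A 10-second run is billed the same as a 60-second run.
print(on_demand_cost(1.006, 10) == on_demand_cost(1.006, 60))
```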

reddit.com › r/localllama › benchmarking inexpensive aws instances

r/LocalLLaMA on Reddit: Benchmarking Inexpensive AWS Instances

June 10, 2024

I recently did some testing using Dolphin-Llama3 across various (inexpensive-ish) AWS instances to compare performance. The results are in line with what one might expect.

Testing was done using default settings with Ollama. I spun up a new Ubuntu instance, installed Ollama, and ran Dolphin-Llama3 with --verbose.

Key Takeaways:

- Fastest Prompt Eval Rate: AWS g5 (fastest AWS instance tested)
- Fastest Eval Rate: Home PC w/RTX 3080
- Best Cost-Performance Balance: AWS g4dn.xlarge offers a good balance of performance and cost, at $0.58/hr.
- GPU speed is the key differentiator. Within the same instance family, such as g4dn or g5, the evaluation rates remain consistent across sizes. If the model fits in GPU memory, there is no need for more cores or memory.
- I did notice that the more system memory was available, the greater the number of tokens used in the output.

Test Results

AWS Instances

c7g.8xlarge (Compute Instance)
•32 cores, 64GB RAM
•Prompt Eval Rate: 38.38 tokens/s
•Eval Rate: 25.07 tokens/s
•Price: $1.27/hr, $941.16/mo
r6g.4xlarge (Memory Instance)
•16 cores, 128GB RAM
•Prompt Eval Rate: 10.15 tokens/s
•Eval Rate: 8.29 tokens/s
•Price: $0.88/hr, $657.10/mo
g4dn.xlarge (GPU Instance)
•4 cores, 16GB RAM, 16GB GPU
•Prompt Eval Rate: 222.23 tokens/s
•Eval Rate: 41.71 tokens/s
•Price: $0.58/hr, $434.50/mo
g4dn.2xlarge (GPU Instance)
•8 cores, 32GB RAM, 32GB GPU
•Prompt Eval Rate: 214.25 tokens/s
•Eval Rate: 41.74 tokens/s
•Price: $0.84/hr, $621.24/mo
g5.xlarge (GPU Instance)
•4 cores, 16GB RAM, 24GB GPU
•Prompt Eval Rate: 624.29 tokens/s
•Eval Rate: 68.08 tokens/s
•Price: $1.12/hr, $831.05/mo
g5.2xlarge (GPU Instance)
•8 cores, 32GB RAM, 24GB GPU
•Prompt Eval Rate: 624.48 tokens/s
•Eval Rate: 66.67 tokens/s
•Price: $1.35/hr, $1,000.96/mo
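
One rough way to compare the AWS results above on cost-performance is generated tokens per dollar. The Python sketch below uses the eval rates and hourly prices listed above; the metric itself and the `tokens_per_dollar` helper are assumptions for illustration, not part of the original test.

```python
# (eval rate in tokens/s, price in $/hr) as listed above
RESULTS = {
    "c7g.8xlarge": (25.07, 1.27),
    "r6g.4xlarge": (8.29, 0.88),
    "g4dn.xlarge": (41.71, 0.58),
    "g4dn.2xlarge": (41.74, 0.84),
    "g5.xlarge": (68.08, 1.12),
    "g5.2xlarge": (66.67, 1.35),
}


def tokens_per_dollar(eval_rate: float, price_per_hour: float) -> float:
    """Tokens generated in one hour divided by the hourly price."""
    return eval_rate * 3600 / price_per_hour


# Rank instances from most to fewest tokens per dollar.
ranked = sorted(RESULTS, key=lambda name: tokens_per_dollar(*RESULTS[name]), reverse=True)
for name in ranked:
    print(f"{name:<13} {tokens_per_dollar(*RESULTS[name]):>10,.0f} tokens/$")
```

By this measure g4dn.xlarge ranks first and g5.xlarge second, which agrees with the cost-performance takeaway above.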

Local Machines

M2 MacMini
•M2, 8GB RAM, <8GB GPU
•Prompt Eval Rate: 66.38 tokens/s
•Eval Rate: 18.33 tokens/s
M1 MacBook Air
•M1, 16GB RAM, <16GB GPU
•Prompt Eval Rate: 71.58 tokens/s
•Eval Rate: 11.46 tokens/s
Home PC w/RTX 3080
•Intel i5, 64GB RAM, 10GB GPU
•Prompt Eval Rate: 185.67 tokens/s
•Eval Rate: 83.79 tokens/s

Oracle Ampere

Ampere 16 Core, 32GB RAM
•Prompt Eval Rate: 11.96 tokens/s (Duration: 1m34.955180835s)
•Eval Rate: 9.01 tokens/s (Duration: 1m28.461256s)
•Price: $0.1276/hr, $95/mo
Ampere 32 Core, 32GB RAM
•Prompt Eval Rate: 22.54 tokens/s (Duration: 47.93207936s)
•Eval Rate: 14.11 tokens/s (Duration: 44.423782s)
•Price: $0.2796/hr, $208/mo

Here's the data formatted as a table for easier viewing, courtesy of u/sergeant113: https://www.reddit.com/r/LocalLLaMA/comments/1dclmwt/comment/l7zrgzm/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

Top answer

1 of 5

10

Here's the data formatted in table for easier viewing:

AWS Instances

| Instance Type | Cores | RAM | GPU | Prompt Eval Rate (tokens/s) | Eval Rate (tokens/s) | Price ($/hr) | Price ($/mo) |
|---|---|---|---|---|---|---|---|
| c7g.8xlarge | 32 | 64GB | - | 38.38 | 25.07 | 1.27 | 941.16 |
| r6g.4xlarge | 16 | 128GB | - | 10.15 | 8.29 | 0.88 | 657.10 |
| g4dn.xlarge | 4 | 16GB | 16GB | 222.23 | 41.71 | 0.58 | 434.50 |
| g4dn.2xlarge | 8 | 32GB | 32GB | 214.25 | 41.74 | 0.84 | 621.24 |
| g5.xlarge | 4 | 16GB | 24GB | 624.29 | 68.08 | 1.12 | 831.05 |
| g5.2xlarge | 8 | 32GB | 24GB | 624.48 | 66.67 | 1.35 | 1,000.96 |

Vs Local Machines

| Machine Type | Cores | RAM | GPU | Prompt Eval Rate (tokens/s) | Eval Rate (tokens/s) |
|---|---|---|---|---|---|
| M2 MacMini | - | 8GB | <8GB | 66.38 | 18.33 |
| M1 MacBook Air | - | 16GB | <16GB | 71.58 | 11.46 |
| Home PC w/RTX 3080 | - | 64GB | 10GB | 185.67 | 83.79 |

2 of 5

6

I use the Hetzner 16-core 32GB ARM instances when I need something cheap and dirty. The cost is 1/10th of even the cheapest AWS setup you have here, so if you aren't worried about speed that might be the way to go.

Amazon Web Services

amazonaws.cn › home › amazon ec2 › amazon ec2 g5 instances

Amazon EC2 G5 Instances

1 week ago - They also deliver up to 3.3x higher performance for ML training compared to G4dn instances. This makes them a cost-efficient solution for training moderately complex and single node machine learning models for natural language processing, computer vision, and recommender engine use cases. G5 instances are built on the Amazon Nitro System, a combination of dedicated hardware and lightweight hypervisor which delivers practically all of the compute and memory resources of the host hardware to your instances for better overall performance and security.

CloudPrice

cloudprice.net › aws › sagemaker › instances › ml.g5.2xlarge

ml.g5.2xlarge specs and pricing | Amazon SageMaker AI | CloudPrice

October 24, 2025 - Compare ml.g5.2xlarge Amazon SageMaker AI machine learning instance prices and specifications across regions, currencies, and workload types.

EC2 Pricing Calculator

costcalc.cloudoptimo.com › aws-pricing-calculator › ec2 › g5.xlarge

g5.xlarge Pricing and Specs: AWS EC2

The g5.xlarge instance is part of the g5 series of GPU instances, featuring 4 vCPUs, 16 GiB of RAM, and up to 10 Gigabit of network bandwidth. It is available at a rate of $1.0060/hour.

AWS re:Post

repost.aws › questions › QUTgC3561MQKaWwyazLh_YXQ › sagemaker-costs-for-ai-model

SageMaker costs for AI model | AWS re:Post

March 18, 2025 - For your deployment using an ml.g5.xlarge instance, you will be charged based on the time your endpoint is running, regardless of whether it's processing requests or idle.
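
Because the charge depends only on how long the endpoint exists, not on traffic, the cost can be estimated from creation and deletion times alone. A minimal Python sketch follows; the `endpoint_cost` helper, the dates, and the $1.50 hourly rate are hypothetical placeholders, not quoted prices.

```python
from datetime import datetime


def endpoint_cost(created: datetime, deleted: datetime, hourly_rate: float) -> float:
    """Cost of an endpoint billed for its whole lifetime, idle or busy."""
    hours_running = (deleted - created).total_seconds() / 3600
    return hours_running * hourly_rate


# Hypothetical: an endpoint left running for 7 days (168 hours) at $1.50/hour.
cost = endpoint_cost(datetime(2025, 3, 1), datetime(2025, 3, 8), hourly_rate=1.50)
print(f"${cost:.2f}")  # the request count never enters the calculation
```

Deleting the endpoint when it is not needed is the only input in this sketch that lowers the bill.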

CloudPrice

cloudprice.net › amazon web services › ec2 › g5.4xlarge

g5.4xlarge specs and pricing | AWS | CloudPrice

November 6, 2025 - Amazon EC2 instance g5.4xlarge with 16 vCPUs, 64 GiB RAM and 1 x NVIDIA A10G 22.35 GiB. Available in 16 regions starting from $1185.52 per month.

Economize

economize.cloud › resources › aws › pricing › ec2 › g5.2xlarge

g5.2xlarge pricing: $884.76 monthly - AWS EC2

1 week ago - The g5.2xlarge instance is in the g5 family with 8 vCPUs and 32 GiB of memory, priced from $1.21/hr or $884.76/mo.

CloudPrice

cloudprice.net › amazon web services › ec2 › g5.xlarge

g5.xlarge specs and pricing | AWS | CloudPrice

November 21, 2025 - Amazon EC2 instance g5.xlarge with 4 vCPUs, 16 GiB RAM and 1 x NVIDIA A10G 22.35 GiB. Available in 16 regions starting from $734.38 per month.

Cloudzero

advisor.cloudzero.com › aws › sagemaker › ml.g5.2xlarge

ml.g5.2xlarge SageMaker ML Instance Specs And Pricing

4 weeks ago - CloudZero's intelligent platform helps you optimize cloud costs and improve infrastructure efficiency.