🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.xlarge
g4dn.xlarge pricing and specs - Vantage
The g4dn.xlarge instance is in the GPU instance family with 4 vCPUs, 16 GiB of memory and up to 25 Gibps of bandwidth starting at $0.526 per hour.
🌐
AWS
aws.amazon.com › amazon ec2 › instance types › g4 instances
Amazon EC2 G4 Instances — Amazon Web Services (AWS)
1 week ago - They provide up to 8 NVIDIA T4 GPUs, 96 vCPUs, 100 Gbps networking, and 1.8 TB local NVMe-based SSD storage and are also available as bare metal instances. G4dn instances are equipped with NVIDIA T4 GPUs which deliver up to 40X better low-latency throughput than CPUs, so more requests can be ...
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.4xlarge
g4dn.4xlarge pricing and specs - Vantage
The g4dn.4xlarge instance is in the GPU instance family with 16 vCPUs, 64 GiB of memory and up to 25 Gibps of bandwidth starting at $1.204 per hour.
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.8xlarge
g4dn.8xlarge pricing and specs - Vantage
The g4dn.8xlarge instance is in the GPU instance family with 32 vCPUs, 128 GiB of memory and 50 Gibps of bandwidth starting at $2.176 per hour.
🌐
EC2 Pricing Calculator
costcalc.cloudoptimo.com › aws-pricing-calculator › ec2 › g4dn.xlarge
g4dn.xlarge Pricing and Specs: AWS EC2
The g4dn.xlarge instance is part of the g4dn series of GPU instances, featuring 4 vCPUs, 16 GiB of RAM, and up to 25 Gigabit network bandwidth.
🌐
NVIDIA Developer
developer.nvidia.com › blog › getting-the-most-out-of-nvidia-t4-on-aws-g4-instances
Getting the Most Out of NVIDIA T4 on AWS G4 Instances | NVIDIA Technical Blog
August 21, 2022 - Using BERT Large, which is about three times larger than BERT Base, you can get a million sentences inferenced for around 30 cents. The efficiency of the T4 GPU that powers the AWS g4dn.xlarge instance means you can cost effectively deploy smart, ...
🌐
CloudPrice
cloudprice.net › amazon web services › ec2 › g4dn.2xlarge
g4dn.2xlarge specs and pricing | AWS | CloudPrice
Amazon EC2 instance g4dn.2xlarge with 8 vCPUs, 32 GiB RAM and 1 x NVIDIA T4 16 GiB. Available in 23 regions starting from $548.96 per month.
🌐
Cloudzero
advisor.cloudzero.com › aws › ec2 › g4dn.xlarge
g4dn.xlarge Instance Specs And Pricing
CloudZero's intelligent platform helps you optimize cloud costs and improve infrastructure efficiency.
🌐
Bigbell
bigbell.ai › coin › aws › ec2 › g4dn.xlarge
g4dn.xlarge pricing and specs - BigBell
The g4dn.xlarge instance is in the gpu instance family with 4 vCPUs, 16.0 GiB of memory and up to 25 Gibps of bandwidth starting at $0.526 per hour.
🌐
Vantage
instances.vantage.sh › aws › ec2 › g4dn.2xlarge
g4dn.2xlarge pricing and specs - Vantage
The g4dn.2xlarge instance is in the GPU instance family with 8 vCPUs, 32 GiB of memory and up to 25 Gibps of bandwidth starting at $0.752 per hour.
🌐
CloudPrice
cloudprice.net › amazon web services › ec2 › g4dn.xlarge
g4dn.xlarge specs and pricing | AWS | CloudPrice
Amazon EC2 instance g4dn.xlarge with 4 vCPUs, 16 GiB RAM and 1 x NVIDIA T4 16 GiB. Available in 23 regions starting from $383.98 per month.
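The monthly figures in the CloudPrice snippets appear to be the hourly on-demand rates from the Vantage snippets multiplied by a 730-hour month. That convention is an assumption; none of the snippets state it. A minimal Python sketch of the conversion, where `monthly_cost` is a hypothetical helper name:

```python
# Assumption: a "month" is 730 hours (8,760 hours per year / 12).
HOURS_PER_MONTH = 730


def monthly_cost(hourly_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Convert an hourly on-demand price to an approximate monthly price."""
    return round(hourly_usd * hours, 2)


# Hourly prices taken from the Vantage snippets in these results.
print("g4dn.xlarge :", monthly_cost(0.526))
print("g4dn.2xlarge:", monthly_cost(0.752))
```

With these inputs the results line up with the $383.98 and $548.96 monthly prices listed by CloudPrice.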
🌐
Computer Weekly
computerweekly.com › news › 252471099 › AWS-G4-aims-to-lower-cost-of-GPU-powered-AI-inference
AWS G4 aims to lower cost of GPU-powered AI inference | Computer Weekly
“According to customers, machine learning inference can represent up to 90% of overall operational costs for running machine learning workloads.” · On-demand pricing of the g4dn.xlarge, four virtual core instance, with one GPU and 16GB ...
🌐
Umbrella
umbrellacost.com › home › learning center › aws cost management › amazon ec2 g4 instances
Using AWS EC2 G4dn and G4ad | Amazon EC2 G4 | Umbrella
April 10, 2025 - In December 2020, AWS released the Amazon EC2 G4ad instance subfamily — powered by AMD Radeon Pro V520 GPUs and second-generation AMD EPYC processors with up to 2.4 TB of local NVMe storage — that delivers up to 40% better price performance ...
🌐
Reddit
reddit.com › r/localllama › benchmarking inexpensive aws instances
r/LocalLLaMA on Reddit: Benchmarking Inexpensive AWS Instances
June 10, 2024 -

I recently did some testing using Dolphin-Llama3 across various (inexpensive-ish) AWS instances to compare performance. The results are in line with what one might expect.

Testing was done using default settings with Ollama. I spun up a new instance on Ubuntu, installed Ollama and ran it with Dolphin-Llama3 --verbose.

Key Takeaways:

- Fastest Prompt Eval Rate: AWS g5 (fastest AWS instance tested)
- Fastest Eval Rate: Home PC w/RTX 3080
- Best Cost-Performance Balance: AWS g4dn.xlarge offers a good balance of performance and cost, at $0.58/hr.
- GPU speed is the key differentiator. Within the same instance family, such as g4dn or g5, the evaluation rates remain consistent across sizes. If the model fits in GPU memory, there is no need for more cores/memory.
- I did notice that the more system memory is available, the greater the number of tokens in the output.

Test Results

AWS Instances

c7g.8xlarge (Compute Instance)
•32 cores, 64GB RAM
•Prompt Eval Rate: 38.38 tokens/s
•Eval Rate: 25.07 tokens/s
•Price: $1.27/hr, $941.16/mo
r6g.4xlarge (Memory Instance)
•16 cores, 128GB RAM
•Prompt Eval Rate: 10.15 tokens/s
•Eval Rate: 8.29 tokens/s
•Price: $0.88/hr, $657.10/mo
g4dn.xlarge (GPU Instance)
•4 cores, 16GB RAM, 16GB GPU
•Prompt Eval Rate: 222.23 tokens/s
•Eval Rate: 41.71 tokens/s
•Price: $0.58/hr, $434.50/mo
g4dn.2xlarge (GPU Instance)
•8 cores, 32GB RAM, 16GB GPU
•Prompt Eval Rate: 214.25 tokens/s
•Eval Rate: 41.74 tokens/s
•Price: $0.84/hr, $621.24/mo
g5.xlarge (GPU Instance)
•4 cores, 16GB RAM, 24GB GPU
•Prompt Eval Rate: 624.29 tokens/s
•Eval Rate: 68.08 tokens/s
•Price: $1.12/hr, $831.05/mo
g5.2xlarge (GPU Instance)
•8 cores, 32GB RAM, 24GB GPU
•Prompt Eval Rate: 624.48 tokens/s
•Eval Rate: 66.67 tokens/s
•Price: $1.35/hr, $1,000.96/mo
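
One way to make the cost-performance comparison concrete is to convert each AWS result into dollars per million generated tokens. The sketch below is a rough estimate, not part of the original benchmark: it assumes the instance generates continuously at the measured eval rate for a single request stream and is billed at the hourly price listed above. The name `usd_per_million_tokens` is hypothetical.

```python
# (price in USD per hour, eval rate in tokens/s), copied from the AWS results above.
AWS_RESULTS = {
    "c7g.8xlarge": (1.27, 25.07),
    "r6g.4xlarge": (0.88, 8.29),
    "g4dn.xlarge": (0.58, 41.71),
    "g4dn.2xlarge": (0.84, 41.74),
    "g5.xlarge": (1.12, 68.08),
    "g5.2xlarge": (1.35, 66.67),
}


def usd_per_million_tokens(price_per_hour: float, tokens_per_second: float) -> float:
    """Cost of generating one million tokens at a steady eval rate."""
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


costs = {
    name: usd_per_million_tokens(price, rate)
    for name, (price, rate) in AWS_RESULTS.items()
}

# Print cheapest first.
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name:14s} ${cost:6.2f} per million generated tokens")
```

Under these assumptions g4dn.xlarge comes out cheapest per generated token, which is consistent with the cost-performance takeaway in the post.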

Local Machines

M2 MacMini
•M2, 8GB RAM, <8GB GPU
•Prompt Eval Rate: 66.38 tokens/s
•Eval Rate: 18.33 tokens/s
M1 MacBook Air
•M1, 16GB RAM, <16GB GPU
•Prompt Eval Rate: 71.58 tokens/s
•Eval Rate: 11.46 tokens/s
Home PC w/RTX 3080
•Intel i5, 64GB RAM, 10GB GPU
•Prompt Eval Rate: 185.67 tokens/s
•Eval Rate: 83.79 tokens/s

Oracle Ampere

Ampere 16 Core, 32GB RAM
•Prompt Eval Rate: 11.96 tokens/s (Duration: 1m34.955180835s)
•Eval Rate: 9.01 tokens/s (Duration: 1m28.461256s)
•Price: $0.1276/hr, $95/mo
Ampere 32 Core, 32GB RAM
•Prompt Eval Rate: 22.54 tokens/s (Duration: 47.93207936s)
•Eval Rate: 14.11 tokens/s (Duration: 44.423782s)
•Price: $0.2796/hr, $208/mo

Here's the data formatted as a table for easier viewing, courtesy of u/sergeant113. https://www.reddit.com/r/LocalLLaMA/comments/1dclmwt/comment/l7zrgzm/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

🌐
EC2 Pricing Calculator
costcalc.cloudoptimo.com › aws-pricing-calculator › ec2 › g4dn.8xlarge
g4dn.8xlarge Pricing and Specs: AWS EC2
The g4dn.8xlarge instance is part of the g4dn series of GPU instances, featuring 32 vCPUs, 128 GiB of RAM, and 50 Gigabit network bandwidth.
🌐
AWSstatic
d1.awsstatic.com › events › reinvent › 2021 › How_to_select_Amazon_EC2_GPU_instances_for_deep_learning_sponsored_by_NVIDIA_CMP328-S.pdf pdf
© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved.
g4dn.(2/4/8/16)xlarge for more vCPUs and higher system memory ... Best multi-GPU instance for single-node training and running parallel experiments: p3.8xlarge (4 V100 GPUs, 16 GB per
🌐
Reddit
reddit.com › r/stablediffusion › rtx 3060 vs aws g4dn.xlarge
r/StableDiffusion on Reddit: RTX 3060 vs AWS g4dn.xlarge
May 29, 2023 -

Hi guys,

I'm using SD on AWS g4dn.xlarge and I'm pretty happy with the results. Sometimes I get 'Out of memory', but restarting A1111 fixes the problem. I'm going to buy an RTX 3060; comparing its price to what I pay AWS, I can recoup it in 3+ months.

I can't find any comparisons to check. Maybe someone can run a test on an RTX 3060 with this prompt:

((photo:1.2)), A cute cat mage, glowing fire sword, staff, dramatic lighting, dynamic pose, dynamic camera, masterpiece, best quality, dark shadows, ((dark fantasy)), detailed, realistic, 8k uhd, high quality
Negative prompt: canvas frame, (high contrast:1.2), (over saturated:1.2), (glossy:1.1), cartoon, 3d, ((disfigured)), ((bad art)), ((b&w)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, 3d render, ((watermarks)), smooth, plastic, blurry, low-resolution, deep-fried, oversaturated
Steps: 30, Sampler: Euler a, CFG scale: 7, Seed: 408625209, Size: 512x512, Model hash: cc6cb27103, Model: v1-5-pruned-emaonly, Version: v1.5.1

It took 5 sec. at 5.35it/s - 5.48it/s to generate a 512*512 image on AWS.
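
As a quick sanity check on those numbers, the sampling time can be estimated as steps divided by iterations per second. This is a rough sketch that ignores model loading, VAE decoding and other overhead; `sampling_seconds` is a hypothetical helper name, and the 30 steps come from the settings posted with the prompt.

```python
def sampling_seconds(steps: int, iterations_per_second: float) -> float:
    """Estimate sampling time only, excluding any per-image overhead."""
    return steps / iterations_per_second


STEPS = 30  # from the generation settings in the post

# The two ends of the it/s range reported for the g4dn.xlarge.
for rate in (5.35, 5.48):
    print(f"{rate} it/s -> about {sampling_seconds(STEPS, rate):.1f} s")
```

Both ends of the range give roughly five and a half seconds, which is in line with the reported "5 sec." per image.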

What it/s do you get on an RTX 3060?

Thank you so much.