Reddit
r/aws on Reddit: Advice Needed: SageMaker vs Bedrock for Fine-Tuned Llama Models (Cost & Serverless Options)
January 20, 2025 -

Hi all,

I’m a self-taught ML enthusiast, and I’m really enjoying my journey so far. I’m hoping to get some advice from those with more experience.

So far, I’ve successfully fine-tuned a Llama model using both SageMaker JumpStart and Amazon Bedrock. (Interestingly, in Bedrock, I had to switch to a different AWS region to access the same model I used in SageMaker.) My ultimate goal is to build a web-based app for users to interact with my fine-tuned model. However, for now, I’m still in the testing phase to ensure the model generalises well to my dataset.

I’d love some guidance on whether I should stick with SageMaker or switch fully to Bedrock. My main concern is cost management, as I’d prefer to use a serverless endpoint to avoid keeping the model “always-on.” Here’s where I’m stuck:

SageMaker: I’ve been deploying real-time endpoints on low-cost instances and deleting them after testing, but this workflow feels inefficient. I tried configuring a serverless endpoint, but I discovered it doesn’t support models requiring certain features (e.g., AWS Marketplace packages, private Docker registries, or network isolation).

Bedrock: It requires provisioned throughput ($23.50/hour per model unit) to serve fine-tuned models. While it’s fully managed, this seems expensive for my testing phase, and I’ve also noticed that Bedrock doesn’t provide detailed insights into the fine-tuning process.
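For scale, the provisioned-throughput rate quoted above works out roughly as follows (a back-of-the-envelope sketch; the $23.50/hour figure is the one mentioned in this post, and actual rates vary by model, commitment term, and region):

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # rough approximation

def monthly_cost(hourly_rate: float, model_units: int = 1) -> float:
    """Estimate the monthly cost of keeping provisioned throughput running."""
    return hourly_rate * model_units * HOURS_PER_DAY * DAYS_PER_MONTH

# One model unit at the quoted $23.50/hour, running continuously:
print(monthly_cost(23.50))  # → 16920.0, i.e. roughly $17K/month before storage fees
```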

For a beginner like me, what would you recommend?

Should I stick with SageMaker real-time endpoints on a low-cost instance and delete them when not in use?

Would it make sense to fine-tune the model in SageMaker and then deploy it in Bedrock?

Is there another cost-effective solution I haven’t considered?

Thank you for your time and insights!

Top answer
1 of 2
4
I’ve been seeing a lot of questions about choosing between SageMaker and Bedrock for ML deployments, so let me break down what I’ve learned after working with both. The main thing to understand is that they serve different needs.

SageMaker Serverless is your friend if you’re cost-conscious and working with smaller models. It’s basically pay-per-use with no minimum fees, scales to zero when idle, and while it has some limits (6GB RAM, 200 concurrent invocations), it’s great for testing and development.

Bedrock, on the other hand, is more of a commitment. For fine-tuned models, you’re looking at $21-50 per hour per model unit with a mandatory 1 or 6-month commitment. This can add up to around $15.5K monthly plus storage fees. There’s no serverless option for fine-tuned models.

From my experience, SageMaker is best when you want control. You get to play with different instance types, really dig into model training, and optimize costs. Bedrock’s strength is simplicity: clean API integration, managed infrastructure, and it’s great for quick prototyping and scaling production workloads.

For testing, I’d strongly recommend SageMaker. You’ll learn more about the fine-tuning process, have better cost control, and more room to experiment. Plus, here’s a pro tip: consider a hybrid approach. Use SageMaker for fine-tuning but leverage Bedrock’s base models for inference where it makes sense. If you’re building a web application, think about implementing a queue-based architecture; it’ll help manage costs while keeping response times reasonable. My two cents…
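The serverless limits mentioned above (6GB memory, a cap on concurrent invocations) map directly onto the endpoint-configuration API. A minimal boto3-style sketch, assuming a model is already registered in SageMaker (the model and endpoint names here are placeholders, not anything from this thread):

```python
def serverless_endpoint_config(model_name: str,
                               memory_mb: int = 6144,
                               max_concurrency: int = 10) -> dict:
    """Build a create_endpoint_config request for a serverless endpoint.

    6144 MB is the maximum memory size mentioned above; concurrency can go
    up to the account limit (200 concurrent invocations).
    """
    return {
        "EndpointConfigName": f"{model_name}-serverless",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": memory_mb,
                "MaxConcurrency": max_concurrency,
            },
        }],
    }

# With AWS credentials configured, you would then do something like:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_endpoint_config(**serverless_endpoint_config("my-llama-model"))
# sm.create_endpoint(EndpointName="my-llama-model-serverless",
#                    EndpointConfigName="my-llama-model-serverless")
```

Because the endpoint scales to zero, there is nothing to delete between test sessions, which is the workflow inefficiency the question describes.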
2 of 2
2
Bedrock PT is crazy expensive. I would just use Sagemaker. Especially for testing.
Reddit
r/aws on Reddit: EC2 vs SageMaker vs Bedrock for fine-tuning & serving a custom LLM?
August 25, 2025 -

Hello! I am a Computer Vision Engineer; previously I used the HPC center (basically lots of nodes with fancy GPUs) that we had a partnership with to train / run inference on DL models and build pipelines.

Recently, I started a new project, though in a slightly different domain from what I used to work in: the task is to build yet another "fancy and unique" chatbot.
Generally speaking, we want to 1) fine-tune an open-source LLM for our specific narrow domain (yes, we do want to do it), 2) design an app that will allow users to communicate with the LLM through Telegram, 3) be able to offload the weights of the trained model to our local machines.

I have never worked with AWS services before; I have spent a couple of days going through the docs and some forums. Still have some questions left to answer :(

So my questions are:

  1. For the fine-tuning purpose, should I use EC2 with GPU nodes / SageMaker / Bedrock? The EC2+GPU option looks like what I am most familiar with. However, there is also the opportunity to fine-tune on Bedrock as well as SageMaker. Why should I choose one over another? Will I be able to easily offload the weights after tuning the model? Generally speaking, I am trying to wrap my mind around the unique features of each of these services.

  2. What is the best practice / common strategy for deploying and serving custom models? E.g., running Ollama / vLLM on EC2+GPU vs. creating a SageMaker endpoint?

  3. Any potential "beginner traps" that I should be aware of when getting started with AWS?

Would like to hear about your experience. Will appreciate any advice!
Thanks in advance!

Top answer
1 of 2
3
A few thoughts about this.

Bedrock: fine-tuning is only available for certain OSS models, I believe mostly Llama, so that's probably a big limitation if you are looking to tune other models. If this option works for you, it is the easiest since it abstracts all the infrastructure: you just call the API with your model ID and prompt.

SageMaker: JumpStart is a part of SageMaker that gives you easy access to Jupyter notebooks with a lot of Hugging Face models ready to start using. That means training, deploying, doing inference, etc. It might be a good starting point.

SageMaker Python SDK: this is probably something you should get familiar with. It will let you create processing and training jobs, serve models for inference, etc., everything directly from your Python code.

SageMaker infra: jobs, inference endpoints, etc., will handle the underlying infrastructure creation for you, like nodes, storage, container registries, etc.

Inference: if you want to "self-host", I believe vLLM is the most popular option, but it is really not easy to scale, especially when you need to split your model across multiple GPUs.

EC2: this is a primitive; it is possible to host manually here, but you usually want something in the middle, like EKS. Google for "AI on EKS"; there are guidelines for running vLLM and even Ray to distribute the load between nodes.

From Bedrock to EC2, that's the order of abstraction you will have. The lower you go, the more stuff you have to handle.

Last but not least, GPUs on AWS are not cheap and they are hard to come by. I work with startups and I struggle a lot when they need GPUs. With new accounts, you won't even be able to get the required quotas if you don't have access to your account team. And even if you get the quotas, getting the capacity without making reservations is not easy. Keep it in mind.
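"Call the API with your model ID and prompt" looks roughly like this with boto3 (a sketch, not a full implementation: the model ID is a placeholder, and the request body shape shown is the one used by Meta Llama models on Bedrock; other model families use different schemas):

```python
import json

# Placeholder model ID; substitute the ID of your base or fine-tuned model.
MODEL_ID = "meta.llama3-8b-instruct-v1:0"

def build_llama_body(prompt: str, max_gen_len: int = 512,
                     temperature: float = 0.5) -> str:
    """JSON request body for Llama-family models on Bedrock."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
    })

# With AWS credentials and model access configured:
# import boto3
# bedrock = boto3.client("bedrock-runtime")
# resp = bedrock.invoke_model(modelId=MODEL_ID,
#                             body=build_llama_body("Hello, Bedrock!"))
# print(json.loads(resp["body"].read())["generation"])
```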
2 of 2
1
If you want to own the weights → EC2 or SageMaker. If you’re fine with just calling an API → Bedrock.

EC2: most familiar if you’ve done HPC before. Spin up a GPU box, run vLLM/Ollama, do your own fine-tuning. You’ll have to handle scaling, monitoring, and GPU quotas yourself (and those quotas are a pain on new accounts). Easiest to offload weights since they’re just files you can push to S3 and pull locally.

SageMaker: sweet spot for managed ML on AWS. JumpStart gives you HuggingFace models with ready notebooks. Training jobs → save artifacts in S3 → deploy to a managed endpoint. Autoscaling endpoints are nice, but watch costs since they bill 24/7. Great option if you want some infra help but still need to walk away with your tuned weights.

Bedrock: lowest effort, just API calls. But you can only fine-tune certain OSS models (mostly Llama). You don’t get to pull weights out after tuning; they stay in AWS. Good if you want “chatbot infra in 10 minutes,” bad if you need control.

Serving side: for DIY, vLLM on EC2 is fast but scaling multi-GPU gets hairy (usually EKS/Ray territory). For managed, SageMaker endpoints handle scaling + monitoring out of the box. For serverless, Bedrock is just API usage.

Beginner traps: GPU quotas, SageMaker endpoints racking up idle charges, IAM/S3 misconfig when exporting weights, and underestimating how hard scaling inference is outside of Bedrock.

TL;DR – if you need local export → SageMaker or EC2. If you want AWS to abstract it all away → Bedrock.
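The "weights are just files you can push to S3 and pull locally" step can be sketched like this (the bucket and key names are hypothetical; SageMaker training jobs write artifacts as a `model.tar.gz` under whatever output path you configure):

```python
from urllib.parse import urlparse

def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")

# Hypothetical artifact location from a finished training job.
ARTIFACT = "s3://my-bucket/llm-finetune/output/model.tar.gz"
bucket, key = split_s3_uri(ARTIFACT)

# With AWS credentials configured:
# import boto3
# boto3.client("s3").download_file(bucket, key, "model.tar.gz")
# then extract locally, e.g.: tar -xzf model.tar.gz
```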
Reddit
r/aws on Reddit: Anyone using Bedrock or SageMaker for production-level LLMs? Looking for insights on real-world performance.
December 16, 2024 -

Hey everyone,

I’m looking into options for deploying production-level LLMs, such as GPT, Claude, or customized fine-tuned models, on AWS. I’m weighing the benefits of using Bedrock versus SageMaker and would greatly appreciate insights from anyone who has experience with GenAI workloads in production.

Here are a few specific points I'm interested in:

- Latency and throughput in actual workloads
- Cost/performance tradeoffs
- Experiences with model customization or prompt tuning
- Challenges in monitoring and scaling

Any real-world experiences, lessons learned, or pitfalls to avoid would be incredibly valuable!

Thanks so much in advance! 🙌

Reddit
r/aws on Reddit: Newbie trying to understand sagemaker/aws and LLMs
February 25, 2024 -

The title is self explanatory. Basically I'm lost exploring ways of deploying cheap instances of models in SageMaker or AWS in general, so I can use them to serve co-workers inside my company.

Bedrock doesn't suit me because these are not private instances, and for big clients this is not a compliant solution. What every client/coworker tells me is "I don't want my data going to OpenAI's servers", which makes sense.

SageMaker can be expensive for testing deployed models, and these are "24-hour inference solutions" that will burn money while there is no activity during off-work hours.

I'm not sure if I can deploy an EC2 instance with the model running inside? I'm lost on how to do it.

Is there anything like SageMaker Serverless? I'm lost here too lol

Maybe there is a better solution outside aws

I would love to learn a "clear and easy to understand map" of how to deploy a model to SageMaker or an endpoint/Lambda for inference. From this I hope to gain more experience and learn to do it for clients.

Top answer
1 of 5
8
Bedrock is still private, it should meet pretty much any compliance in terms of data stewardship. For example, it is even HIPAA compliant. You should speak with your AWS Technical Account Manager about this, they can provide info and make sure you are operating within whatever compliance you require. I highly suggest going that way because it doesn't sound like you have the experience necessary to jump right into SageMaker or EC2. You can get there, but it's much better to start with a fully hosted solution.
2 of 5
6
Bedrock is not "public". Bedrock has both multi-tenant and single-tenant endpoints (on-demand vs. provisioned throughput). In both cases your data remains private (see AWS Service Terms 50.3). In the case of the cloud, there is no scenario where your data is not transmitted to someone's server... that is the whole point.

If you mean that the data must be completely controlled by you in your cloud account (vs. the service/escrow account of Bedrock), SageMaker is one option. SageMaker spins up resources within your Virtual Private Cloud (a logically separate network) where you have complete control. Anything else that gives you compute resources, like EC2 VMs (which SageMaker also uses), is the same.

If you have low throughput (e.g. more idle time than not), running instances 24/7 would indeed get "expensive". When talking about LLMs (like GPT), serverless features (e.g. Lambda or SageMaker Serverless Inference) are usually not enough to run these. These models are large and require an accelerator (e.g. GPU) for any reasonable latency. If you don't require real-time inference, there are options like SageMaker Async Endpoints for asynchronous processing.

If it still sounds like you want to use SageMaker, you can find out how to deploy open-source models from places like HuggingFace. But to me it sounds like you are not very familiar with AWS. It is probably a good idea to get a grasp of the fundamentals like networking and security.
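The endpoint-selection reasoning in this answer can be condensed into a small helper (purely illustrative; the labels and the two-flag decision are my simplification, not AWS terminology):

```python
def suggest_endpoint(needs_gpu: bool, needs_realtime: bool) -> str:
    """Rough decision helper following the reasoning above.

    Serverless inference has no accelerator support, so large LLMs are out;
    async endpoints fit GPU workloads that can tolerate queueing.
    """
    if not needs_gpu:
        return "SageMaker Serverless Inference"
    if not needs_realtime:
        return "SageMaker Async Endpoint"
    return "SageMaker real-time endpoint on a GPU instance"

print(suggest_endpoint(needs_gpu=True, needs_realtime=False))
# → SageMaker Async Endpoint
```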
AWS
Amazon Bedrock or Amazon SageMaker AI? - AWS Decision Guide
Amazon Bedrock offers a more accessible and straightforward way to integrate AI functionality into your projects. It’s appropriate for a broad audience, which includes developers and businesses, that has limited experience in building and training machine learning models, but wants to use ...
Reddit
r/datascience on Reddit: Seeking Advice on Amazon Bedrock and Azure
November 6, 2024 -

Hello everyone. I’m currently exploring AI infrastructure and platform for a new project and I’m trying to decide between Amazon Bedrock and Azure (AI Infrastructure & AI Studio). I’ve been considering both but would love to hear about your real-world experiences with them.

Has anyone used Amazon Bedrock or Azure AI Infrastructure and Azure AI Studio? How would you compare the two in terms of ease of use, performance, and overall flexibility? Are there specific features from either platform that stood out to you, or particular use cases where one was clearly better than the other?

Any advice or insights would be greatly appreciated. Thanks in advance!

Reddit
r/aws on Reddit: May I use Sagemaker/Bedrock to build APIs to use LM and LLM?
February 13, 2024 -

Hi,

I've never used any cloud service; I've only used Google Cloud VMs as remote machines to develop without thinking about money. Now I'm evaluating using an AWS product instead of turning a cloud VM with a GPU on and off.

My pipeline involves a chain of Python scripts using HuggingFace models (BERT-based for BERTopic, and Mistral or another free LLM) as inference models (no training needed).

I saw that SageMaker and Bedrock offer access to cloud LMs/LLMs (respectively), but the options are too many to understand which best fits my needs; I just want to create an API with models of my choice :')

Top answer
1 of 7
6
Amazon Bedrock is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups available through an API, so you can choose from various FMs to find the model that's best suited for your use case. With the Amazon Bedrock serverless experience, you can quickly get started, easily experiment with FMs, privately customize FMs with your own data, and seamlessly integrate and deploy them into your applications using AWS tools and capabilities. Agents for Amazon Bedrock is a fully managed capability that makes it easier for developers to create generative-AI applications that can deliver up-to-date answers based on proprietary knowledge sources and complete tasks for a wide range of use cases.

Amazon SageMaker is a fully managed service that helps data scientists and developers build, train, and deploy machine learning models at scale. It provides a range of features and tools to simplify the machine learning workflow, from data preprocessing and model training to model deployment and monitoring.

So in a nutshell, Bedrock is the easiest way to build and scale generative AI applications with foundation models (FMs), whereas SageMaker is a managed machine learning service in general.
2 of 7
2
SageMaker foundation models allow you more flexibility and choice than Bedrock. In SageMaker, infrastructure is provisioned on your behalf, and you can train a new model from scratch, pick from a large list of models supported by JumpStart (including HuggingFace), and, if the model can be fine-tuned, do that on SageMaker too. Note that many foundation models cannot be fine-tuned, and others are only available for research and cannot be used in commercial applications. SageMaker will give you the most flexibility but involves more setup work, and you are charged for endpoints while they are running.

Bedrock is focused on offering an API-driven and serverless experience. It offers a curated list of foundation models. You are only charged for what you use (there are no infrastructure costs involved). Only a subset of models on Bedrock allow fine-tuning, but again this will be a very simple API-driven process.

Bedrock and SageMaker offer different characteristics, and the right choice will depend on your use case, your foundation model choice (or even training your own), the need for fine-tuning, and whether you have a data science team or are more developer-oriented.
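The cost trade-off described above (SageMaker bills endpoints by the hour while they run; Bedrock bills per use) can be made concrete with a break-even sketch. All prices here are made-up placeholders, not real AWS rates:

```python
def breakeven_requests_per_hour(endpoint_hourly_cost: float,
                                cost_per_request: float) -> float:
    """Requests/hour above which an always-on endpoint beats pay-per-use."""
    return endpoint_hourly_cost / cost_per_request

# Hypothetical numbers: a $1.50/hour endpoint vs. $0.01 per Bedrock request.
print(breakeven_requests_per_hour(1.50, 0.01))  # ≈ 150 requests/hour
```

Below that sustained rate, the pay-per-use model is cheaper; above it, a running endpoint wins, which is why idle endpoints are the classic cost trap.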
Medium
AWS SageMaker vs. AWS Bedrock for Generative AI – Which to Choose? Plus, a Guide to Building RAG Applications: | by Dr. Sathiskumar Jothi | Medium
August 7, 2025 - Scalability: Both are strong, but Bedrock (9/10) edges out due to serverless auto-scaling. ... You want rapid deployment with pre-trained models (e.g., Claude, Titan). Your team has limited ML expertise or prefers serverless simplicity.
Caylent
Amazon Bedrock vs. Amazon SageMaker AI - What's The Difference? | Caylent
With the integration with Lambda, Bedrock will use the configured FM for the Agent to automatically identify the course of action and run/invoke the correct Lambda function in a proper order, so user requests and tasks are solved. Integration and deployment with other AWS Services: One other great feature is the integration and deployment with ongoing AWS tools, including SageMaker AI projects, experiments and pipelines, for a full ML workflow based on SageMaker AI features and Bedrock API.
CloudOptimo
Amazon Bedrock vs Amazon SageMaker: A Comprehensive Comparison
March 13, 2025 - Although both platforms are part ... SageMaker offers a comprehensive suite for custom model creation and training, whereas Bedrock streamlines the experience by focusing on pre-trained models for rapid deployment...
DEV Community
SageMaker Jumpstart vs Bedrock - DEV Community
September 14, 2023 - Amazon SageMaker Jumpstart empowers ... endpoints. In contrast, Amazon Bedrock is a fully managed service provided by Amazon that enables you to make API calls to access models hosted on AWS....
Reddit
r/aws on Reddit: AWS re:Invent 2024 key findings - Iceberg, S3 Tables, SageMaker Lakehouse, Redshift, Catalogs, Governance, Gen AI Bedrock
January 4, 2025 -

Hi all, my name is Sanjeev Mohan. I am a former Gartner analyst who went independent 3.5 years ago. I maintain an active blogging site on Medium and a podcast channel on YouTube. I recently published my content from last month's re:Invent conference. This year, it took me much longer to post my content because it took a while to understand the interplay between Apache Iceberg-supported S3 Tables and SageMaker Lakehouse. I ended up creating my own diagram to explain AWS's vision, which is truly excellent. However, there have been many questions and doubts about the implementation. I hope my content helps demystify some of the new launches. Thanks.

https://sanjmo.medium.com/groundbreaking-insights-from-aws-re-invent-2024-20ef0cad7f59

https://youtu.be/tSIMStJTJ8I

Swiftorial
Matchups: AWS SageMaker vs AWS Bedrock | Aws Comparison
August 22, 2025 - SageMaker’s curve is steep—train models in days, master pipelines in weeks. Bedrock’s gentler—run inferences in hours, optimize prompts in days. Communities thrive: SageMaker’s forums share training tips; Bedrock’s community covers LLMs. Example: SageMaker’s docs cover pipelines; Bedrock’s cover model selection.
SaaSworthy
Bedrock vs Amazon SageMaker Comparison | SaaSworthy.com
Which product offers a wider range of features for machine learning development? Amazon SageMaker offers a more comprehensive set of features for machine learning development, including data preparation, model training, deployment, and monitoring.
Deepchecks
Amazon Bedrock vs SageMaker AI: When to Use Each One
April 24, 2025 - Amazon Bedrock integrates with leading providers, as of the time of writing this article. Let’s look at its model catalog, which contains serverless models (which will be the main focus) as well as marketplace models. Amazon Sagemaker AI (formerly Amazon SageMaker) is a suite of tools designed to help build, train, and deploy ML models at scale.