What are your reasons for using SageMaker? It seems to me that it's a little dead, whereas Bedrock is the future due to its ease of use and flexibility, especially for using something that's already "built".
Hi all,
I’m a self-taught ML enthusiast, and I’m really enjoying my journey so far. I’m hoping to get some advice from those with more experience.
So far, I’ve successfully fine-tuned a Llama model using both SageMaker JumpStart and Amazon Bedrock. (Interestingly, in Bedrock, I had to switch to a different AWS region to access the same model I used in SageMaker.) My ultimate goal is to build a web-based app for users to interact with my fine-tuned model. However, for now, I’m still in the testing phase to ensure the model generalises well to my dataset.
I’d love some guidance on whether I should stick with SageMaker or switch fully to Bedrock. My main concern is cost management, as I’d prefer to use a serverless endpoint to avoid keeping the model “always-on.” Here’s where I’m stuck:
SageMaker: I’ve been deploying real-time endpoints on low-cost instances and deleting them after testing, but this workflow feels inefficient. I tried configuring a serverless endpoint, but I discovered it doesn’t support models requiring certain features (e.g., AWS Marketplace packages, private Docker registries, or network isolation).
Bedrock: It requires provisioned throughput ($23.50/hour per model unit) to serve fine-tuned models. While it’s fully managed, this seems expensive for my testing phase, and I’ve also noticed that Bedrock doesn’t provide detailed insights into the fine-tuning process.
For a beginner like me, what would you recommend?
Should I stick with SageMaker real-time endpoints on a low-cost instance and delete them when not in use?
Would it make sense to fine-tune the model in SageMaker and then deploy it in Bedrock?
Is there another cost-effective solution I haven’t considered?
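To put rough numbers on the cost question: the $23.50/hour Bedrock provisioned-throughput figure is from the post above, while the SageMaker instance price below is an illustrative assumption (actual on-demand GPU prices vary by instance type and region). A minimal back-of-the-envelope sketch:

```python
# Rough monthly cost comparison: Bedrock provisioned throughput vs.
# a SageMaker real-time endpoint that is deleted when not in use.

HOURS_PER_MONTH = 730  # average hours in a month

# Bedrock provisioned throughput (figure quoted above): one model unit,
# billed continuously whether or not you send any traffic.
bedrock_rate = 23.50  # USD/hour per model unit
bedrock_monthly = bedrock_rate * HOURS_PER_MONTH

# SageMaker real-time endpoint used only during testing sessions.
# The instance price here is an ASSUMED illustrative value, not a quote.
sagemaker_rate = 1.50          # USD/hour (assumed GPU instance price)
testing_hours_per_month = 80   # e.g. ~20 hours/week of active testing
sagemaker_monthly = sagemaker_rate * testing_hours_per_month

print(f"Bedrock provisioned throughput: ${bedrock_monthly:,.2f}/month")
print(f"SageMaker on-demand testing:    ${sagemaker_monthly:,.2f}/month")
```

Under these assumptions the always-on Bedrock option comes out roughly two orders of magnitude more expensive for a testing-phase workload, which is why the deploy-then-delete workflow, clunky as it feels, is a common choice at this stage.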
Thank you for your time and insights!
Hello beautiful people,
I am exploring my options for moving my LLM to a cloud service. I am planning on using Llama 2 (and maybe fine-tuning it). I am new to cloud services; which is more suitable for a RAG implementation, Bedrock or SageMaker?
Hello! I am a Computer Vision Engineer, previously I have used the HPC center (basically lots of nodes with fancy GPUs) that we had partnership with to train / inference DL models and build pipelines.
Recently, I started a new project, though in a slightly different domain from what I used to work in: the task is to build yet another "fancy and unique" chatbot.
Generally speaking, we want to 1) fine-tune an open-source LLM for our specific narrow domain (yes, we do want to do it), 2) design an app that will allow users to communicate with the LLM through Telegram, and 3) be able to offload the weights of the trained model to our local machines.
I have never worked with AWS services before this, and I have spent a couple of days going through the docs and some forums. Still have some questions left to answer :(
So my questions are:
For fine-tuning, should I use EC2 with GPU nodes, SageMaker, or Bedrock? EC2+GPU looks like what I am most familiar with. However, fine-tuning is also possible on Bedrock as well as SageMaker. Why should I choose one over another? Will I be able to easily offload the weights after tuning the model? Generally speaking, I am trying to wrap my mind around the unique features of each of these services.
What is the best practice / common strategy for deploying and serving custom models? E.g., running ollama / vLLM on EC2+GPU vs. creating a SageMaker endpoint?
Are there any potential "beginner traps" I should be aware of when doing things with AWS?
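To make the EC2+vLLM option above concrete: vLLM exposes an OpenAI-compatible HTTP API, so serving on an EC2 GPU box mostly means pointing a standard client at your instance. A minimal sketch of building such a request (the host URL and model name are placeholder assumptions, not real resources):

```python
import json
from urllib import request

# Placeholder values: substitute your EC2 host and whatever model you
# launched with `vllm serve <model>`. These are assumptions for illustration.
VLLM_URL = "http://ec2-host.example.com:8000/v1/chat/completions"
MODEL = "meta-llama/Llama-2-7b-chat-hf"

def build_chat_request(prompt: str) -> request.Request:
    """Build an OpenAI-compatible chat-completions request for a vLLM server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return request.Request(
        VLLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarise our domain glossary entry.")
# request.urlopen(req) would send it; omitted here since no server is running.
```

The appeal of this route is that the weights stay on a machine you control (which also answers the offloading question); the trade-off is that you own the scaling, patching, and uptime that a SageMaker endpoint would otherwise manage for you.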
Would like to hear about your experience. Will appreciate any advice!
Thanks in advance!
Hey everyone,
I’m looking into options for deploying production-level LLMs, such as GPT, Claude, or customized fine-tuned models, on AWS. I’m weighing the benefits of using Bedrock versus SageMaker and would greatly appreciate insights from anyone who has experience with GenAI workloads in production.
Here are a few specific points I'm interested in:
- Latency and throughput in actual workloads
- Cost/performance tradeoffs
- Experiences with model customization or prompt tuning
- Challenges in monitoring and scaling
Any real-world experiences, lessons learned, or pitfalls to avoid would be incredibly valuable!
Thanks so much in advance! 🙌
The title is self-explanatory. Basically, I'm lost exploring ways of deploying cheap instances of models in SageMaker, or AWS in general, that I can use to serve co-workers inside my company.
Bedrock doesn't suit me because these are not private instances, and for big clients this is not a compliant solution. What every client/co-worker tells me is "I don't want my data going to OpenAI's servers," which makes sense.
SageMaker can be expensive for testing deployed models, and a "24-hour inference" solution will burn money while there is no activity outside work hours.
I'm not sure if I can deploy an EC2 instance with the model running inside it? I'm lost here on how to do it.
Is there anything like SageMaker Serverless? I'm lost here too, lol.
Maybe there is a better solution outside AWS.
I would love a "clear and easy to understand map" of how to deploy a model to SageMaker, or to an endpoint/Lambda, for inference. From this I hope to gain more experience and learn to do it for clients.
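On the "anything like SageMaker Serverless?" question: SageMaker Serverless Inference does exist. It is configured via a `ServerlessConfig` block on the endpoint config and bills per invocation, scaling to zero when idle. The catch for LLMs is the small memory ceiling (6 GB maximum at the time of writing), so it suits smaller models, not multi-billion-parameter ones. A sketch of the parameters you would pass to boto3 (the config and model names below are placeholders):

```python
# Sketch of the parameters for a serverless endpoint via boto3's
# SageMaker client. Resource names below are placeholder assumptions.
# Typical calls would be:
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(**endpoint_config)
#   sm.create_endpoint(EndpointName="my-endpoint",
#                      EndpointConfigName="my-serverless-config")

endpoint_config = {
    "EndpointConfigName": "my-serverless-config",   # placeholder name
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-registered-model",     # placeholder name
            # ServerlessConfig replaces InstanceType/InitialInstanceCount:
            # you pay per request, and the endpoint scales to zero.
            "ServerlessConfig": {
                "MemorySizeInMB": 6144,   # current maximum (6 GB)
                "MaxConcurrency": 5,
            },
        }
    ],
}
```

Because of the memory cap, a small model behind a serverless endpoint can satisfy the privacy requirement cheaply, while a full-size LLM would need a real-time endpoint on a GPU instance, started and stopped around work hours (e.g. on a schedule).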
Hello everyone. I’m currently exploring AI infrastructure and platforms for a new project, and I’m trying to decide between Amazon Bedrock and Azure (AI Infrastructure & AI Studio). I’ve been considering both but would love to hear about your real-world experiences with them.
Has anyone used Amazon Bedrock or Azure AI Infrastructure and Azure AI Studio? How would you compare the two in terms of ease of use, performance, and overall flexibility? Are there specific features from either platform that stood out to you, or particular use cases where one was clearly better than the other?
Any advice or insights would be greatly appreciated. Thanks in advance!
Hi,
I've never used any cloud service; I have only used Google Cloud VMs as remote machines to develop on without thinking about money. Now I'm evaluating whether to use an AWS product instead of turning a Cloud VM with a GPU on and off.
My pipeline involves a chain of Python scripts using Hugging Face models (BERT-based for BERTopic, plus Mistral or another free LLM) for inference (no training needed).
I saw that SageMaker and Bedrock offer access to cloud LMs/LLMs (respectively), but the options are too many to figure out which best fits my needs. I just want to create an API with models of my choice :')
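For the "just an API with models of my choice" goal, the usual SageMaker shape is: deploy the model to a real-time endpoint, then call it from your scripts via the `sagemaker-runtime` `invoke_endpoint` API. Here is a sketch of building that call; the endpoint name is a placeholder, and the payload schema is an assumption (the exact input format depends on the serving container, e.g. the Hugging Face containers typically expect `{"inputs": ..., "parameters": {...}}`):

```python
import json

# Placeholder endpoint name; the payload schema below is an ASSUMPTION --
# the exact format depends on which serving container hosts the model.
ENDPOINT_NAME = "my-bertopic-llm-endpoint"

def build_invoke_kwargs(text: str) -> dict:
    """Keyword arguments for boto3's sagemaker-runtime invoke_endpoint:
         runtime = boto3.client("sagemaker-runtime")
         runtime.invoke_endpoint(**build_invoke_kwargs("..."))
    """
    return {
        "EndpointName": ENDPOINT_NAME,
        "ContentType": "application/json",
        "Body": json.dumps({
            "inputs": text,
            "parameters": {"max_new_tokens": 128},
        }),
    }

kwargs = build_invoke_kwargs("What are the main topics in this corpus?")
```

Each script in the chain can then call the same endpoint, and you only pay while the endpoint exists, so deleting it between runs keeps costs close to the turn-the-VM-off workflow you already use.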
I am thinking of training LLMs on SageMaker but want to do it for free, with minimal to no expense. I did some digging around and found that I need to create something called a Domain and then create a user profile, choosing the instance types. I need your help to understand this. Please lay out your thoughts and experiences.
Hi all, my name is Sanjeev Mohan. I am a former Gartner analyst who went independent 3.5 years ago. I maintain an active blogging site on Medium and a podcast channel on YouTube. I recently published my content from last month's re:Invent conference. This year, it took me much longer to post my content because it took a while to understand the interplay between Apache Iceberg-supported S3 Tables and SageMaker Lakehouse. I ended up creating my own diagram to explain AWS's vision, which is truly excellent. However, there have been many questions and doubts about the implementation. I hope my content helps demystify some of the new launches. Thanks.
https://sanjmo.medium.com/groundbreaking-insights-from-aws-re-invent-2024-20ef0cad7f59
https://youtu.be/tSIMStJTJ8I