🌐
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker ai › amazon sagemaker train
Train Machine Learning Models - Amazon SageMaker Model Training - AWS
6 days ago - To train deep learning models faster, SageMaker AI helps you select and refine datasets in real time. SageMaker distributed training libraries can automatically split large models and training datasets across AWS GPU instances, or you can use third-party libraries, such as DeepSpeed, Horovod, ...
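A minimal sketch of the distributed-training setup this snippet describes, assuming the SageMaker Python SDK and its built-in data-parallel library; the script name, role ARN, S3 path, and instance choices are placeholders, not values from the page.

```python
# Hypothetical sketch: launch a distributed PyTorch training job with the
# SageMaker Python SDK's data-parallel distribution option.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",              # your training script (placeholder)
    source_dir="src",                    # directory containing the script
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder role ARN
    framework_version="2.1",             # illustrative framework/Python versions
    py_version="py310",
    instance_count=2,                    # training is sharded across these instances
    instance_type="ml.p4d.24xlarge",
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)

estimator.fit({"train": "s3://my-bucket/train/"})  # placeholder S3 input
```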
🌐
AWS
aws.amazon.com › blogs › machine-learning › use-deep-learning-frameworks-natively-in-amazon-sagemaker-processing
Use deep learning frameworks natively in Amazon SageMaker Processing | Artificial Intelligence
December 23, 2021 - Until recently, customers who wanted to use a deep learning (DL) framework with Amazon SageMaker Processing faced increased complexity compared to those using scikit-learn or Apache Spark. This post shows you how SageMaker Processing has simplified running machine learning (ML) preprocessing and postprocessing tasks with popular frameworks such as PyTorch, TensorFlow, Hugging Face, MXNet, and […]
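A hedged sketch of the simplified SageMaker Processing workflow the post describes, using the SDK's PyTorch processor; the script, role, versions, and S3 paths are illustrative placeholders.

```python
# Sketch: run a PyTorch-based preprocessing script as a SageMaker Processing job.
from sagemaker.pytorch.processing import PyTorchProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput

processor = PyTorchProcessor(
    framework_version="1.13",            # illustrative version
    py_version="py39",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    instance_type="ml.g4dn.xlarge",
    instance_count=1,
)

processor.run(
    code="preprocess.py",                # placeholder script
    source_dir="processing",             # placeholder local directory
    inputs=[ProcessingInput(source="s3://my-bucket/raw/",
                            destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(source="/opt/ml/processing/output",
                              destination="s3://my-bucket/processed/")],
)
```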
🌐
AWS
aws.amazon.com › blogs › aws › amazon-sagemaker-simplifies-training-deep-learning-models-with-billions-of-parameters
Amazon SageMaker Simplifies Training Deep Learning Models With Billions of Parameters | AWS News Blog
November 3, 2022 - Today, I’m extremely happy to announce that Amazon SageMaker simplifies the training of very large deep learning models that were previously difficult to train due to hardware limitations. In the last 10 years, a subset of machine learning named deep learning (DL) has taken the world by storm.
🌐
DeepLearning.AI
community.deeplearning.ai › ai discussions
A Practical Guide to Building with AWS Sagemaker - AI Discussions - DeepLearning.AI
October 5, 2024 - This was originally posted to Stanford’s CS 230 EdStem forum and has been modified to be more general and to remove links. I’ve spent the last couple of months working on the CS 230 final project using AWS SageMaker, and I wanted to share what I’ve learned so that other students can take ...
🌐
DataCamp
datacamp.com › tutorial › aws-sagemaker-tutorial
The Complete Guide to Machine Learning on AWS with Amazon SageMaker | DataCamp
June 19, 2024 - With the SDK, you can train and deploy models using popular deep learning frameworks, algorithms provided by Amazon, or your algorithms built into SageMaker-compatible Docker images. boto3 is another official Python SDK, but it serves as a bridge between your code and all AWS services, not ...
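A rough sketch of the "your own algorithm in a SageMaker-compatible Docker image" path the snippet mentions, using the SDK's generic Estimator; the ECR image URI, role, and S3 paths are placeholders.

```python
# Sketch: train with a custom, SageMaker-compatible Docker image.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",  # placeholder
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",                        # placeholder
    instance_count=1,
    instance_type="ml.g5.xlarge",
    hyperparameters={"epochs": 10, "lr": 1e-3},  # passed to the container as hyperparameters
)

estimator.fit({"train": "s3://my-bucket/train/"})  # placeholder channel
```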
🌐
DEV Community
dev.to › aws-builders › accelerating-deep-learning-with-amazon-sagemaker-3f54
Accelerating Deep Learning with Amazon SageMaker - DEV Community
January 22, 2025 - Retain Control Over Runtime: Define exactly how your code runs, while still benefiting from SageMaker’s distributed training features and integrations with AWS services. With these varied options, Amazon SageMaker delivers the flexibility and scalability needed for virtually any machine learning task—ranging from classic supervised learning with pre-built algorithms to cutting-edge deep learning applications using TensorFlow or PyTorch at scale.
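A hedged illustration of the "retain control over runtime" point: a script-mode TensorFlow estimator where you choose the framework version, ship extra local modules, and set environment variables that your script reads. All names and versions below are assumptions for illustration.

```python
# Sketch: script mode with explicit runtime control.
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",          # your own training loop (placeholder)
    source_dir="src",
    dependencies=["my_lib"],         # extra local packages copied into the container
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    framework_version="2.13",        # illustrative versions
    py_version="py310",
    instance_count=1,
    instance_type="ml.g5.2xlarge",
    environment={"MY_FEATURE_FLAG": "1"},   # read inside train.py
)

estimator.fit("s3://my-bucket/train/")       # placeholder S3 input
```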
🌐
AWS
aws.amazon.com › blogs › machine-learning › use-aws-deep-learning-containers-with-amazon-sagemaker-ai-managed-mlflow
Use AWS Deep Learning Containers with Amazon SageMaker AI managed MLflow | Artificial Intelligence
September 18, 2025 - They include NVIDIA CUDA, cuDNN, and other necessary tools, with AWS managing the updates of DLAMIs. Together, DLAMIs and DLCs provide ML practitioners with the infrastructure and tools to accelerate deep learning in the cloud at scale. SageMaker managed MLflow delivers comprehensive lifecycle management with one-line automatic logging, enhanced comparison capabilities, and complete lineage tracking.
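A minimal sketch of the "one-line automatic logging" mentioned above, assuming a SageMaker managed MLflow tracking server already exists and the sagemaker-mlflow plugin is installed alongside mlflow; the tracking server ARN is a placeholder.

```python
# Sketch: point MLflow at a SageMaker managed tracking server and autolog.
import mlflow

mlflow.set_tracking_uri(
    "arn:aws:sagemaker:us-east-1:123456789012:mlflow-tracking-server/my-server"  # placeholder ARN
)
mlflow.autolog()  # automatically logs params, metrics, and models for supported frameworks

with mlflow.start_run():
    # ... your PyTorch/TensorFlow/scikit-learn training code here ...
    pass
```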
🌐
GeeksforGeeks
geeksforgeeks.org › machine learning › what-is-sagemaker-in-aws
What is SageMaker in AWS? - GeeksforGeeks
July 22, 2020 - It allows data scientists and developers to build, train, and deploy machine learning (ML) models quickly and at scale. Unlike traditional ML development, which requires managing complex infrastructure for training and hosting, SageMaker abstracts ...
🌐
GitHub
github.com › aws › amazon-sagemaker-examples
GitHub - aws/amazon-sagemaker-examples: Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
Amazon SageMaker Training is a fully managed machine learning (ML) service offered by SageMaker that helps you efficiently build and train a wide range of ML models at scale. The core of SageMaker jobs is the containerization of ML workloads ...
Starred by 10.8K users
Forked by 7K users
Languages   Jupyter Notebook 94.2% | Python 4.9% | Roff 0.6% | Shell 0.2% | Dockerfile 0.1% | HTML 0.0%
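A hedged sketch of the containerization idea in the repo description: look up a prebuilt AWS Deep Learning Container image with the SDK and hand it to a training job. Region, versions, role, and instance type are illustrative, not taken from the repo.

```python
# Sketch: resolve a prebuilt training container image and use it for a job.
from sagemaker import image_uris
from sagemaker.estimator import Estimator

image_uri = image_uris.retrieve(
    framework="pytorch",
    region="us-east-1",
    version="2.1",                   # illustrative framework version
    py_version="py310",
    image_scope="training",
    instance_type="ml.g5.xlarge",
)

estimator = Estimator(
    image_uri=image_uri,
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    instance_count=1,
    instance_type="ml.g5.xlarge",
)
```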
🌐
Reddit
reddit.com › r/aws › aws sagemaker or ec2 for deep learning
r/aws on Reddit: AWS Sagemaker or EC2 for deep learning
July 27, 2020 -

Hi ! I have a question for deep learning practitioners who are familiar with AWS products.

In my workplace, we are assessing two options: using Amazon SageMaker or having an EC2 instance with a GPU.

We mainly need the computing power (GPU) and nothing more. We would like full control over the version of each package, since our app needs specific versions of some packages and won't work otherwise. We are about to train a complex model with many components and loss functions.

Which is better suited to our case, cost-wise and from an ease-of-use point of view?

Thank you in advance for your help.

Top answer (1 of 5) · 16 points
SageMaker, for both cost and ease of use. But the one thing I want to be explicit about is that it’s cost-effective from a TCO perspective, just like RDS is more expensive than running it yourself on EC2 but managing it yourself carries a higher TCO. https://aws.amazon.com/blogs/machine-learning/lowering-total-cost-of-ownership-for-machine-learning-and-increasing-productivity-with-amazon-sagemaker/ At the heart of it, SageMaker just uses EC2 and containers to orchestrate the necessary compute. You can be explicit with your package versions on either platform.
Answer 2 of 5 · 12 points
If you want an ML setup rather than an EC2 instance with a GPU, I'd recommend trying SageMaker. It has many rough edges, but here are some parts you get for free:
- authentication
- securely proxying Jupyter
- common ML libraries preinstalled
- easier access to other SageMaker APIs
- data persistence after machine restarts
- an easier "create your own machine" environment
On the rough edges, there are many, but there are solutions:
- there is no SSH access (how to fix it)
- custom conda environments are removed after machine restarts (how to fix it)
- Jupyter extensions are wiped after machine restarts (how to fix it)
- it can be costly if you forget to turn off these GPU-loaded machines, but there is a solution for that, too
A few months ago I evaluated EC2 vs SageMaker for our team (4 people) and went with SageMaker for the reasons above (I don't want to handle auth, want a self-serve env, etc.). I had to invest time in the scripts I've mentioned, but I don't think I've made a mistake so far.
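On the "explicit package versions" point raised in the answers above, a hedged sketch: SageMaker framework estimators pip-install a requirements.txt found in source_dir before running your script, which lets you pin exact versions. Directory layout, role, and versions are placeholders.

```python
# Sketch: pin dependencies for a SageMaker training job via requirements.txt.
#
# src/
# ├── train.py
# └── requirements.txt   # e.g. "torch==2.1.2" and other exact pins
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",
    source_dir="src",                     # requirements.txt here is installed at container startup
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    framework_version="2.1",              # illustrative versions
    py_version="py310",
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
)
```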
🌐
Dive into Deep Learning
d2l.ai › chapter_appendix-tools-for-deep-learning › sagemaker.html
23.2. Using Amazon SageMaker — Dive into Deep Learning 1.0.3 documentation
Deep learning applications may demand computational resources that easily go beyond what your local machine can offer. Cloud computing services allow you to run the GPU-intensive code of this book more easily on more powerful machines. This section will introduce how to use Amazon SageMaker to run the code of this book. First, we need to sign up for an account at https://aws...
🌐
Pluralsight
pluralsight.com › blog › guides
Build Your First Deep Learning Solution with AWS Sagemaker | Online Courses, Learning Paths, and Certifications - Pluralsight
You're well on your way to skillfully using Amazon SageMaker to accomplish your and your organization's goals. 1: A low-level service on AWS that allows users to spin up Virtual Machines with specified computational capability and configurations.
🌐
Stack Overflow
stackoverflow.com › questions › 77675598 › deploying-my-own-created-custom-deep-learning-model-on-aws-sagemaker
amazon ec2 - Deploying my own created (custom) Deep learning Model on AWS SageMaker - Stack Overflow
I'm new to AWS SageMaker, and my question may seem silly. I've created a deep learning model that requires GPU and deployed that model on EC2 instances (g4dn.4xlarge) with Auto Scaling and a load balancer.
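A hedged sketch of one way to move the EC2-hosted model in that question onto a SageMaker endpoint: wrap the trained artifact in a PyTorchModel and deploy it to a GPU instance. The S3 path, role, script name, and versions are placeholders.

```python
# Sketch: deploy a custom PyTorch model to a SageMaker real-time endpoint.
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="s3://my-bucket/model/model.tar.gz",    # placeholder trained artifact
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    entry_point="inference.py",                         # implements model_fn/predict_fn (placeholder)
    framework_version="2.1",                            # illustrative versions
    py_version="py310",
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",   # GPU instance comparable to the EC2 g4dn in the question
)
```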
🌐
GitHub
github.com › aws › deep-learning-containers
GitHub - aws/deep-learning-containers: One stop shop for running AI/ML on AWS.
Seamlessly integrate AWS Deep Learning Containers with Amazon SageMaker's managed MLflow service to streamline your ML experiment tracking, model management, and deployment workflow.
Starred by 1.1K users
Forked by 522 users
Languages   Python 92.9% | Shell 5.9%
🌐
Amazon Web Services
aws.amazon.com › machine learning › amazon sagemaker ai › amazon sagemaker inference
Machine Learning Inference - Amazon SageMaker Model Deployment - AWS
6 days ago - Amazon SageMaker inference supports ... machine learning frameworks such as TensorFlow, PyTorch, ONNX, and XGBoost. If none of the pre-built Docker images serve your needs, you can build your own container for use with CPU-backed multi-model endpoints. SageMaker inference supports most popular model servers, such as TensorFlow Serving, TorchServe, NVIDIA Triton, and the AWS Multi Model Server. Amazon SageMaker AI offers specialized deep learning ...
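A rough sketch of the multi-model endpoint idea mentioned in that snippet, using the SDK's MultiDataModel; the endpoint name, S3 prefix, image URI, and role are placeholders, and individual model archives are assumed to live under the prefix.

```python
# Sketch: host many model.tar.gz artifacts behind one multi-model endpoint.
from sagemaker.multidatamodel import MultiDataModel

mme = MultiDataModel(
    name="my-multi-model-endpoint",                          # placeholder
    model_data_prefix="s3://my-bucket/mme-models/",           # all model.tar.gz files live under here
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference-image:latest",  # placeholder
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",                         # placeholder
)

predictor = mme.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
# Each invocation names the model it wants loaded, e.g.:
# predictor.predict(payload, target_model="model-a.tar.gz")
```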
🌐
AWS
aws.amazon.com › blogs › machine-learning › accelerate-deep-learning-model-training-up-to-35-with-amazon-sagemaker-smart-sifting
Accelerate deep learning model training up to 35% with Amazon SageMaker smart sifting | Artificial Intelligence
November 30, 2023 - In this post, we explored the public preview of smart sifting, a new capability of SageMaker that can reduce deep learning model training costs by up to 35%. This feature improves data efficiency during training by filtering out less informative data samples.
🌐
Hugging Face
huggingface.co › docs › sagemaker › en › index
Hugging Face on AWS
Hugging Face provides Inference Deep Learning Containers (DLCs) to AWS users: optimized environments preconfigured with Hugging Face libraries for inference, natively integrated into the SageMaker SDK and JumpStart.
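A hedged sketch of deploying a Hub model with the Hugging Face DLCs described above; the library versions and the model id are illustrative placeholders.

```python
# Sketch: deploy a Hugging Face Hub model to a SageMaker endpoint.
from sagemaker.huggingface import HuggingFaceModel

hf_model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",  # example Hub model
        "HF_TASK": "text-classification",
    },
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    transformers_version="4.37",    # illustrative version combination
    pytorch_version="2.1",
    py_version="py310",
)

predictor = hf_model.deploy(initial_instance_count=1, instance_type="ml.g5.xlarge")
print(predictor.predict({"inputs": "SageMaker makes deployment straightforward."}))
```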
🌐
AWS
docs.aws.amazon.com › amazon sagemaker › developer guide › model training › distributed training in amazon sagemaker ai
Distributed training in Amazon SageMaker AI - Amazon SageMaker AI
SageMaker AI provides distributed training libraries and supports various distributed training options for deep learning tasks such as computer vision (CV) and natural language processing (NLP). With SageMaker AI’s distributed training libraries, you can run highly scalable and cost-effective ...