What is the current status of AWS Lambda?
How can I get notified when AWS Lambda is not working or has outages?
StatusGator uses its own monitoring algorithms to detect potential issues and sends you Early Warning Signals even before the official status page updates. If AWS Lambda is down, you'll know immediately. Sign up now to stay on top of current status updates; it's free.
Where do you get the official AWS Lambda status?
We use the official AWS Lambda status page. Here are links to the status page and other resources for checking whether AWS Lambda is down:
- AWS Lambda Status Page
- AWS Lambda Home Page
- @awscloud
Is AWS down for everyone? I'm seeing very slow responses.
It all started when I was trying to buy something from Mercado Livre, one of the biggest portals here in Brazil. I couldn't load my account details or cart, or change profile settings such as adding a credit card.
So I decided to buy it from Amazon, same behavior. Went to Brazil's Down Detector and it seems to me that all services that rely on AWS are failing.
Went to the US Down Detector site and I am seeing what seems to be the same cascading failure right now.
Anyone facing similar problems?
For a long time, it was standard practice at AWS to publish post mortem reports in the wake of significant outages. You can find a list going back a decade here. However, nothing has been added since December 2021.
My work environment is pretty simple compared to what lots of organizations do in AWS – we run apps on a low number of EC2 instances, in ASGs with very basic scaling policies, connecting to RDS backends, and putting certain static assets in S3. Everything is built for cross-AZ tolerance, and we leverage CDN caching (Fastly) heavily so despite being 98% hosted in us-east-1 we haven't gotten bit by the big outages the way a lot of orgs have.
We do have some things in Lambda, however, that it would be nice to be able to rely on. When the recent outage happened, I knew I'd have to review our resilience model to see how we could avoid being impacted by a similar outage in the future. Replicating functions across regions is not difficult, but some of the things they interact with would need to be replicated as well, or at least made reachable from outside our main VPC, so it's not that easy.
My initial thought was to wait and see what the post mortem looked like, because from what I was hearing at the time, having functions hosted in other regions might not have provided a working failover option. My impression, based only on what people were posting in this forum and elsewhere, was that control plane services were the source of the problem: lots of mentions of IAM not working, maybe due to STS going down? My own experience was that I couldn't get the Lambda console to load in any region, not just USE1. So on the face of it, replicating functions out of region might not buy us anything at all.
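To make the cross-region idea concrete, here is a minimal sketch of caller-side failover, assuming the same function has already been deployed under the same name in each region. The names `invoke_with_failover` and `boto3_invoker` are hypothetical helpers written for illustration, not part of any AWS SDK. The only real SDK call used is boto3's Lambda `invoke`. As noted above, this would not help if the failure sits in a shared dependency such as IAM or STS rather than in one region's Lambda service.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical type alias: (region, payload) -> raw response body.
InvokeFn = Callable[[str, bytes], bytes]


def invoke_with_failover(
    regions: Iterable[str],
    payload: bytes,
    invoke: InvokeFn,
) -> Tuple[str, bytes]:
    """Try each region in order; return (region, response) from the first success."""
    errors = []
    for region in regions:
        try:
            return region, invoke(region, payload)
        except Exception as exc:  # in real use, narrow this to timeout/5xx-style errors
            errors.append(f"{region}: {exc}")
    raise RuntimeError("all regions failed: " + "; ".join(errors))


def boto3_invoker(function_name: str) -> InvokeFn:
    """Build an invoker backed by boto3 (imported lazily so the sketch runs without it)."""
    import boto3

    def _invoke(region: str, payload: bytes) -> bytes:
        client = boto3.client("lambda", region_name=region)
        # Note: errors raised inside the function itself do not raise here;
        # they are reported via the "FunctionError" key of the response.
        resp = client.invoke(FunctionName=function_name, Payload=payload)
        return resp["Payload"].read()

    return _invoke
```

Usage would look something like `invoke_with_failover(["us-east-1", "us-west-2"], b"{}", boto3_invoker("my-function"))`, where `my-function` is a placeholder. Keeping the invoker injectable means the failover logic can be exercised with a fake that simulates a regional outage, without touching AWS.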
So my question is twofold. The first part is: have they really given up on issuing post mortems in these cases, and if so, doesn't that kinda suck? The second, maybe more useful part is: in the absence of a detailed report naming a root cause, what are folks here doing to improve the resilience of their Lambda-based services?
You can see from my user history that I am just a lurker here, but I appreciate any feedback from this community that I glean so much from – thank you.