Amazon Athena is well suited for interactive analytics and exploration of data in your data lake, or in any data source reachable through its extensible connector framework, without worrying about ingesting or processing data. Amazon Athena is built on open-source engines and frameworks such as Spark, Presto, and Apache Iceberg, giving customers the flexibility to use Python or SQL, or to work on open data formats. If customers want to do ***interactive analytics*** using open-source frameworks and data formats, Amazon Athena is a great place to start.

Amazon Redshift is a fully managed data warehouse. With its Massively Parallel Processing (MPP) architecture that separates storage and compute, and its machine-learning-led automatic optimization capabilities, a data warehouse like Amazon Redshift, whether serverless or provisioned, is a great choice for customers that need the best price performance at any scale for ***complex BI and analytics workloads***. Using Amazon Redshift, you can also directly query data in open formats (such as Parquet or ORC) in the Amazon S3 data lake, or query data in operational databases, such as Amazon Aurora and Amazon RDS PostgreSQL and MySQL.

Amazon QuickSight is a cloud-scale business intelligence (BI) service that you can use to deliver easy-to-understand insights to the people you work with, wherever they are. Amazon QuickSight connects to your data in the cloud and combines data from many different sources.

Here is a document on the Modern Data Architecture with AWS, which gives an idea of how the purpose-built services can be used together. [https://aws.amazon.com/big-data/datalakes-and-analytics/modern-data-architecture/]

Answer from Srikant Das on repost.aws
ChaosSearch
chaossearch.io › blog › when-to-deploy-aws-redshift-or-athena-use-cases
AWS Redshift vs AWS Athena: Best Use Cases for Each
April 29, 2024 - Another way to think about the differences between Redshift and Athena is to focus on the varying use cases that each service lends itself to. Examples of use cases that are a good fit for Redshift include: Event log analytics: Cloud application and event logs are an example of structured, consistent data that is easy to analyze within Redshift clusters. Real-time analytics: AWS Redshift can be integrated with data stream processing services like Amazon Kinesis to enable near real-time analysis of large-scale data streams.
Reddit
reddit.com › r/aws › redshift vs athena
r/aws on Reddit: Redshift vs Athena
May 11, 2022 -

I'm building a web service via API Gateway that would allow users to run queries on a DB. The data is in S3 and I thought of using Athena and have Lambda run queries against it. Thing is, I see a lot of similar designs but with Redshift instead of Athena. One of our Principal Engineers said Redshift fits better for a web service compared to Athena (but I didn't ask why). Any idea why it's the case?

EDIT: for context the data in S3 is parquet and it is partitioned. I'm expecting a moderate number of users using the API.
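
For the design described in this question, the Lambda side would roughly start a query, poll until it reaches a terminal state, and then hand back the query ID. The following is a minimal sketch, assuming a client that behaves like `boto3.client("athena")` is passed in; the function name `run_athena_query`, the polling interval, and the poll limit are hypothetical choices made for illustration.

```python
import time


def run_athena_query(client, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Start an Athena query and poll until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("athena").
    Returns (query_execution_id, final_state).
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)

    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

Because Athena queries run asynchronously, a web service may prefer to return the query ID immediately and let the caller poll, rather than holding the HTTP request open while the loop runs.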

Upsolver
upsolver.com › home › blog › athena or redshift? 4 questions to decide
Athena or Redshift? 4 Questions to Decide | Upsolver
May 28, 2024 - Redshift is the more natural choice for data warehouse reporting, Athena for ad-hoc queries against S3 storage. Redshift would be the better choice if you have data coming in from diverse sources and you would like to transform that data, enforce ...
Discussions

AWS RDS, Amazon Athena, Amazon QuickSight and Amazon Redshift
Hello 1. What is the difference between Amazon Athena and Amazon Redshift? 2. Which one is the right choice? 3. If we are using RDS, Amazon Athena and Amazon Redshift together in an architecture. How do... More on repost.aws
repost.aws
March 30, 2023
amazon web services - Comparison between using DBT + (Athena vs Redshift or Snowflake) as a data warehouse - which path should I take? - Stack Overflow
My knowledge is mostly limited to AWS services.) ... Thanks John, I didn't know a lot of that! I'm keen to understand the pros and cons of using Athena vs Redshift - why would you choose one over the other? More on stackoverflow.com
stackoverflow.com
amazon web services - Athena vs Redshift Spectrum - Stack Overflow
Connectivity. It's easy enough to connect to Athena using the API, JDBC or ODBC, but many more products offer a "standard out of the box" connection to Redshift · Also, for either solution, make sure you use the AWS Glue metadata, rather than Athena, as there are fewer limitations. More on stackoverflow.com
stackoverflow.com
Redshift vs Athena : aws
I'm building a web service via API Gateway that would allow users to run queries on a DB. The data is in S3 and I thought of using Athena and have... More on old.reddit.com
๐ŸŒ r/aws
๐ŸŒ
Firebolt
firebolt.io โ€บ comparison โ€บ athena-vs-redshift
Athena vs Redshift | Performance & Pricing: Comparison Guide
Thus, 8 RPU is equivalent to 16 vCPU / 128GB RAM. The minimum RPU is 8. Athena is a shared multi-tenant resource, with no guarantees on the amount or availability of the resources allocated for your queries.
AWS
aws.amazon.com › blogs › big-data › query-aws-glue-data-catalog-views-using-amazon-athena-and-amazon-redshift
Query AWS Glue Data Catalog views using Amazon Athena and Amazon Redshift | Amazon Web Services
August 8, 2024 - The Sales business unit's data steward (AWS Identity and Access Management (IAM) role: product_owner_role), who owns the customer and customer address datasets, plans to create and share non-sensitive details of preferred customers with the Marketing unit's data analyst (business_analyst_role) for their campaign use case. The Marketing team analyst plans to use Athena for interactive analysis for the marketing campaign and later, use Amazon Redshift to generate the campaign report.
Edge Delta
edgedelta.com › company › knowledge-center › athena-vs-redshift
Athena vs Redshift: Choosing the Right AWS Analytics Tool
May 6, 2025 - ... When it comes to data analytics ... very different purposes. Athena is a fully serverless query engine that lets you analyze data directly from Amazon S3 with no need to manage infrastructure....
Chartio
chartio.com › resources › tutorials › redshift-vs-athena
Redshift vs Athena | Tutorial by Chartio
June 6, 2016 - Redshift is best used for large and structured datasets. Athena is an interactive query service that allows you to conveniently analyze data stored in Amazon Simple Storage Service (S3) by using basic SQL.
Top answer
1 of 1
5

Reporting systems need a means of running SQL against data.

Traditionally, this meant that a Database was required, and all databases (at the time) consisted of both Storage and Compute. There was no capability to separate these two components because the database stored its data in a proprietary format and the Compute component was required to access that data.

As data volumes increased, traditional databases struggled to provide fast performance. This led to a new class of Data Warehouse systems that specialise in querying tables with billions of rows and Terabytes of data. These systems typically use parallel infrastructure and columnar storage split across multiple storage nodes to provide fast performance. Examples are: Amazon Redshift, Snowflake.

The next evolution came from Presto (and can be traced back to Hadoop), which introduced the idea of completely separating the Compute and Storage components of databases. Optimized for querying, Presto (known as a 'query engine') could query data stored in cloud services (e.g. Amazon S3) without having to load the data into a database. This was not only a mind-blowing concept, but depending upon the data format (e.g. Snappy-compressed Parquet) it could actually rival the speed of Data Warehouses. Plus, because these engines are cloud-native, it was easy to scale Compute as needed for short periods of time.

The main thing to understand about Query Engines is that data is not 'loaded' into them. Rather, when a query runs, they go to the storage service, look at data stored in whatever format and then calculate the answer to the query. Data can be added by simply adding another file in the storage location.
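
The "no load step" idea above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a local temporary directory stands in for the storage service, CSV stands in for the file format, and the helper names `total_amount` and `write_file` are hypothetical. It is not how Presto or Athena are implemented.

```python
import csv
import tempfile
from pathlib import Path


def total_amount(storage_dir):
    """Scan every CSV file under storage_dir at query time and sum `amount`."""
    total = 0.0
    for path in sorted(Path(storage_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                total += float(row["amount"])
    return total


def write_file(storage_dir, name, amounts):
    """Add data by dropping a new file into the storage location."""
    with (Path(storage_dir) / name).open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["amount"])
        writer.writerows([[amount] for amount in amounts])


with tempfile.TemporaryDirectory() as storage:
    write_file(storage, "part-0001.csv", [10.0, 5.5])
    first = total_amount(storage)    # the query reads the one file present

    write_file(storage, "part-0002.csv", [4.5])
    second = total_amount(storage)   # the new file is picked up without any load step
```

The second query sees the extra rows only because a file appeared in the directory; nothing was inserted into the "engine" itself.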

Examples of Query Engines are: Presto, Amazon Athena, Amazon Redshift Spectrum

The downside of using Query Engines is that they are not good at inserting/updating data. This has been addressed by the Delta Lake file format, which uses a combination of Parquet files and log files to allow data to be inserted, updated and deleted. This is the main focus of Databricks.
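
The idea of combining immutable data files with a log can be sketched as a toy model. This is an illustration only, not the actual Delta Lake protocol: plain Python lists stand in for Parquet files, the log is a list of hypothetical `("add" | "remove", file_name)` actions, and `live_files` and `read_table` are made-up helper names.

```python
def live_files(log):
    """Replay an append-only log of ("add" | "remove", file_name) actions."""
    files = set()
    for action, name in log:
        if action == "add":
            files.add(name)
        elif action == "remove":
            files.discard(name)
    return files


def read_table(storage, log):
    """Return all rows from the files that the log says are current."""
    rows = []
    for name in sorted(live_files(log)):
        rows.extend(storage[name])
    return rows


# Immutable "files": once written, their contents are never changed.
storage = {"f1": [("alice", 10), ("bob", 20)]}
log = [("add", "f1")]

# "Update" bob: write a new file with the changed rows, then swap it in via the log.
storage["f2"] = [("alice", 10), ("bob", 25)]
log += [("remove", "f1"), ("add", "f2")]

current = read_table(storage, log)
```

Readers that replay the log see the updated row, while the old file is left untouched on storage.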

Of course, if your data needs are small, it is quite acceptable to use a traditional database (eg PostgreSQL) as a Data Warehouse.

The best approach is to start with something small until it no longer meets your needs. Then, move to something more powerful. If Amazon Athena is meeting your needs, then there is no need to move to anything else.

(My apologies for not including Google and other services as examples. My knowledge is mostly limited to AWS services.)

Sprinkle Data
sprinkledata.com › blogs › athena-vs-redshift-unraveling-the-battle-of-cloud-data-warehouses
Amazon Athena vs. Amazon Redshift: Choosing the Right Data Warehousing Solution
March 15, 2024 - They can be effortlessly integrated with AWS Glue for ETL processing, AWS Lambda for serverless data transformations, and Amazon QuickSight for data visualization and business intelligence. Redshift's MPP architecture and compatibility with various BI tools make it a powerful choice for enterprises looking to build sophisticated data analytics pipelines and derive actionable insights. On the other hand, Athena's simplicity and compatibility with standard SQL enable users to interact with the data in S3 directly, eliminating the need for complex data transformations and enhancing the data exploration process
AWS
docs.aws.amazon.com › amazon athena › user guide › what is amazon athena? › when should i use athena?
When should I use Athena? - Amazon Athena
The query engine in Amazon Redshift has been optimized to perform especially well on running complex queries that join large numbers of very large database tables. When you need to run queries against highly structured data with lots of joins across lots of very large tables, choose Amazon Redshift...
ExamTopics
examtopics.com › exams › amazon › aws-certified-solutions-architect-associate-saa-c03 › view
AWS Certified Solutions Architect - Associate SAA-C03 Exam - Free Exam Q&As, Page 1 | ExamTopics
4 days ago - A. Use Amazon Redshift to load all the content into one place and run the SQL queries as needed. B. Use Amazon CloudWatch Logs to store the logs. Run SQL queries as needed from the Amazon CloudWatch console. C. Use Amazon Athena directly with Amazon S3 to run the queries as needed. D. Use AWS Glue to catalog the logs.
AWS
docs.aws.amazon.com › amazon athena › user guide › use athena sql › connect to data sources › use amazon athena federated query › available data source connectors › amazon athena redshift connector
Amazon Athena Redshift connector - Amazon Athena
For more information, see Lambda quotas in the AWS Lambda Developer Guide. Because Redshift does not support external partitions, all data specified by a query is retrieved every time. Like Redshift, Athena treats trailing spaces in Redshift CHAR types as semantically insignificant for length ...
RisingWave
risingwave.com › blog › aws-athena-vs-redshift-which-is-more-cost-effective
AWS Athena vs Redshift: Which is More Cost-Effective? - RisingWave: Real-Time Event Streaming Platform
AWS Athena and Amazon Redshift stand out as two powerful data services in the cloud analytics landscape. AWS Athena Pricing offers a serverless, pay-per-query model, making it highly cost-effective for ad-hoc queries and exploratory analysis.
Predictivehacks
predictivehacks.com
Redshift vs EMR vs Athena vs S3 Select vs Glacier Select – Predictive Hacks
We have provided several tutorials on AWS Athena, S3 Select etc. The question is when to use each service.
RisingWave
risingwave.com › blog › redshift-vs-athena-choose-the-best-aws-service
Redshift vs Athena: Choose the Best AWS Service - RisingWave: Real-Time Event Streaming Platform
June 21, 2024 - Amazon Redshift and Amazon Athena are two powerful AWS services that cater to different data processing needs. Amazon Redshift is a fully managed data warehouse service designed for storing and querying large datasets efficiently.
Hevo Data
hevodata.com › home › blog › data warehousing
Amazon Redshift Vs Athena: Compare On 7 Key Factors
December 30, 2024 - In Redshift, both compute and storage layers are coupled, however in Redshift Spectrum, compute and storage layers are decoupled. Athena is a serverless analytics service where an Analyst can directly perform the query execution over AWS S3.
Webuters
webuters.com › amazon-redshift-vs-athena
Amazon Redshift vs Athena: Features, Pricing & Use
As far as AWS Athena vs Redshift spectrum is concerned, the former has an edge over the latter in terms of partitioning. While Redshift does not have the feature of partitioning a key on its own, Athena does the job effortlessly.
Top answer
1 of 5
31

I have used both across a few different use cases and conclude:

Advantages of Redshift Spectrum:

  • Allows creation of Redshift tables
  • Able to join Redshift tables with Redshift Spectrum tables efficiently

If you do not need those things, then you should consider Athena as well.

Athena differences from Redshift Spectrum:

  • Billing. This is the major difference, and depending on your use case you may find one much cheaper than the other.
  • Performance. I found Athena slightly faster.
  • SQL syntax and features. Athena is derived from Presto and is a bit different from Redshift, which has its roots in Postgres.
  • Connectivity. It's easy enough to connect to Athena using the API, JDBC or ODBC, but many more products offer a "standard out of the box" connection to Redshift.

Also, for either solution, make sure you use the AWS Glue metadata, rather than Athena, as there are fewer limitations.
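
The billing point can be made concrete with a back-of-the-envelope comparison between paying per terabyte scanned and paying for an always-on cluster. This is a rough sketch: the function names are hypothetical, and the rates passed in are placeholders to be replaced with current prices for your region, not published AWS figures.

```python
HOURS_PER_MONTH = 720  # assumption: 30 days x 24 hours


def monthly_scan_cost(tb_scanned, price_per_tb):
    """Cost of a pay-per-scan service for one month."""
    return tb_scanned * price_per_tb


def monthly_cluster_cost(price_per_hour, hours=HOURS_PER_MONTH):
    """Cost of a cluster that is billed for every hour it is running."""
    return price_per_hour * hours


def break_even_tb(price_per_hour, price_per_tb, hours=HOURS_PER_MONTH):
    """TB scanned per month at which pay-per-scan costs as much as the cluster."""
    return monthly_cluster_cost(price_per_hour, hours) / price_per_tb
```

For example, calling `break_even_tb(price_per_hour=1.0, price_per_tb=5.0)` with those placeholder rates gives the monthly scan volume below which pay-per-scan is the cheaper option. A fuller comparison would also account for any per-scan charges incurred on top of the cluster.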

2 of 5
16

This question has been up for quite a time, but still, I think I can contribute something to the discussion.

What is Athena?

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. (From the Doc)

Pretty straightforward, right?

Then comes the question: what is Redshift Spectrum, and why did the Amazon folks build it when Athena was already pretty much a solution for external table queries?

So, the AWS folks wanted to create an extension to Redshift (which is pretty popular as a managed columnar datastore at this time) and give it the capability to talk to external tables (typically S3). They wanted to make life easier for Redshift users, mostly analytics people: many analytics tools don't support Athena but do support Redshift at this time. But creating your Redshift cluster and storing data in it was a bottleneck. Also, Redshift isn't that horizontally scalable, and adding new machines takes some downtime. If you are a Redshift user, making your storage cheaper basically makes your life much easier.

I suggest you use Redshift spectrum in the following cases:

  • You are an existing Redshift user and you want to store more data in Redshift.

  • You want to move colder data to an external table but still, want to join with Redshift tables in some cases.

  • You unload your data with Spark and just want to import it into Pandas or other tools for analysis.

And Athena can be useful when:

  • You are a new user and don't have a Redshift cluster. Access to Spectrum requires an active, running Redshift instance, so Redshift Spectrum is not an option without Redshift.
  • Spectrum is still a developing tool, and features such as transactions are still being added to make it more efficient.
  • BTW, Athena comes with a nice REST API, so go for it if you want that.

All to say, Redshift + Redshift Spectrum is indeed powerful and holds a lot of promise. But it still has a long way to go before it is mature.

StackShare
stackshare.io › stackups › amazon-athena-vs-amazon-redshift
Amazon Athena vs Amazon Redshift
It is optimized for datasets ranging ... of most traditional data warehousing solutions. Amazon Athena can be classified as a tool in the "Big Data Tools" category, while Amazon Redshift is grouped under "Big Data as a Service"....