I'm building a web service via API Gateway that lets users run queries against a database. The data is in S3, and I thought of using Athena and having Lambda run queries against it. Thing is, I see a lot of similar designs but with Redshift instead of Athena. One of our Principal Engineers said Redshift fits a web service better than Athena (but I didn't ask why). Any idea why that's the case?
EDIT: for context, the data in S3 is Parquet and it is partitioned. I'm expecting a moderate number of users on the API.
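For what it's worth, one practical difference for this design is that Athena's API is asynchronous: a Lambda handler has to start the query, then poll until it reaches a terminal state, which adds latency a synchronous Redshift connection wouldn't. A minimal sketch of that flow, assuming a boto3 Athena client (the bucket and database names are placeholders; the client is passed in so it can be stubbed):

```python
import time

def run_athena_query(athena, sql, database, output_location, poll_seconds=1):
    """Start an Athena query and block until it reaches a terminal state.

    `athena` is a boto3 Athena client (or anything with the same interface).
    Athena is asynchronous: start_query_execution returns immediately, and
    the caller must poll get_query_execution until the query finishes.
    """
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return qid, state
        time.sleep(poll_seconds)
```

In a Lambda behind API Gateway, that polling either eats into the request timeout or forces an async result-callback design, which is one reason a warm Redshift cluster is often suggested for interactive web workloads.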
I have a bunch of small Avro files on S3 and I need to build a data warehouse on top of them. With Redshift, the same queries take about 10x longer than with Athena. What might I be doing wrong?
The final objective is to have this data in a Redshift table.
Hi, in my company we use DynamoDB to store all our data. I implemented an AWS Glue ETL pipeline to export the DynamoDB data to S3 (in Parquet format), and we use Athena to run our ad-hoc aggregate queries; the job is cron-scheduled. We use Athena as the data source in QuickSight to generate reports, and this meets our current requirements. Now my company wants to build an analytics platform on this data for other customers. The platform is UI-driven and could serve more than 1,000 users. What would be the best approach for storing the data: should we continue using Athena, or move to something like Redshift or another service? I am pretty new to this area. Apologies for any premature assumptions. Thanks in advance.
Anyone have any specific use cases or rationale where Redshift would be preferable to S3/Athena (with proper formatting, partitioning, etc.), both with a reporting engine on top?
For large data sets, Redshift clusters are expensive. However, AWS must have built Spectrum for a reason, so I'm looking for some sort of situational breakdown, since it 'seems' more cost-efficient to just use S3/Athena, provided performance isn't dreadful.
It's more difficult to update data with S3/Athena.
If you have static data, then go for it. Or, if it's mixed, put the static stuff (or batch-updated stuff) in S3 and query it with Spectrum, and use regular Redshift for things that need updates.
I don't have hard data to back this up, but I think Redshift's optimizer is quite a bit more mature than Presto's (the engine behind Athena), which means complex queries are much more likely to end up with suboptimal plans on Athena. "Suboptimal" can mean orders-of-magnitude performance hits.
From my experience, Redshift will also be better when you need interactive-speed queries, let's say when you're pulling data for a dashboard. By interactive speed I'm thinking under 6 seconds in this case, ideally 0.5-2 seconds.
Finally, there may be some cases that are less expensive on Redshift - say very frequent scans of large volumes of data?
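The cost side of that comparison is easy to sketch: Athena bills per data scanned (the long-standing list price is $5/TB, with a roughly 10 MB minimum billed per query; check current AWS pricing, as these are assumptions), while a Redshift cluster is a fixed monthly cost regardless of query volume. A back-of-the-envelope model:

```python
# Back-of-the-envelope Athena cost model. Prices are illustrative
# assumptions (Athena's long-standing list price), not a quote.
ATHENA_USD_PER_TB = 5.0
ATHENA_MIN_BYTES = 10 * 1024**2   # ~10 MB minimum billed per query
TB = 1024**4

def athena_monthly_cost(queries_per_month, bytes_scanned_per_query):
    """Monthly Athena spend for a uniform query workload."""
    billed = max(bytes_scanned_per_query, ATHENA_MIN_BYTES)
    return queries_per_month * billed / TB * ATHENA_USD_PER_TB

# e.g. 10,000 queries/month, each scanning 1 GB of Parquet:
cost = athena_monthly_cost(10_000, 1024**3)
# ~48.8 USD/month, far below even a small Redshift cluster; the picture
# flips as per-query scan volume or query frequency grows.
```

This is why "very frequent scans of large volumes" is exactly the regime where a fixed-price cluster starts to win.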
I'm getting ready to implement some very basic warehousing. A few hundred TB of data, accessed infrequently (monthly and yearly report generation for the most part), performance not an issue.
I started looking at Athena a couple weeks ago and it seems like, if I partition my data well, a "data lake" may be all I need. I put that in quotes because I would do some document standardizing before storing; it wouldn't be just raw data. I am considering storing everything in Parquet to keep data-scan costs down, but given the relatively small amount of data (mostly temporal, so partitioning is easy) I might just use JSON for future flexibility.
Has anyone gone this route and found road blocks?
Searching/querying costs money per byte scanned, so using Parquet or a similar columnar format is key.
Your file naming / partitioning is really important too.
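Concretely, "good partitioning" for Athena usually means Hive-style `key=value` path segments, so that a WHERE clause on the partition columns lets the engine skip whole S3 prefixes instead of scanning everything. A sketch of that layout (bucket and table names here are made up):

```python
# Hive-style partition layout: each partition column becomes a key=value
# path segment, and Athena prunes any prefix a query's WHERE clause
# rules out, so you only pay for the data under the matching prefixes.
def partition_prefix(bucket, table, **partitions):
    segments = "/".join(f"{k}={v}" for k, v in partitions.items())
    return f"s3://{bucket}/{table}/{segments}/"

# A query filtered to year = 2019 AND month = '03' only scans this prefix:
prefix = partition_prefix("my-data-lake", "events", year=2019, month="03")
# 's3://my-data-lake/events/year=2019/month=03/'
```

Partition on the columns your queries actually filter by (dates are the usual choice for temporal data), and avoid so many partitions that each holds only tiny files.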
Article from this sub that was helpful http://tech.marksblogg.com/billion-nyc-taxi-rides-aws-athena.html
I've done something similar (not at the same scale of data), and it does work well.
Make sure you design your partitions well to be cost effective
Also be aware that you basically can't create objects in Athena, so you're going to be limited to what is there (no new tables).
I don't know why, but I have a hard time wrapping my head around databases. Most of the other services I'm fairly comfortable with.
Edit: Also, I'd appreciate some recommendations for whitepapers on how particular companies shifted their workloads to the cloud. It's cool to see how architects implement things.
Not sure about whitepapers but this video is great: https://www.youtube.com/watch?v=-pb-DkD6cWg
Let me try to put how I see it in plain English:
Relational database in general => probably Amazon Aurora
Specific relational database engine and version => Amazon RDS
Non-relational low-latency high-scale => Amazon DynamoDB
In-memory cache => Amazon ElastiCache
In-memory cache for DynamoDB only => DynamoDB DAX
High-scale analytics / data warehousing => Amazon Redshift
Analytics on top of S3 Data => Amazon Athena
Analytics on top of S3 Data if already using Redshift => Redshift Spectrum
Documents with MongoDB Compatibility => DocumentDB
Search indexing => Amazon Elasticsearch Service
Immutable and cryptographically verifiable ledger => Amazon QLDB
Time series database => Timestream (preview)
Also, this video goes in depth on how Robinhood migrated their data warehouse to AWS.
As data engineers, choosing between Amazon Redshift and Athena often comes down to tradeoffs in performance, cost, and maintenance.
I recently published a technical case study diving into:
🔹 Query Performance: Redshift’s optimized columnar storage vs. Athena’s serverless scatter-gather
🔹 Cost Efficiency: When Redshift’s reserved instances beat Athena’s pay-per-query model (and vice versa)
🔹 Operational Overhead: Managing clusters (Redshift) vs. zero-infra (Athena)
🔹 Use Case Fit: ETL pipelines, ad-hoc analytics, and concurrency limits
Spoiler: Athena’s cold starts can be brutal for sub-second queries, while Redshift’s vacuum/analyze cycles add hidden ops work.
Full analysis here:
👉 Amazon Redshift & Athena as Data Warehousing Solutions
Discussion:
- How do you architect around these tools' limitations?
- Any war stories tuning Redshift WLM or optimizing Athena's Glue catalog?
- For greenfield projects in 2025, would you still pick Redshift, or go Athena/Lakehouse?
I have a use case where I have to collect 1k events/sec. They have to be queryable, but read query volume is not too high. For example, I have an event like this:
{
  "id": "1",
  "type": "start",
  "publisher": " ",
  "company": " "
}
I want to be able to query over publisher, company, type, id, etc. I don't need full-text search or anything like that. Essentially I am using Elasticsearch as a NoSQL database, but instead of just querying by the key, I want to query by a variety of columns. I was wondering how Redshift would compare for this. The total data size would be 3-4 TB. Given that the events won't change and I need to query them on type, publisher, company, etc., how effective would Elasticsearch be as a NoSQL database?
I have used both across a few different use cases and conclude:
Advantages of Redshift Spectrum:
- Allows creation of Redshift tables
- Able to join Redshift tables with Redshift spectrum tables efficiently
If you do not need those things then you should consider Athena as well
Athena differences from Redshift spectrum:
- Billing. This is the major difference and depending on your use case you may find one much cheaper than the other
- Performance. I found Athena slightly faster.
- SQL syntax and features. Athena is derived from Presto and is a bit different from Redshift, which has its roots in Postgres.
- Connectivity. It's easy enough to connect to Athena using the API, JDBC, or ODBC, but many more products offer "standard out of the box" connections to Redshift.
Also, for either solution, make sure you use the AWS Glue Data Catalog rather than Athena's internal catalog, as there are fewer limitations.
This question has been up for quite a while, but I still think I can contribute something to the discussion.
What is Athena?
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. (From the Doc)
Pretty straightforward, right?
Then comes the question: what is Redshift Spectrum, and why did the Amazon folks make it when Athena was already pretty much a solution for external table queries?
So, the AWS folks wanted to create an extension to Redshift (which was already popular as a managed columnar datastore) and give it the capability to talk to external tables (typically on S3). But they wanted to make life easier for existing Redshift users, mostly analytics people. Many analytics tools don't support Athena but do support Redshift. Meanwhile, growing your Redshift cluster to store everything was a bottleneck: Redshift isn't that horizontally scalable, and adding new machines takes some downtime. If you are a Redshift user, making your storage cheaper basically makes your life so much easier.
I suggest you use Redshift Spectrum in the following cases:
- You are an existing Redshift user and you want to store more data in Redshift.
- You want to move colder data to an external table but still want to join it with Redshift tables in some cases.
- You unload data with Spark, or you just want to import data into Pandas or other tools for analysis.
And Athena can be useful when:
- You are a new user and don't have a Redshift cluster. Access to Spectrum requires an active, running Redshift cluster, so Redshift Spectrum is not an option without Redshift.
- Spectrum is still a developing tool; features like transactions are still being added to make it more efficient.
- BTW, Athena comes with a nice REST API, so go for it if you want that.
All in all, Redshift + Redshift Spectrum is indeed powerful, with lots of promise, but it still has a long way to go to mature.
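To make the "existing Redshift user" path concrete: Spectrum is enabled by mapping a Glue (or Athena) catalog database into Redshift as an external schema, after which the same S3 tables are queryable from both engines. A sketch that just builds the DDL string (the schema name, database name, and IAM role ARN below are all placeholders):

```python
# Redshift Spectrum setup: an external schema maps a data-catalog database
# into Redshift, so catalog tables become joinable with local tables.
# Names and the role ARN are placeholders, not real resources.
def create_external_schema_sql(schema, catalog_database, iam_role_arn):
    return (
        f"CREATE EXTERNAL SCHEMA {schema} "
        f"FROM DATA CATALOG DATABASE '{catalog_database}' "
        f"IAM_ROLE '{iam_role_arn}' "
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )

sql = create_external_schema_sql(
    "spectrum", "analytics",
    "arn:aws:iam::123456789012:role/RedshiftSpectrumRole")
```

After running that statement on the cluster, `spectrum.some_table` can be joined directly against regular Redshift tables, which is the main advantage Spectrum has over a standalone Athena setup.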