You have 126 million rows in that table. It's going to take more than a second on a single dc1.large node.

Here are some ways you could improve performance:

More nodes

Spreading data across more nodes allows more parallelization. Each node adds additional processing and storage. Even if your data volume only justifies one node, if you want more performance, add more nodes.

SORTKEY

For the right type of query, the SORTKEY can be the best way to improve query speed. Sorting data on disk allows Redshift to skip over blocks that it knows do not contain relevant data.

For example, your query has WHERE brandID = 3927, so having brandID as the SORTKEY would make this extremely efficient because very few disk blocks would contain data for one brand.

Interleaved sorting is rarely the best method to use: it is less efficient than a single or compound sort key and takes a long time to VACUUM. If the query you have shown is typical of the queries you run, use a compound sort key of (brandId, ti) or (ti, brandId). It will be much more efficient.
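As a sketch of how that change could be applied (the table name "clicks" and the column types are assumptions, not taken from your question), the usual pattern on Redshift is a deep copy into a new table defined with the compound sort key:

```sql
-- Sketch only: "clicks" and its columns are assumed; adjust to your schema.
-- Changing the sort key generally requires a deep copy into a new table.
CREATE TABLE clicks_new (
    brandId INTEGER,
    ti      TIMESTAMP
    -- ...remaining columns...
)
COMPOUND SORTKEY (brandId, ti);

INSERT INTO clicks_new SELECT brandId, ti FROM clicks;

ALTER TABLE clicks     RENAME TO clicks_old;
ALTER TABLE clicks_new RENAME TO clicks;
```

Because the INSERT ... SELECT writes the rows in one pass, the new table comes out fully sorted without needing an immediate VACUUM.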

A SORTKEY is typically a date column, since dates often appear in a WHERE clause and the table stays sorted automatically if data is always appended in time order.

Your interleaved sort is likely causing Redshift to read many more disk blocks to find your data, thereby significantly increasing query time.

DISTKEY

The DISTKEY should typically be set to the field most used in JOIN statements on the table, because rows with the same DISTKEY value are stored on the same slice. This won't have as large an impact on a single-node cluster, but it is still worth getting right.

Again, you have only shown one type of query, so it is hard to recommend a DISTKEY. Based on this query alone, I would recommend DISTSTYLE EVEN so that all slices participate in the query. (It is also the default distribution style if none is specified.) Alternatively, set the DISTKEY to a field not shown, but certainly don't use brandId as the DISTKEY; otherwise only one slice will participate in the query shown.
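A minimal sketch of the DDL (again, the table name and columns are assumptions) combining an even distribution style with the compound sort key:

```sql
-- Sketch: DISTSTYLE EVEN spreads rows round-robin across slices,
-- so every slice participates in a scan filtered on brandId.
CREATE TABLE clicks_new (
    brandId INTEGER,
    ti      TIMESTAMP
    -- ...remaining columns...
)
DISTSTYLE EVEN
COMPOUND SORTKEY (brandId, ti);
```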

VACUUM

VACUUM your tables regularly so that the data is stored in SORTKEY order and deleted data is removed from storage.
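VACUUM can be run per table; for example (table name assumed):

```sql
-- FULL (the default mode) re-sorts rows into SORTKEY order
-- and reclaims space from deleted rows.
VACUUM FULL clicks;

-- Refresh the planner's statistics afterwards.
ANALYZE clicks;
```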

Experiment!

Optimal settings depend upon your data and the queries you typically run. Perform some tests to compare SORTKEY and DISTKEY values and choose the settings that perform best. Then test again in 3 months to see if your queries or data have changed enough to make other settings more efficient.
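One way to check a table's current state before and after experimenting is the SVV_TABLE_INFO system view, which reports each table's distribution style, first sort key column, row skew, and percentage of unsorted rows (the table name in the filter is an assumption):

```sql
-- System view available on all Redshift clusters.
SELECT "table", diststyle, sortkey1, skew_rows, unsorted
FROM   svv_table_info
WHERE  "table" = 'clicks'   -- table name assumed
ORDER  BY unsorted DESC;
```

A high `unsorted` percentage suggests a VACUUM is due; a high `skew_rows` value suggests the DISTKEY is concentrating data on a few slices.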

Answer from John Rotenstein on Stack Overflow


Sometimes the issue can be due to locks acquired by other processes. You can refer to: https://aws.amazon.com/premiumsupport/knowledge-center/prevent-locks-blocking-queries-redshift/
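To check for blocking locks, Redshift exposes the STV_LOCKS system table; a quick sketch:

```sql
-- Shows current table locks; long-held exclusive locks from other
-- sessions can make an otherwise fast query appear slow.
SELECT table_id, last_update, lock_owner, lock_status
FROM   stv_locks
ORDER  BY last_update ASC;
```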
