The Redshift leader node is the same size and class of compute as the compute nodes. Typically this means that the leader is over-provisioned for the role it plays, but since its role is so important, and a slowdown there is so impactful, it is good that it is over-provisioned. The leader needs to compile and optimize the queries and perform the final steps in queries (the final sort, for example). It communicates with the session clients and handles all their requests. If the leader becomes overloaded, all of these activities slow down, creating significant performance issues. It is not good that your leader is hitting 100% CPU often enough for you to notice. I bet the cluster seems sluggish when this happens.

There are a number of ways I've seen "leader abuse" and it usually becomes a problem when bad patterns are copied between users. In no particular order:

  • Large data literals in queries (INSERT ... VALUES ...). This puts your data through the query compiler on the leader node. This is not what it is designed to do and is very expensive for the leader. Use the COPY command to bring data into the cluster. (Just bad, don't do this)
  • Overuse of COMMIT. A commit causes an update to the coherent state of the database; it needs to run through the "commit queue" and creates work for the leader and the compute nodes. Having a COMMIT every other statement can cause this queue, and work in general, to back up.
  • Too many slots defined in the WLM. Redshift can typically run only between one and two dozen queries at once efficiently. Setting the total slot count very high (like 50) can lead to very inefficient operation and high CPU loads. Depending on the workload this can show up on the compute nodes or occasionally on the leader node.
  • Large data output through SELECT statements. SELECTs return data, but when this data is many GBs in size the management of this data movement (and sorting) is done by the leader node. If large amounts of data need to be extracted from Redshift, it should be done with an UNLOAD statement.
  • Overuse of large cursors. Cursors can be an important tool and are needed for many BI tools, but cursors are located on the leader, and overuse can reduce the attention the leader gives to other tasks.
  • Many / large UNLOADs with parallel off. UNLOADs generally go from the compute nodes straight to S3, but with "parallel off" all the data is routed to the leader node, where it is combined (sorted) and sent to S3.
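
As a rough illustration of the COPY and UNLOAD advice in the list above, here is a minimal Python sketch that builds the two statements as strings rather than pushing rows through INSERT ... VALUES or pulling them out with a large SELECT. The table name, bucket paths, and IAM role ARN are hypothetical placeholders, and the CSV and Parquet formats are just one possible choice. The UNLOAD is left at its default parallel behavior, so it never adds PARALLEL OFF.

```python
def quote_literal(value: str) -> str:
    """Wrap a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def copy_statement(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a COPY that bulk loads CSV files from S3 into `table`."""
    return (
        f"COPY {table} FROM {quote_literal(s3_prefix)} "
        f"IAM_ROLE {quote_literal(iam_role)} FORMAT AS CSV;"
    )


def unload_statement(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build an UNLOAD that writes the query result to S3 as Parquet.

    No PARALLEL option is emitted, so the default (parallel) behavior applies.
    """
    return (
        f"UNLOAD ({quote_literal(query)}) TO {quote_literal(s3_prefix)} "
        f"IAM_ROLE {quote_literal(iam_role)} FORMAT AS PARQUET;"
    )


# Hypothetical names, for illustration only.
ROLE = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"

print(copy_statement("sales_staging", "s3://example-bucket/incoming/sales/", ROLE))
print(
    unload_statement(
        "SELECT * FROM sales WHERE region = 'EU'",
        "s3://example-bucket/exports/sales_",
        ROLE,
    )
)
```

The generated strings would still need to be run through whatever client you already use; the point is only that the data itself travels through S3 and never appears as literals in the SQL text.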

While none of the above are problems in and of themselves, it is when they are overused, used in ways they were not intended, or all used at once that the leader starts to be impacted. It also comes down to what you intend to do with your cluster: if it supports BI tools then you may have a lot of cursors, but this load on the leader is part of the cluster's intent. Issues often arise when the cluster's intent is to be all things to everybody.

If your workload for Redshift is leader-function heavy and you are using the leader node efficiently (no large literals, using COPY and UNLOAD, etc.), then high leader workload is what you want. You're getting the most out of the critical resource. However, most people use Redshift to perform analytics on large data, which is the function of the compute nodes. A highly loaded leader can detract significantly from this mission and needs to be addressed.

Another way the leader can get stressed is when clusters are configured with many smaller nodes instead of fewer bigger nodes. Since the leader is the same size as the compute nodes, many smaller nodes means you have a small leader doing the work. This is something to consider, but I'd make sure you don't have unneeded leader node stressors before investing in a resize.

Answer from Bill Weiner on Stack Overflow
Top answer
1 of 2
6

2 of 2
1

Whenever you execute commands that require work on the leader node, whether for dispatching data, computing statistics, or aggregating results from the workers (for example COPY, UNLOAD, VACUUM, ANALYZE), you'll see an increase in CPU usage. More information about this here: https://docs.aws.amazon.com/redshift/latest/dg/c_high_level_system_architecture.html

Troubleshoot high CPU usage on Amazon Redshift's leader node | AWS re:Post
October 11, 2022 - Amazon Redshift generates and compiles code for each query execution plan. Query compilation and recompilation are resource-intensive operations, and this can result in high CPU usage of the leader node.
Discussions

Redshift problems with sigma?
My guess would be it’s an issue with either your WLM configuration, or the queries are somehow bogging down the leader node as they aren’t efficient for Redshift. A leader at 100% for a month means someone is doing something wrong. Without providing queries or metrics it’d be tough to get actual help other than examples, but with some googling you’ll find a ton of answers like in this thread: https://stackoverflow.com/a/70217381
March 13, 2024
High CPU usage of one Redshift node (not leader). How understand what is causing this imbalance?
Hi there, I have a problem that I can't solve yet. One node has high CPU load almost all the time, but I can't find any significant skew in data storage. What could it be? Is it possible to track ...
October 11, 2024
Running Redshift at Scale
On using ra3, I agree generally, but for cost would recommend Redshift Serverless in Dev/QA unless you have steady workloads there · On CPU, even 1 query will cause CPU to hit 100% so I don’t consider it that helpful a metric on a well used cluster.
November 18, 2023
Troubleshoot high CPU usage in Amazon Redshift | AWS re:Post
April 27, 2022 - Review your Redshift cluster workload. Maintain your data hygiene. Update your table design. Check for maintenance updates. Check for spikes in your leader node CPU usage. Use Amazon CloudWatch to monitor spikes in CPU utilization. ... An increased workload (due to more queries running). The increase in workload increases the number of database connections, causing higher query concurrency.
Viewing cluster performance data - Amazon Redshift
The following examples show some of the graphs that are displayed in the new Amazon Redshift console. CPU utilization – Shows the percentage of CPU utilization for all nodes (leader and compute).
High CPU Utilization on Redshift Leader Node Despite No Active Queries. | AWS re:Post
April 21, 2024 - I'm experiencing an issue with my Amazon Redshift cluster where the leader node is consistently showing 99-100% CPU utilization, while the compute nodes remain below 30%. This issue has persisted for over 8 hours and began around the time of an automatic cluster restart during a maintenance window at 3:30 AM. Despite pausing all external data ingestion and manually restarting the cluster, the high CPU usage continues with no active user queries running.
Overcoming AWS Redshift Leader Node Bottleneck: Strategies for Enhanced Write Performance | by Jerome Israel | Medium
August 10, 2023 - Common symptoms of a leader node bottleneck include slower query execution times and increased query commit queues. ... Redshift uses the Single Commit Queue Architecture to manage writes, handle query coordination and optimization in a distributed ...
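
To make the commit-queue point concrete, here is a minimal Python sketch of one way to cut down on COMMITs: grouping statements so that each batch ends in a single COMMIT instead of one per statement. The function name and the batch size are assumptions for illustration, and the sketch assumes a client that does not autocommit every statement on its own.

```python
from typing import Iterable, List


def batch_into_transactions(statements: Iterable[str], batch_size: int) -> List[str]:
    """Wrap every `batch_size` statements in one BEGIN ... COMMIT pair."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    script: List[str] = []
    batch: List[str] = []
    for stmt in statements:
        # Normalize so every statement ends in exactly one semicolon.
        batch.append(stmt.rstrip().rstrip(";") + ";")
        if len(batch) == batch_size:
            script.extend(["BEGIN;", *batch, "COMMIT;"])
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        script.extend(["BEGIN;", *batch, "COMMIT;"])
    return script


# Hypothetical statements, for illustration only.
updates = [f"DELETE FROM staging_{i} WHERE processed" for i in range(6)]
script = batch_into_transactions(updates, batch_size=3)
print("\n".join(script))
print("commits:", script.count("COMMIT;"))
```

With six statements and a batch size of three, the script carries two COMMITs rather than six. How large a batch is reasonable depends on the workload and on how much work you are willing to redo if a transaction fails.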
r/aws on Reddit: Redshift problems with sigma?
March 13, 2024 - I have inherited a Redshift DW that is used by another team via Sigma for data stuff. I noticed today that the leader node has been at 100% CPU for at least a month. Sure enough, Sigma is running crazy queries all day that take several minutes to execute. The 4 compute nodes hover at around 5%. These are all dc2.large. I'm a software engineer and not a database guy, so this stuff isn't my strong suit. But from what I see in the documentation, queries will only be executed on the compute nodes if the nodes contain data relevant to the query (?). So other than the usual suspects (indices, bad queries, etc.), could this have something to do with whatever strategy is being used to replicate data to the compute nodes? Can we control that with Redshift? Any insights greatly appreciated.

1 Billion Technology
AWS Redshift Optimization – A Case Study
When a request comes to the leader node, it parses the query and generates an execution plan and a compiled code to be executed in the compute nodes. The compute nodes process the incoming requests in parallel. Each compute node has a dedicated CPU, memory and a storage. Each compute node can scale out/in and scale up/down (resizing the Redshift cluster).
Understanding the Main Causes for High CPU Usage on Leader Nodes in Amazon Redshift - YouTube
Discover the key factors leading to high CPU usage on Amazon Redshift leader nodes and learn practical solutions to optimize your cluster performance.---This...
Published   March 31, 2025
Amazon Redshift Architecture Explained: Leader Node, Compute Nodes, and Performance Tuning | by Kuldeepsinh Vaghela | Medium
April 24, 2025 - Amazon Redshift offers different node types, optimized for different workloads: Dense Compute (DC) Nodes: These nodes are designed for compute-intensive workloads with smaller data volumes, utilizing fast CPUs and SSDs for high performance.
High CPU usage of one Redshift node (not leader). How understand what is causing this imbalance? | AWS re:Post
October 11, 2024 - Examine data distribution: Although you mentioned not finding significant skew in data storage, it's worth double-checking the data distribution across nodes. Run a query to identify tables with data skew or unsorted rows in your Redshift cluster. This can help pinpoint if certain tables are causing uneven workload distribution. Investigate longest-running queries: Use a diagnostic query to identify the longest-running queries in your cluster. This can help you pinpoint specific queries that might be causing the high CPU usage on the affected node.
Performance data in Amazon Redshift - Amazon Redshift
Amazon Redshift has the following two dimensions: Metrics that have a NodeID dimension are metrics that provide performance data for nodes of a cluster. This set of metrics includes leader and compute nodes. Examples of these metrics include CPUUtilization, ReadIOPS, WriteIOPS.
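
As a sketch of how the NodeID dimension could be used to watch the leader on its own, the Python below builds the parameters for a CloudWatch CPUUtilization query. The cluster identifier is a hypothetical placeholder, and the "Leader" value for NodeID reflects my understanding of the AWS metric documentation rather than anything stated in the snippet above, so treat it as an assumption to verify. The block only assembles the request and does not call AWS.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def leader_cpu_request(cluster_id: str, hours: int = 3, period_s: int = 300) -> Dict[str, Any]:
    """Build keyword arguments for a leader-node CPUUtilization query."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Redshift",
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "ClusterIdentifier", "Value": cluster_id},
            # Assumption: the leader is addressed with the NodeID value "Leader".
            {"Name": "NodeID", "Value": "Leader"},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_s,  # seconds per datapoint
        "Statistics": ["Average", "Maximum"],
    }


# Hypothetical cluster name, for illustration only.
request = leader_cpu_request("example-cluster")
print(request["Dimensions"])
```

If boto3 is installed and credentials are configured, the dictionary could be passed as `boto3.client("cloudwatch").get_metric_statistics(**request)`. Comparing the result against the same query for the compute nodes would show whether the leader alone is saturated.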
Artie
Best Practices on Running Redshift at Scale
November 15, 2023 - Resize the cluster by adding more nodes or upgrading to a more powerful node type. Set up alerts to notify you when CPU utilization exceeds a threshold so you can take proactive steps. Use workload management (WLM) to prioritize workloads better such that fast running queries are not backlogged ...
5 Areas to Consider for Running an Optimized Redshift-Based Cloud Data Warehouse | AllCloud
March 24, 2020 - In addition, Redshift supports multi-node clusters so when your requirement grows, you can scale by just adding a node. You also get a leader node compute engine without any extra cost when you run your data warehouse in a multi-node cluster. Storage of each node is used for storing data in a distributed fashion to achieve high degree of parallel processing.
Top answer
1 of 3
1
In an Amazon Redshift cluster, the leader node type is not explicitly specified or chosen separately from the compute nodes. When you set up a Redshift cluster, you specify the node type (such as RA3 or DC2) and this applies to the compute nodes. The leader node is automatically provisioned as part of the cluster, but its specifications are not directly tied to the compute node type you select.

The leader node manages communications with client applications, parses queries, develops query execution plans, and coordinates the parallel execution of these plans with the compute nodes. It also aggregates intermediate results from compute nodes before returning final results to client applications.

Unlike compute nodes, where you can specify the number and type (like your example of 3 RA3.xlplus nodes), the leader node is provided automatically as part of the cluster architecture. The documentation doesn't specify that the leader node is of the same type as your compute nodes. The leader node's resources are managed by Amazon Redshift to handle its coordination responsibilities, regardless of the compute node type you choose for your cluster.

**Sources**
* Amazon Redshift provisioned clusters - Amazon Redshift
* Data warehouse system architecture - Amazon Redshift
* Architecture components of an Amazon Redshift data warehouse - AWS Prescriptive Guidance
2 of 3
0
Based on the AWS documentation, in an Amazon Redshift cluster, the Leader node is handled differently than the compute nodes. When you set up a Redshift cluster with 3 ra3.xlplus compute nodes, here's what happens with the Leader node:

**Leader Node Configuration:**
* The Leader node is automatically provisioned and managed by Amazon Redshift
* You do not directly choose or specify the Leader node type
* Its specifications are determined by Amazon Redshift based on your overall cluster configuration

**Leader vs. Compute Nodes:**
* While your compute nodes are all ra3.xlplus in your example, the Leader node is not necessarily the same type
* The Leader node's resources are automatically scaled by AWS to match the needs of your cluster
* The exact specifications of the Leader node are not directly visible or configurable by users

So to directly answer your question: In a cluster with 3 ra3.xlplus compute nodes, the Leader node is not necessarily an ra3.xlplus node. Its type and specifications are managed internally by AWS and are not explicitly exposed to users. The Leader node is provisioned with appropriate resources to efficiently manage your specified compute nodes. This approach allows Amazon Redshift to optimize the Leader node's capabilities based on the specific requirements of your cluster configuration without requiring you to make these technical decisions.
ResearchGate
Amazon Redshift system architecture The leader node accepts connections... | Download Scientific Diagram
Experimental results across three high-performance engines on a real-world workload show consistent performance gains enabled by the proposed algebraic optimization layer. ... ... Many open-source and commercial distributed databases, such as Greenplum [27], Alibaba AnalyticDB [47], and Amazon Redshift [15], all follow the massively parallel processing (MPP) architecture. In this setting, the distributed databases consist of one master node and multiple segment nodes, where each segment node maintains part of the data.
database - redshift leader node using up 100% of disk - Stack Overflow
We have a 50-node Redshift cluster, and we run VACUUM periodically. Currently we are running a pipeline that moves some data onto S3 and deletes it from Redshift. After about 2 weeks of processing, disk usage on 49 nodes (all except the leader) came down from 95% to 80%, but disk usage on the leader went up and is now at 100%.