After I had been working with the Curve Optimizer for some time, I read the guide from katalysis (Guide: Zen 3 Overclocking using Curve Optimizer (PBO 2.0) : Amd (reddit.com)) and felt very confident once I had learned about the somewhat unusual test of forcing "Windows 10 Automatic Repair and Diagnosis" ten times.
So I tested my previously determined CO values and passed the test.
But where did the sometimes rare and sometimes frequent BSODs come from? It usually went like this: I ran a Realbench test for 8 hours overnight. In the morning I came to the PC to check, and the last 10 minutes of the test were still running. I sat down with a coffee and waited for the test to end.
The test ran through to the end. I was happy that I had finally found stable values, and at the moment I closed the Realbench window ... BSOD!
F ** k !!!!!
Or it happened while browsing or under other lighter CPU loads. Sometimes the system was completely stable for 2-3 days, sometimes the BSOD suddenly came 10 seconds after Windows booted. I tried so many different stress testing tools. All passed successfully. It was clear to me that my CO values were not stable. But how could I detect this instability?
Chapter I – Long story short
For the more experienced users, I'll summarize the two essential points at the beginning, so you don't have to work your way through this wall of text.
After a lot of tests with just about every stress test tool out there, I ended up back at Prime95. I discovered that when testing with large FFTs and non-AVX instead of small FFTs and AVX, the CO values that pass as stable are much lower (less negative). This results in a lot more stability. But I didn't want 95% stability, I wanted at least 99%. So I further refined the test methods and settled on the following:
- Test one core with one thread by setting the affinity of Prime95 in the task manager, using large FFTs and non-AVX, for stability. A nice side effect is that instead of waiting for the end of the test or, in my case, a BSOD, the worker stops almost immediately if the CO values are too aggressive.
- I finally (!!!!) found a test that lets the cores boost to almost their maximum frequency: the Aida64 memory stress test. Again using the task manager to pin it to a certain core, you can explore the stability under very light workloads and find meaningful values for the boost override.
In addition, I recommend trying the points listed under Chapter III - preparation.
I tested all of this with a 5900X and an MSI X570 Tomahawk. And thanks to my friend, who has a 5800X and a Gigabyte B550 Aorus Pro AC, I was able to test the whole thing in that configuration as well. So both CPUs were tested once on the Gigabyte board and once on the MSI.
For the less experienced user, here is a step-by-step guide.
Enough talk, let's get started.
Chapter II - What you need:
- HWiNFO64 (Free Download HWiNFO Sofware | Installer & Portable for Windows, DOS)
- Prime95 (Prime95 (30.4 Build 5) Download | TechPowerUp). If you already have Prime95, please make sure you have the latest version, which is 30.4 at the moment.
- Aida64 Extreme (Downloads | AIDA64)
- Later, for long-term testing: Curve Optimizer per-core stability test tool (Curve Optimizer per-core stability test tool : Amd (reddit.com))
Chapter III – Preparation
First download the latest BIOS for your motherboard and flash it. Also update your chipset and other drivers. Next, save your current settings in a profile and then perform a BIOS reset.
In addition, we will adjust a few points to rule out other possible sources of error, so that we can fully concentrate on the CPU.
- In order to exclude RAM instabilities at the beginning, I recommend not loading the XMP profile and setting everything else in the BIOS to Auto.
- Some RAM kits (especially the higher clocked kits) are a bit tricky when they are operated with the standard settings. Especially with the voltage of 1.20, some people can't handle it. So we manually set the RAM voltage to the value of the XMP profile.
- I've had BSODs with every setting on Auto, so basically out-of-the-box settings. I could tell that one kind of crash was due to the voltage of the IO die: at an FCLK of 1066 MHz, for example, it dropped to about 0.91 V (according to HWiNFO). This did not result in a BSOD; the PC simply restarted without comment. That's why we set the SOC voltage to 1.05 or 1.1 V.
- There have been reports that BSODs can occur if the PCIe settings are left on Auto (and thus Gen4). Even though I haven't been able to confirm this so far, I recommend setting the whole thing to Gen3.
- In addition, I read a lot about crashes under idle conditions and had one or two such experiences myself. In my case it helped to deactivate the global C-states and to set the idle current to typical. It also helped some people here to set the minimum processor state in the Windows power plan to 50% or even 100%. (This was not necessary with my config.)
Chapter IV – Determine the CO values + boost override for the two best cores
I will describe the whole thing using the example of a 5800X on an MSI X570 Tomahawk. (Fewer cores = less to write!)
In the AMD Overclocking menu, set the PBO mode to Advanced, then the PBO Limits to Mainboard (with a 5800X on an MSI X570 Tomahawk, I recommend leaving it on Auto) and the Boost Override to +200 MHz (or more if your motherboard allows it). In the Curve Optimizer menu, set your two best cores (HWiNFO perf 1/1 + 1/2) to negative 5 and boot into Windows. In the event that this is already too much for your CPU, try reducing the CO values or lower the boost override by 25 MHz and try again.
Back in Windows, start Prime95 and the task manager. Start a torture test in Prime95 with one thread and Large FFTs, with both AVX options disabled.
Windows will now push this one thread back and forth between Core perf 1/1 and Core perf 1/2, which can produce an unclear result. That's why we force Prime95 to use a certain core using the task manager.
In the task manager under details search for Prime95.exe and right click it. Select set affinity. A new window will open. This shows your processor cores. Both the physical and the logical cores.
It is important to note here that CPU 1 and CPU 2 are assigned to core 1 (or, as it is referred to in the BIOS or HWiNFO, core 0). The 5800X I tested has its perf 1/1 core on core 5, and core 1 is the perf 1/2. So I have to select 10 in the task manager for the perf 1/1 core and core 4 for the perf 1/2 core. Got it? :D You can use the core load in HWiNFO to determine whether you have hit the right core.
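The mapping between physical cores and the logical processors listed in the affinity window can be sketched in a few lines. This is only an illustration under assumptions that are not stated in the guide: SMT is enabled, Windows enumerates the two logical processors of each core consecutively, and core indices are zero-based as in the BIOS and HWiNFO. The function names are made up for this example.

```python
def logical_cpus_for_core(core_index, threads_per_core=2):
    """Return the zero-based logical CPU numbers that belong to one physical core.

    Assumes logical processors are numbered consecutively per core
    (core 0 -> CPU 0 and 1, core 1 -> CPU 2 and 3, and so on).
    """
    first = core_index * threads_per_core
    return list(range(first, first + threads_per_core))


def affinity_mask(core_index, threads_per_core=2, first_thread_only=True):
    """Build an affinity bitmask (one bit per logical CPU) for one core.

    With first_thread_only=True only the first logical CPU of the core is
    selected, which matches testing with a single worker thread.
    """
    cpus = logical_cpus_for_core(core_index, threads_per_core)
    if first_thread_only:
        cpus = cpus[:1]
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # Core 5 (zero-based) maps to logical CPUs 10 and 11 under these assumptions.
    print(logical_cpus_for_core(5))
    print(hex(affinity_mask(5)))
```

Whatever the sketch suggests, the core load shown in HWiNFO remains the reliable check that the workload landed on the intended core.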
In my experience, the further away the values are from your stable setting, the faster the worker will stop. For example, if your stable value is 10 and you test with 15, the worker stops immediately for me. If, on the other hand, you test at 11, it can take a minute for the worker to stop. For this reason I recommend running the test on both cores for at least 2-3 minutes. This should be enough to test the current values. We will come to the long-term stability later.
If it doesn't stop, repeat the procedure with a more negative value until you can either no longer boot into Windows, get a BSOD, or the worker stops.
Once you have determined the values for the two best cores (which can be different for each core), we can go one step further. With Prime95, your two best cores will boost to a certain clock speed which, however, will still be a long way from your possible boost clock. Using the 5800X as an example, I was able to stay at +200 MHz. The stock maximum boost is 4850 MHz, so +200 MHz results in 5050 MHz. So we need a constant workload that lets the CPU boost to its maximum. This is where Aida64 comes in.
After starting Aida64, select the "stability stress test" mode in the "tools" menu. Open the task manager again and go to details. Now select "stress system memory" at Aida64 and click on start. Next, force Aida64 to test a certain core using the task manager. Use it to test your two best cores.
The Aida64 memory stress test is a very light workload, so the cores will boost to the maximum. Check the clock speeds in HWiNFO (Effective Clock!) for your two best cores. If both cores almost reach their maximum clock speed, you can leave the boost override at its current setting. Again the 5800X: the cores constantly reach 5030-5040 MHz. If one or both cores do not reach the maximum, I recommend reducing the boost override. In my case, this reliably prevents bluescreens at very light workloads when one core boosts above its stable limit (even if it only happens for a fraction of a second). In the case of my 5900X (4950 MHz stock maximum boost + 200 MHz results in a possible clock speed of 5150 MHz), one core reached 5120 MHz and the other "only" 5090 MHz. So I reduced the override to +150 MHz, after which both cores reached around 5075-5080 MHz. And bye bye random BSOD!!!
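The arithmetic behind this check is simple enough to write down. The following is a rough sketch, not part of the original method: the function names are invented, and the 25 MHz tolerance for "almost reaches the maximum" is an assumption chosen only because it fits the three examples given above. The guide itself does not name a number.

```python
def boost_target_mhz(stock_max_boost_mhz, override_mhz):
    """Highest clock the core is allowed to reach with the boost override applied."""
    return stock_max_boost_mhz + override_mhz


def shortfall_mhz(target_mhz, effective_clock_mhz):
    """How far the observed effective clock stays below the allowed maximum."""
    return target_mhz - effective_clock_mhz


def override_looks_reachable(target_mhz, effective_clocks_mhz, tolerance_mhz=25):
    """True if every tested core comes within tolerance_mhz of the target.

    tolerance_mhz is a hypothetical threshold, not a value from the guide.
    """
    return all(
        shortfall_mhz(target_mhz, clock) <= tolerance_mhz
        for clock in effective_clocks_mhz
    )


if __name__ == "__main__":
    # 5800X at +200 MHz: both cores get close to the target.
    print(override_looks_reachable(boost_target_mhz(4850, 200), [5030, 5040]))
    # 5900X at +200 MHz: both cores fall short, so the override is lowered.
    print(override_looks_reachable(boost_target_mhz(4950, 200), [5120, 5090]))
    # 5900X at +150 MHz: both cores come close to the new target.
    print(override_looks_reachable(boost_target_mhz(4950, 150), [5080, 5075]))
```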
I recommend running the Aida64 test for 15 minutes per core. I've already seen an error message from Aida on my 5900X because the boost override was set too high. Likewise with the 5800X which I could set to +300 on the Gigabyte B550 board. (Still does not work with the MSI board ...).
Chapter V – Determine the CO values for the rest
In principle, the search for the maximum values for the remaining cores proceeds according to the same principle mentioned above.
For example, if you have reached a value of -20 for a certain core, pay attention to the maximum frequency of this core when testing with Prime95 and Aida64. At a certain point, the clock rate will no longer increase under Prime95; then there is no point in going lower than -20, especially when the core is already running at its maximum frequency under Aida64 (for example, 5030-5050 MHz on a 5800X). Going further only opens up more possibilities for instability under certain workloads. Only reduce as much as necessary, not as much as possible!
Chapter VI – Longterm stability testing
Now we take care of the stability testing of the whole system and make sure that the settings are stable in the long run.
Testing with Prime95 is quite nice, but also quite time consuming and a bit annoying. To get around that, I found an awesome script here on reddit, the Curve Optimizer per-core stability test tool. Please give the author an upvote!!!
This script automatically changes the affinity of the Prime95 workload and is therefore perfectly suited to testing the individual cores, e.g. overnight. We will now configure the stability test tool.
After downloading and unpacking (of course including Python 3.9 as mentioned in the author's post), you only need to open the main.py file with Notepad or another editor. (Right Click - Open with ...) Now change the entry called "thread_num", in the box surrounded by #, to your corresponding number of threads. I also recommend a value of 150 (= 2.5 minutes) for "sec_between_switch" when testing overnight. Thus, each core is loaded for a total of 5 minutes (2.5 minutes on each of its two threads). After you have changed the two entries, you can first start Prime95 (with the recommended settings). If you then want to check the result after time X and see that a worker has stopped, you can easily find out which core it was. Go to the thread switcher folder and open the log.txt. Also go to the Prime95 directory and open the results.txt. In the results.txt you will see the following entry at the bottom: Fatal Error and so on. Pay attention to the time stamp. Compare this with the entries in the log.txt from the thread switcher folder. With this you can determine which core it was. Reduce the CO value of this core accordingly.
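The timestamp comparison can also be done programmatically. Below is a minimal sketch of just the lookup step. It assumes the switch entries from log.txt and the error time from results.txt have already been read into Python values; the exact format of both files is not described here, so parsing is deliberately left out, and the function name and sample times are made up.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def core_active_at(switch_events, error_time):
    """Return the core that was under test when the error was logged.

    switch_events: list of (datetime, core) tuples sorted by time, one per
                   affinity switch (hypothetical, pre-parsed from log.txt).
    error_time:    datetime of the "Fatal Error" entry (from results.txt).
    Returns None if the error happened before the first switch.
    """
    times = [when for when, _ in switch_events]
    index = bisect_right(times, error_time) - 1
    if index < 0:
        return None
    return switch_events[index][1]


if __name__ == "__main__":
    start = datetime(2021, 1, 1, 23, 0, 0)  # invented example start time
    # One switch every 150 seconds, as with sec_between_switch = 150.
    events = [(start + timedelta(seconds=150 * i), i) for i in range(4)]
    error = start + timedelta(seconds=200)
    print(core_active_at(events, error))
```

If the two files use different clock formats or time zones, they would need to be normalized before the comparison.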
Unfortunately, the whole thing does not yet work with the Aida64 memory stress test (Access denied), but I am working on it. Maybe one of you has an idea ...?
And finally, if you think that you have reached the maximum stable CO values, the old overclocking rule comes into play: find the maximum that is stable and turn it down by a notch.
For this reason I have reduced the values I've found by 1. You never know... I haven't had a single BSOD / restart or freeze since then.
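To make the sign convention explicit: CO values are negative offsets, so turning them down by a notch means moving them one step toward zero. A tiny sketch with an invented function name and example values:

```python
def apply_safety_margin(found_offsets, margin=1):
    """Back every per-core CO offset off by `margin` steps toward zero.

    found_offsets: dict mapping core index to the most negative offset that
                   passed testing (for example {0: -20, 1: -12}).
    The result is clamped at 0 so a core never ends up with a positive offset.
    """
    return {core: min(0, offset + margin) for core, offset in found_offsets.items()}


if __name__ == "__main__":
    print(apply_safety_margin({0: -20, 1: -12, 2: -5}))
```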
The points that I mentioned under Chapter III – Preparation can be activated / changed again after the CO tuning has been completed. Just test whether it remains stable!
Next up is an overview that shows the temperature scaling of the Zen 3 CPUs (5800X and 5900X with different cooling solutions). Stay tuned!
How to test stability for curve optimizer quickly.
Help diagnosing instability w/ PBO2 curve optimizer
Advanced Guide: Curve Optimizer, Stability test and some fixes (Best guide for PBO 2, better then CTR2)
" Especially with the voltage of 1.20, some people can't handle it "...kinda lost credibility right there. RAM voltage is usually 1.35, not 1.2. No one uses their DDR4 at 2133. What do you even mean some people can't handle it.
I am able to get a "stable" curve optimiser during load, but not at idle.
The settings were:
Core 4: -24 (unstable during load beyond -24)
The rest of the cores: -27 (unstable load beyond -27)
Note: I used 'Hardware Info' to monitor which cores were stretching with each increasingly negative curve setting.
With these settings, I can post, conduct stress tests, and all would "pass". But if I leave my PC from a fresh boot and idle from anytime between 5min to 1hr, it would restart.
I don't want to wait 1 or more hours for each core to find stability. So now all cores are set at -23 which is stable throughout.
Are there ways to stabilise curve optimiser at idle? Do I need to play with LLC? Could it be due to Vdroop?
I'm not a pro or anything, I've literally only been learning how to tune PCs for the AM5 platform for the past 2 weeks.
Specs:
CPU: AMD 7950x
Mobo: Aorus x670e extreme
RAM : Corsair Dominator Platinum CL36 32Gb 6000MHz
AIO: Deepcool LT720 Premium Liquid
GPU: MSI RTX 2070 (next to be upgraded)
Bootdrive: Intel Optane 900P
I would like to add that I am able to get Cinebench R23 scores (multi core) of around 39.5k with an average of 39087.
Core speeds: 5245-5230MHz (constant full load, not one offs)
Idle temp: 25°C
Saturated load temp: 92.5°C
Ambient temp: 21°C
PPT: 218.7
TDC: 156.6
EDC: 200
Infinity Fabric: 2200
All core curve optimiser -23
I have calculated I could get Cinebench R23 scores of low 40k at ambient temps of 16°C.
Thank you everyone for any help and knowledge.
Hello, I have a Ryzen 5 5600. For a few months I was using all core -25 on PBO, but recently I upgraded my cooling solution from stock to an Arctic Liquid Freezer III 240 and wanted to put it to good use by overclocking the CPU. While at it I figured I'd try to push the single core curve to the best possible value, but I'm unsure of how to do so. My experience in the past (without the +200 MHz overclock) was: system reboot after about 1 day on all core -30; system reboot after about 3 days on -29; system reboot after about 6 days on -28.
2 days ago I applied the +200 MHz boost overclock and set the PBO curve to all core -26. After 1 day I experienced a system reboot, but the circumstances were weird. I was playing Genshin Impact (using an fps unlocker to bypass the default 60, but locking my fps through Radeon Chill to 120) and noticed unusual amounts of stuttering. I quickly noticed my CPU usage was at 100%; opening the task manager, I saw that Steelseries GG (a program I use for taking clips) was using over 60%, and closing it immediately resolved the issue. Upon reopening it, the CPU usage was back to normal, but after a few seconds the system reboot happened.
Later I started testing using OCCT version 13.0.0, setting Curve Optimizer to per core, starting with all cores at -30, but keeping Steelseries GG closed. I've tested for about 3 hours with various configurations (core cycling, all core, different instruction sets, CPU, CPU+RAM) but didn't get a single error or system reboot. Could my silicon actually be good enough to handle all core -30 and a +200 MHz overclock, and the cause of instability was the Steelseries GG application? If not, what other benchmarks/tools can I use to find the most ideal curve? Thanks.
I'm attempting to tune my 9800x3D via Curve Optimizer. I've been using AIDA64 stressing CPU, FPU, cache, and memory. My methodology has been to run the test for ~3 hours, and if it passes, bump each core negative offset one at a time by -5.
I think I almost have something stable dialed in, but AIDA64 is now failing at the 15-hour mark. Not really sure how to go about isolating which core needs adjusting. I've also tried running CoreCycler 0.10.0.0 with all the default settings and it ran for over 32 iterations over 24 hours passing everything.
So far, AIDA64 has been faster at telling me something is unstable whereas CoreCycler has never thrown any errors, and I've read somewhere that it's not great for testing stability on the 9800x3D.
I know there are other tools like Prime95, y-cruncher, and OCCT; which one can inform me of system instability faster and tell me which core is failing? Or are there certain CoreCycler settings I should be using?
In terms of every day usage, I haven't noticed any crashes. The only abnormality I noticed has been that the system is sometimes unable to POST following a reboot with a yellow DRAM LED indicator. I have a hunch that this is related to a +200 boost clock override rather than Curve Optimizer, but not sure.
Edit: Also just ran OCCT for 1 hour cycling through all cores as well as the new CoreCycler 0.11.0.0alpha using the automatic test mode / y-Cruncher Kagari preset; it passed all without errors.
Specs:
CPU: AMD Ryzen 7 9800x3D
RAM: CORSAIR Dominator Titanium 32GB DDR5 6000 MHz
Motherboard: ASUS ROG Strix x870-A Gaming WiFi BIOS version 1203
BIOS Settings:
Ai Overclock Tuner: DOCP Tweaked
Tcl: 30
Trcd: 36
Trp: 36
Tras: 76
DRAM VDD Voltage: 1.40000
DRAM VDDQ Voltage: 1.40000
PMIC Voltages: Sync All PMICs
(Confirmed all of the above were stable prior to attempting to mess with PBO + Curve Optimizer)
Precision Boost Overdrive: Advanced
PBO Limits: Motherboard
Precision Boost Overdrive Scalar Ctrl: Manual
Precision Boost Overdrive Scalar: 1x
CPU Boost Clock Override: Enabled (Positive)
Max CPU Boost Clock Override(+): 200
Platform Thermal Throttle Limit: 80C
Curve Optimizer: Per Core
Core 0: -45
Core 1: -35
Core 2: -40
Core 3: -20
Core 4: -20
Core 5: -30
Core 6: -15
Core 7: -15
I have a 5600X and have been trying to tune Curve Optimizer for a few days. Settings which passed 30 loops of per-core-stability (non-AVX) instantly crashed on two cores when I tried Prime95 (AVX). I increased the voltage on those cores and then passed 10 loops of per-core-stability and 2 hours of Prime95 AVX with no errors.
I played some games like Warzone and Battlefield V and everything was good, but it crashed in League of Legends lol.
How do you guys really test Curve Optimizer without waiting for a crash to happen?
For everyone messing around with PBO2 Curve Optimiser to undervolt & OC their CPUs, here is a stability test I have found useful: OCCT using Small Data Set, SSE instructions, 1T. This will usually find bad offsets within seconds, but I would let it run for 5min before considering a setting potentially stable.
Your OCCT settings should look like this: https://ibb.co/56fbrQm
When I was first testing I was trying things like -20 all core offset, and it was passing regular stress tests like OCCT all-core for 30+ minutes. OCCT small 1T would fail instantly. Because the curve optimiser reduces voltage more at lower workloads, we need a less-heavy stress test to engage those conditions; this seems to do the job quite well. Note that some people currently have issues with OCCT causing WHEA errors, so you may need to set your RAM to 3200 or lower while using this method to validate your curve offset.
Currently I have my 5600X running at -45, -20, -15, -20, -7, -7, +175 MHz, auto scalar, which passes 6min OCCT 1T, but I need to test it a bit more. The -45 is suspiciously high. CPU-Z single core is around 666, Cinebench R20 is 614. Stock PPT/EDC/TDC.
If anyone has any other tips for testing/validating curve optimiser OCs let me know.
EDIT: Thanks to everyone pointing out that you should set OCCT affinity to the specific core you want to test. The process should look like:
- Set a baseline undervolt for your core. You could start at the highest value (30) for completeness.
- Test cores in sequential order in OCCT by setting its affinity to that core (Task Manager -> Details -> Affinity). Do that before beginning the test; if it doesn't crash in 3-5min, consider that good enough for now.
- If that core passes, test the next core until a failure. Use Task Manager to verify the affinity is working.
- Upon failure, lower your undervolt and retest until all pass.
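As a rough illustration, the steps above amount to a simple per-core search. The sketch below only models the bookkeeping: `is_stable` is a hypothetical callback standing in for "pin OCCT to this core, run it for 3-5 minutes, and note whether it passed", which in practice is done by hand, and since a failure can mean a crash or reboot, this would not run as one uninterrupted script on a real system.

```python
def tune_offsets(core_count, is_stable, start=-30, step=1):
    """Find, per core, the most negative offset that passes the stability check.

    core_count: number of physical cores to test, in sequential order.
    is_stable:  hypothetical callable (core, offset) -> bool representing one
                manual test run pinned to that core.
    start:      baseline undervolt to begin with (negative).
    step:       how far to back off toward zero after each failure.
    """
    offsets = {}
    for core in range(core_count):
        offset = start
        # Back off toward zero until the core passes or no undervolt is left.
        while offset < 0 and not is_stable(core, offset):
            offset += step
        offsets[core] = min(offset, 0)
    return offsets


if __name__ == "__main__":
    # Invented per-core limits, used only to simulate test outcomes.
    limits = {0: -25, 1: -30, 2: -12}
    simulated = lambda core, offset: offset >= limits[core]
    print(tune_offsets(3, simulated))
```

A larger `step` shortens the search at the cost of leaving some undervolt unused on each core.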