Original Link: https://www.anandtech.com/show/12454/analyzing-threadripper-cooling-big-base-cooling-wins
Analyzing Threadripper Thermals: Big Base Cooling Wins
by E. Fylladitakis on March 14, 2018 8:30 AM EST
Ever since their launch last year, AMD’s Threadripper CPUs have been the center of many discussions and debates. Due to their unique design, both physically and architecturally, cooling requirements and efficiency are some of the major discussion topics. All of the Threadripper models have significant power and cooling requirements, with AMD recommending liquid cooling right off the bat. However, the actual thermal design power (TDP) specifications are not that high, suggesting that a good air-based cooler could easily cope with the thermal load. This is where things get complicated.
Threadripper processors step quite far outside typical CPU designs in several ways, one of which is their relatively massive physical size. The CPU’s surface area is much greater than that of any consumer CPU before it, including those for Intel's LGA 20xx sockets. This sizable design choice is not because AMD couldn't squeeze the CPU dies physically closer, but because Threadripper's size is the minimum size that their engineers calculated to be effective for both the mechanical strength of the package and for sufficient heat dissipation. When Threadripper was announced, nearly all cooler manufacturers rushed to provide adapters for their products to be mounted on Threadripper processors. AMD themselves include an adapter for Asetek-based liquid coolers inside the package of the Threadripper processors. User experiences with such adapters, including our own, were less than ideal. So today we're going to take a look at why AMD's thermal requirements are so exaggerated and showcase why adapters are not effective.
AMD Threadripper Coolers: Almost the Same, But Not Quite
Most available coolers were designed with the previous generation processors in mind, and their contact surface is significantly smaller than the CPU’s surface area. Many manufacturers rushed to offer adapters for their products to be mounted on Threadripper (socket TR4) processors, and several posted or presented simple mods that convert AM4-compatible coolers to fit as well. Using socket adapters to mount typical coolers on Threadripper processors does work, but such coolers cover the CPU’s surface area only partially.
Left - Noctua NH-U14S with AM4-UxS mounting braces, Right - Noctua NH-U14S TR4-SP3
There has been much debate about whether the exact same cooler with a properly sized contact surface would have better thermal conductance. According to basic thermal engineering theory, the cooler whose base covers more of a processor's lid should perform better. In the following pages we examine the theoretical aspect of this supposition and follow up with an experimental case study.
Thermal Conductance 101 with Dr. Fylladitakis
Simplified, thermal conductance is the ability of a material or arrangement to conduct heat. Thermal conductance is inversely related to the absolute thermal resistance, meaning that a lower absolute thermal resistance will improve thermal conductance. In our particular case study, that arrangement is the CPU/Cooler setup. When studying arrangements, the absolute thermal resistance of the entire arrangement is the sum of the thermal resistances of its individual parts. Anything that lowers the absolute thermal resistance of the CPU/Cooler arrangement will result in better thermal conductance, i.e. lower operating temperatures. Conversely, if the absolute thermal resistance of the arrangement increases, the thermal conductance will be lowered.
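For reference, the relationships described above can be written compactly. The symbols are our own shorthand rather than notation used elsewhere in this article: R_i is the absolute thermal resistance of each part in the stack, G is the thermal conductance, P is the heat the CPU dissipates, and ΔT is the resulting temperature rise of the die above ambient, assuming steady state and purely series heat flow.

```latex
R_{\text{total}} = R_1 + R_2 + \dots + R_n, \qquad
G_{\text{total}} = \frac{1}{R_{\text{total}}}, \qquad
\Delta T = T_{\text{die}} - T_{\text{ambient}} = P \cdot R_{\text{total}}
```

Because the resistances add, a single large term raises the total no matter how small the others are.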
The figure above displays a simplified CPU/Cooler arrangement. At the bottom layer we have our CPU die(s), the middle layer is the CPU’s copper lid and, finally, the top layer is the CPU cooler. With such a setup, we have three individual thermal resistances: R1) the CPU’s die (heat conduction from the CPU die to the CPU lid), R2) the CPU’s lid (heat conduction from the CPU’s lid to the cooler), and R3) the cooler’s absolute thermal resistance. These three resistances sum up to make the total absolute thermal resistance of the CPU/Cooler arrangement. Simplified and assuming one-dimensional conduction, each absolute thermal resistance depends on three things: a) the length of the material in parallel to the heat flow (i.e. its thickness), b) the thermal conductivity of the material, and c) the cross-sectional contact area.
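As a minimal sketch of the one-dimensional model above, the resistance of a uniform slab is its thickness divided by the product of its thermal conductivity and cross-sectional area. The function name and all numeric values below are illustrative assumptions (an approximate conductivity for copper and a made-up lid size), not measurements of any real processor.

```python
def slab_resistance(thickness_m, conductivity_w_mk, area_m2):
    """Absolute thermal resistance (K/W) of a uniform slab, 1-D conduction only."""
    if thickness_m < 0 or conductivity_w_mk <= 0 or area_m2 <= 0:
        raise ValueError("thickness must be >= 0; conductivity and area must be > 0")
    return thickness_m / (conductivity_w_mk * area_m2)


# Assumed, illustrative values (not taken from the article or a datasheet).
COPPER_K = 390.0           # W/(m*K), approximate conductivity of copper
LID_THICKNESS = 0.002      # m, hypothetical lid thickness
LID_AREA = 0.040 * 0.060   # m^2, hypothetical lid footprint

r_lid = slab_resistance(LID_THICKNESS, COPPER_K, LID_AREA)
r_lid_half_area = slab_resistance(LID_THICKNESS, COPPER_K, LID_AREA / 2)

print(f"Full area: {r_lid:.5f} K/W")
print(f"Half area: {r_lid_half_area:.5f} K/W")
```

Halving the area doubles the resistance, while halving the thickness halves it. These are the only two geometric levers in the simplified model.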
The cooler’s absolute thermal resistance obviously depends on the cooler itself (size, materials, design, air flow, etc.). However, no matter how good a cooler is, the absolute thermal resistance of the entire arrangement can still be poor if any other thermal resistance is too high. For example, it is known amongst enthusiasts that some of Intel’s previous processor generations had poor thermal performance that was unjustified given their very low power requirements. That was because the CPU’s lid was making poor contact with the CPU’s die(s), greatly decreasing the thermal flow between the die and the lid. Hardcore enthusiasts were “delidding” their processors in order to fix this issue.
AMD Threadripper Processors & Cooler Compatibility
In the case of AMD’s Threadripper processors, it is the third part of the CPU/Cooler arrangement that causes concerns, i.e. the coolers themselves. As we mentioned earlier, one of the major factors that affects the thermal resistance of each individual part is the cross-sectional contact area. As an extreme example, even if we could devise an ideal cooling body that has zero thermal resistance to place upon a copper contact plate, it would be ineffective if the contact plate is reduced to the size of a pinhead.
In the above figure, we split a cooler into two parts for a quick, simplified model. The first part is the contact plate and the second is the rest of the cooler. In that case, the cooler’s thermal resistance (R3) is split into two individual resistances (Rcp and Rc). If nothing were to change from before, R3 would be equal to Rcp + Rc. However, via simple math and considering heat conduction via air to be negligible, it can be derived that reducing the cross-sectional surface area of the contact plate will significantly increase its absolute thermal resistance and, in turn, the thermal resistance of the entire arrangement.
Consequently, assuming that we have the exact same CPU, lid, and cooling body, an undersized cooler’s contact plate that does not make full contact with the lid will significantly increase the thermal resistance of the whole arrangement. Furthermore, no matter how low the absolute thermal resistance of the cooler’s main body may be, even if we were to consider a hypothetical ideal cooler with no thermal resistance at all, the rest of the resistances are still being added to the sum that governs the total thermal conductance of the arrangement. The high thermal resistance caused by an undersized contact plate cannot be easily countered, which is why the thermal conductance of the cooler itself needs to be very high, i.e. why AMD’s cooler recommendations are so over the top.
Our previous calculations are representative of the real-world phenomena but were based on greatly simplified assumptions: uniform heat transfer and one-dimensional conduction. A trained engineer easily realizes that real-world issues with coolers that are making only partial contact with a CPU’s lid are numerous and that they extend far beyond simple everyday thermal performance. For example, the absolute thermal resistance between the cooler’s base and any CPU die that lies away from where the cooler contacts the lid is many times greater. That is because the two primary geometric factors that affect thermal resistance, the length of the material in parallel to the heat flow and the cross-sectional contact area, are now greatly different.
The above figure displays two cases. In the bottom case, the contact plate of the cooler is not sufficiently large to cover the entire CPU lid, leaving part of the CPU’s dies outside the “effective” heat transfer cross-sections. This does not mean that the rest of the die does not receive any cooling. The CPU's lid will keep absorbing energy from the whole die, but the heat flow ceases to be simply one-dimensional and the distribution will not be uniform, resulting in higher temperatures near the edges of the lid and, by extension, the outermost parts of the die.
The top case is a more extreme example and assumes that a die will be left entirely outside the surface where the cooler's contact plate will be. In this case, there is no “effective” heat transfer cross-section at all and it is practically impossible to properly cool the device. Again, simplifying and solving the problem as one-dimensional is possible by splitting the process into numerous steps and taking one step at a time. We skip most of these steps and go right to the important one. As can be seen in the above figure, the thermal resistance of the "pathway" created between the die and the contact plate is added to the total absolute thermal resistance of the arrangement. However, the cross-sectional area of this "pathway" is very small because the thickness of the lid is comparatively tiny. This results in a massively high resistance (Rp) that will be added to the sum and cannot be effectively countered no matter how good the cooler may be.
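The argument above can be sketched numerically. In the snippet below, every resistance value is a hypothetical placeholder chosen only to show the direction of the effect. The 180 W figure is the 1950X's rated TDP, and `coverage` is an assumed fraction of the lid that the contact plate actually touches. Under the one-dimensional model, the plate's resistance scales inversely with that fraction.

```python
# Hypothetical series resistances in K/W (placeholders, not measurements).
R_DIE = 0.05         # die to lid
R_LID = 0.02         # lid to cooler base
R_PLATE_FULL = 0.03  # contact plate when it covers the whole lid
R_BODY = 0.15        # heat pipes, fins and airflow
POWER_W = 180.0      # rated TDP of the Threadripper 1950X


def total_resistance(coverage, r_body=R_BODY):
    """Series sum with the plate resistance scaled by 1/coverage (1-D model)."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    r_plate = R_PLATE_FULL / coverage
    return R_DIE + R_LID + r_plate + r_body


def temperature_rise(coverage, r_body=R_BODY, power_w=POWER_W):
    """Steady-state die temperature rise above ambient, in kelvin."""
    return power_w * total_resistance(coverage, r_body)


full_contact = temperature_rise(1.0)
partial_contact = temperature_rise(0.6)
ideal_body_partial = temperature_rise(0.6, r_body=0.0)

print(f"Full contact:             {full_contact:.1f} K")
print(f"60% contact:              {partial_contact:.1f} K")
print(f"60% contact, ideal body:  {ideal_body_partial:.1f} K")
```

Even with the cooling body's resistance set to zero, the remaining terms keep the temperature rise well above zero. This simple scaling also ignores lateral heat spreading in the lid, so it likely understates the real penalty of partial contact.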
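To give a feel for why the sideways "pathway" is so restrictive, here is a rough sketch comparing heat flowing straight up through the lid with heat flowing laterally through it. All dimensions are hypothetical placeholders and the copper conductivity is approximate. In the lateral case, the cross-section is the lid thickness times the die width, and the path length is the sideways distance to the contact plate.

```python
def conduction_resistance(path_length_m, conductivity_w_mk, cross_section_m2):
    """1-D conduction resistance in K/W: length / (conductivity * cross-section)."""
    return path_length_m / (conductivity_w_mk * cross_section_m2)


# Assumed, illustrative geometry (not measured from a Threadripper package).
COPPER_K = 390.0       # W/(m*K), approximate conductivity of copper
LID_THICKNESS = 0.002  # m
DIE_WIDTH = 0.010      # m
DIE_LENGTH = 0.010     # m
LATERAL_GAP = 0.010    # m, sideways distance from the die to the contact plate

# Covered die: heat crosses the lid thickness over the full die footprint.
r_vertical = conduction_resistance(
    LID_THICKNESS, COPPER_K, DIE_WIDTH * DIE_LENGTH
)

# Uncovered die: heat travels sideways through the thin lid cross-section.
r_pathway = conduction_resistance(
    LATERAL_GAP, COPPER_K, LID_THICKNESS * DIE_WIDTH
)

ratio = r_pathway / r_vertical
print(f"Vertical: {r_vertical:.4f} K/W, pathway: {r_pathway:.4f} K/W")
print(f"The pathway is about {ratio:.0f}x more resistive in this sketch")
```

The ratio depends only on the geometry here, since the conductivity cancels out. A longer sideways distance or a thinner lid makes it worse.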
The above two examples show that any die which is partially covered or, far worse, not covered at all by the cooler’s contact surface, may be prone to overheating and reliability issues, even if the processor’s central dies are being cooled properly, and regardless of the cooler's capabilities.
Getting Real: The Noctua NH-U14S & NH-U14S TR4-SP3
In order to translate high-level concepts into real world numbers, we need to take a look at actual Threadripper processors and coolers. To that end, Noctua courteously volunteered to help us prove our theory by shipping us one of their most popular coolers, the NH-U14S, along with a socket AM4 adapter. We've previously taken a look at the NH-U14S, and it was one of the best performing tower coolers we've reviewed, so we know from experience that this model cooler is already ahead of the pack in terms of cooling capabilities.
Despite that pedigree, Noctua is one of the very few companies that advise against mounting their older cooler designs on SP3/TR4 processors. For users that wish to move to a Threadripper platform, Noctua’s engineers have designed TR4-specific coolers. For those building new TR4 systems, Noctua developed the NH-U14S TR4-SP3. Still, mounting the original NH-U14S on a Threadripper processor using the AM4 socket braces is a frequently discussed and easy mod, as it is with all AM4-compatible coolers. Note that Noctua's AM4 adapter does not make the NH-UxS coolers TR4-compatible out of the box!
The NH-U14S and the NH-U14S TR4-SP3 initially look as if they are the same cooler. The two coolers share the same 52 mm deep and 150 mm wide fin array, as well as the same six heat pipes. Each heat pipe extends to either side of the cooler, forming twelve evenly distributed thermal energy transfer lanes from the base of the cooler to the fin array. Noctua nickel-plated the copper heat pipes to protect them from corrosion.
The difference between the NH-U14S and the NH-U14S TR4-SP3 is practically only the base of the cooler. Both coolers have a nickel-plated copper base but the contact area of the NH-U14S TR4-SP3 is much greater than that of the original design. The larger base adds a little bit of weight and mass, but it would not be nearly enough to make a significant difference in the performance of the cooler if both could cover the processor's surface. As such, these two coolers are perfect for us to research just how important it is to have a cooler that makes full contact with the processor’s lid.
Modding the AM4 kit to fit on the TR4 cooler requires only 10 mm brackets (or just wide T-nuts and a handful of screws), which can be easily found in hardware shops, be 3D printed, or be made with a Dremel and a little bit of patience. In our case, since we already had the NH-U14S TR4-SP3 whose braces fit the size of the NH-U14S like a glove, we simply removed them from the TR4-SP3 and used them to fix the NH-U14S onto the TR4 socket.
Noctua NH-U14S mounted using the braces of NH-U14S TR4-SP3
Test Methodology
Our test system appears in the following table:
Test Setup | |
Processor | AMD Threadripper 1950X 16 Cores, 32 Threads, 3.4 GHz |
Motherboards | Gigabyte X399 Designare EX |
Cooling | Noctua NH-U14S / NH-U14S TR4-SP3 |
Power Supply | Corsair AX1200i Platinum PSU |
Memory | Corsair Vengeance LPX 4 × 8GB kit |
Memory Settings | 2666 MHz |
Video Cards | MSI GTX 770 Lightning 2GB (1150/1202 Boost) |
Hard Drive | Crucial MX200 1TB |
Case | Open Test Bed |
Operating System | Windows 10 64-bit |
For both tests, we used Noctua NT-H1 thermal grease on the processor. For the thermal grease to settle in, the processor was loaded for at least an hour, left to rest for at least two hours, loaded again, then left to rest overnight. The results were recorded on the next day. After the first tests, the CPU was thoroughly cleaned and the process was repeated for the second cooler.
By default, the motherboard tries to control the speed of the cooler’s fans depending on the processor’s temperature, creating a variable environment. As we are trying to showcase the difference between full and partial contact for the exact same cooling arrangement, we needed a stable environment, i.e. the fans had to be running at the exact same speed. Therefore, we powered the fans from an external power source, maintaining a stable RPM throughout our testing.
Finally, we always monitor and record the Tdie temperature. This is a more accurate representation of the CPU's actual thermal state, as Threadripper (and some Ryzen) processors report their control temperature (Tctl) with a large positive offset (+27°C) for fan control purposes.
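For readers whose monitoring software only exposes the offset reading (usually labelled Tctl), the conversion is a simple subtraction. The helper below is a hypothetical sketch, not part of our test tooling, and it assumes the +27°C offset mentioned above. Other processor models may use a different offset or none at all.

```python
THREADRIPPER_OFFSET_C = 27.0  # offset quoted above; assumed to apply to the 1950X


def tdie_from_tctl(tctl_c, offset_c=THREADRIPPER_OFFSET_C):
    """Convert a reported control temperature (Tctl) to Tdie, in degrees Celsius."""
    return tctl_c - offset_c


# Example: a hypothetical reported reading of 95.0 degrees Celsius.
print(tdie_from_tctl(95.0))
```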
Results
Starting off with idle temperatures, we're already seeing a difference between the two coolers. At all fan speeds, the larger-based TR4 model runs around 4°C cooler than its standard counterpart. Given just how little power a Threadripper processor actually uses at idle, this is an interesting prelude of things to come.
Meanwhile, under load, not only is there a clear difference in the performance of the two CPU/Cooler arrangements, but that difference is much more than merely significant. The figures that we recorded correspond to a major cooling upgrade. The standard NH-U14S can barely keep our Threadripper 1950X functional under load, with the system throttling very heavily at all times.
With its fan's speed lowered to 600 RPM, the NH-U14S could not handle the thermal load (hence the missing data point in the graph). Meanwhile, although it is essentially the same cooler, the NH-U14S TR4-SP3 manages to maintain operational temperatures throughout all our tests, with no thermal throttling even with its fan's speed lowered to 600 RPM. If we were examining two different coolers, such an improvement would easily differentiate a basic from an advanced air cooler. In our case, the two coolers are identical, yet the difference in the arrangement’s thermal conductance is substantial, all because of the cooler’s contact plate.
The following figure displays the CPU’s temperature over time with a script loading and unloading the processor every 10 seconds. The fans are running steady at 1200 RPM. It can be seen that the NH-U14S TR4-SP3 not only offers better thermal performance than the standard version with the mounting kit, but is also more resilient to varying thermal loads.
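The script used for this test is not reproduced here. As a rough, hypothetical sketch of how such an alternating load could be generated, the snippet below spins one busy loop per logical core for a set period and then idles for the same period. The function names and the simple arithmetic workload are our own assumptions. A real stress tool would load the CPU far more heavily.

```python
import multiprocessing as mp
import time


def burn(seconds):
    """Keep one core busy with integer arithmetic until `seconds` have elapsed."""
    deadline = time.perf_counter() + seconds
    x = 0
    while time.perf_counter() < deadline:
        x = (x * 31 + 7) % 1000003
    return x


def load_unload(period_s=10.0, cycles=3, workers=None):
    """Alternate between loading all cores and idling, `period_s` seconds each."""
    workers = workers or mp.cpu_count()
    for _ in range(cycles):
        procs = [mp.Process(target=burn, args=(period_s,)) for _ in range(workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        time.sleep(period_s)  # unloaded half of the cycle


# Usage (run from a script's `if __name__ == "__main__":` block):
#     load_unload(period_s=10.0, cycles=30)
```

Separate processes are used instead of threads so that every core is actually kept busy. Temperature logging would be handled by a separate monitoring tool.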
Conclusion
AMD’s Threadripper processors certainly do not require liquid coolers to function properly at stock, even if the manufacturer recommends them. It may be that AMD had to recommend them because older coolers were designed for processors much smaller than Threadripper and are incapable of providing adequate heat transfer rates, as their base does not make full contact with the processor’s lid. AMD probably foresaw that many companies would rush to offer adapters or modify their current designs to fit TR4 processors, even though the surface area of those processors is much greater than that of their AM4 counterparts. Partial surface contact greatly increases the thermal resistance of the whole setup, so very low resistance coolers were required to compensate for this. However, when a good air cooler is specifically designed for the TR4 socket, it can easily cope with the thermal requirements of the AMD Threadripper 1950X.
Noctua is one of the few cooler manufacturers that straightforwardly advise against the mounting of their older design on SP3/TR4 processors. As other manufacturers were openhandedly supplying adapters, this choice earned them a bit of distrust, as a few assumed that Noctua wanted to force their customers into buying new coolers. However, our findings today justify their choice and prove that Noctua did the right thing, regardless of any short-term consequences it may have had on the company's reputation.
Will using adapters on earlier cooler designs work? Yes, but their performance will be far from optimal and, depending on the size of the cooler’s contact surface, they can also be dangerous for heavy load applications. High-end air coolers and liquid coolers will be able to cope with the needs of a TR4 processor, but they will run hotter, or louder, or both, than equivalent models that were redesigned to properly cover a TR4 processor.
If you are upgrading from another platform/socket and are wondering whether to buy adapters for your cooler, our generic suggestion is “don't”. Even if you have one of the best air coolers, the performance impact is so large that even a significantly less expensive TR4-specific cooler is likely to perform better. Instead of spending time and money on adapters, just buy an appropriate cooler specifically designed for TR4 processors. When building a >$1,500 system, an extra $50 for an appropriate cooler can be easily justified. If you have a custom liquid cooling setup, just get another CPU block, one specifically designed for TR4 processors. The use of adapters makes sense only as a relatively short-term "bandaid" solution, for emergency cases and special situations only.