We explain why AMD improves more than Nvidia when moving to DirectX 12

You have surely read or heard that AMD graphics cards are much better than Nvidia's in DirectX 12, and that AMD's architecture is far better prepared for the new-generation API. We see claims like these almost every day, but is AMD really better than Nvidia in DirectX 12? We tell you everything you need to know in this post.

Overhead is the cause of AMD's improvement with DirectX 12

Ever since people started talking about DirectX 12, we have been seeing comparison charts like the following:

These charts compare two equivalent graphics cards, the GeForce GTX 980 Ti and the Radeon R9 Fury X. Going by these images, AMD sees a huge performance gain when moving from DirectX 11 to DirectX 12, while Nvidia stays level or even loses performance under the new API. Seeing this, any user would conclude that the AMD card is much better than the Nvidia card.

Now let us look at the following image:

This time the chart shows the absolute performance of the GeForce GTX 980 Ti and the Radeon R9 Fury X in DirectX 11 and DirectX 12. In DirectX 11 the Nvidia card delivers almost double the performance of the AMD card, and when moving to DirectX 12 the two draw level. The Radeon R9 Fury X improves a great deal under DirectX 12, while the GeForce GTX 980 Ti improves much less. Even so, the performance of both under DirectX 12 is practically the same, since the difference is less than 2 FPS in favor of the Fury X.
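To see why a large percentage gain does not imply a faster card, here is a small worked example. The FPS figures are hypothetical, chosen only to match the pattern described above (Nvidia almost double in DirectX 11, a gap under 2 FPS in DirectX 12); they are not the values from the chart.

```python
def relative_gain(fps_before: float, fps_after: float) -> float:
    """Percentage change in frame rate when moving from one API to another."""
    return (fps_after - fps_before) / fps_before * 100.0


# Hypothetical figures for illustration only, NOT measurements from the chart.
fps = {
    "GTX 980 Ti": {"dx11": 58.0, "dx12": 60.0},
    "R9 Fury X": {"dx11": 30.0, "dx12": 61.5},
}

for card, result in fps.items():
    gain = relative_gain(result["dx11"], result["dx12"])
    print(f"{card}: {result['dx11']:.1f} -> {result['dx12']:.1f} FPS ({gain:+.1f}%)")
```

With numbers like these, the AMD card roughly doubles its frame rate while the Nvidia card barely moves, yet both end up at practically the same DirectX 12 result. The size of the gain mostly reflects how low the DirectX 11 starting point was.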

At this point we have to ask why AMD improves so much when moving to DirectX 12 while Nvidia improves much less. Does AMD work better than Nvidia under DirectX 12, or does it have a big problem under DirectX 11?

The answer is that AMD has a big problem under DirectX 11, one that makes its cards perform worse than Nvidia's. The problem lies in how the graphics drivers use the processor, and it is known as "overhead".

AMD graphics cards make very inefficient use of the processor under DirectX 11. To verify this, we only have to look at the following videos, which analyze the performance of the Radeon R7 270X and the GeForce GTX 750 Ti first with a Core i7-4790K and then with a Core i3-4130. As we can see, the AMD card loses much more performance when paired with the much less powerful processor.

Far Cry 4

Ryse: Son of Rome

Call of Duty: Advanced Warfare

The key to this lies in the "command queues", or command lists, under DirectX 11. Put very simply, AMD graphics cards take all the draw calls made to the API and process them on a single processor core. This makes them very dependent on the single-threaded power of the processor, so they suffer greatly when paired with a processor that is weaker per core. This is why AMD's graphics cards suffered so much with AMD FX processors, which are much less powerful per core than Intel's.

Nvidia, by contrast, takes the draw calls made to the API and divides them among the different processor cores. The load is distributed, the processor is used much more efficiently, and performance depends far less on the power of each individual core. As a consequence, AMD suffers much more overhead than Nvidia under DirectX 11.
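The difference between the two approaches can be sketched with a toy model. This is only an illustration under strong assumptions: draw-call submission splits perfectly across threads with no synchronization cost, and the frame rate is capped by whichever of the CPU or GPU is slower. The function name, the parameters, and every number are hypothetical and do not describe any real driver.

```python
def frame_rate(draw_calls: int, cost_us: float, core_speed: float,
               cores: int, submit_threads: int, gpu_fps_cap: float) -> float:
    """Toy estimate of FPS limited by CPU draw-call submission or by the GPU.

    cost_us        -- CPU cost of one draw call in microseconds at core_speed 1.0
    core_speed     -- relative per-core speed (1.0 = reference processor)
    submit_threads -- how many threads the driver spreads the calls across
    """
    threads = min(submit_threads, cores)  # cannot use more threads than cores
    cpu_ms = draw_calls * cost_us / core_speed / threads / 1000.0
    return min(gpu_fps_cap, 1000.0 / cpu_ms)


# Hypothetical workload: 5000 draw calls per frame, 4 us each, GPU good for 60 FPS.
workload = dict(draw_calls=5000, cost_us=4.0, gpu_fps_cap=60.0)
fast_cpu = dict(core_speed=1.0, cores=4)
slow_cpu = dict(core_speed=0.5, cores=2)

for label, threads in (("single-threaded", 1), ("multi-threaded", 4)):
    fast = frame_rate(submit_threads=threads, **workload, **fast_cpu)
    slow = frame_rate(submit_threads=threads, **workload, **slow_cpu)
    print(f"{label}: fast CPU {fast:.0f} FPS, slow CPU {slow:.0f} FPS")
```

In this model the single-threaded path is already CPU-bound on the fast processor and loses half its frame rate on the slow one, while the multi-threaded path stays at the GPU limit on the fast processor and loses far less on the slow one.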

Checking this is very simple. We only have to monitor an AMD and an Nvidia graphics card running the same game on the same processor, and we will see that with Nvidia all the cores are loaded in a much more balanced way.

This overhead problem is fixed under DirectX 12, and that is the main reason AMD graphics cards see a huge performance gain when going from DirectX 11 to DirectX 12. In the following chart we can see that, under DirectX 12, performance is no longer lost between a quad-core processor and a dual-core one.

And why doesn't AMD do the same as Nvidia?

Nvidia's implementation of command queues in DirectX 11 is very expensive, requiring a large investment of money and human resources. AMD has been in a bad financial situation, so it does not have the same resources to invest as Nvidia. In addition, the future lies with DirectX 12, where there is no such overhead problem, since the API itself manages command queues in a much more efficient way.

The Nvidia approach also has the drawback of depending much more on driver optimization, which is why Nvidia is usually the first to release new drivers every time an important game comes to market, although AMD has gotten its act together on this lately. AMD's approach has the advantage of depending much less on drivers, so its cards do not need new versions as urgently as Nvidia's. This is one of the reasons why Nvidia's graphics cards age worse over time, once they are no longer supported.

And what about Asynchronous Shaders?

There has also been a lot of talk about Asynchronous Shaders. On this point we only have to say that they have been given a lot of importance when, in reality, overhead matters much more and is far more decisive for the performance of the graphics card. Nvidia also supports them, although its implementation is much simpler than AMD's. The reason is that its Pascal architecture works much more efficiently, so it does not need Asynchronous Shaders as much as AMD does.

AMD's graphics cards include ACEs, hardware engines dedicated to asynchronous computing. This hardware takes up space on the chip and consumes energy, so its implementation is not a whim but a response to a major deficiency of AMD's Graphics Core Next architecture with geometry. The AMD architecture is very inefficient at distributing the workload among the different Compute Units and the cores that form them, which means that many cores are left without work and are therefore wasted. What ACEs and Asynchronous Shaders do is "give work" to these idle cores so that they can be put to use.

On the other side we have Nvidia's graphics cards based on the Maxwell and Pascal architectures. These are much more efficient with geometry, and their number of cores is much lower than that of AMD's graphics cards. This makes the Nvidia architecture much more efficient at dividing up the work, so not as many cores are wasted as in AMD's case. Asynchronous Shaders in Pascal are implemented in software, since a hardware implementation would provide almost no performance advantage while adding to the size of the chip and its energy consumption.
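The reasoning above can be illustrated with a rough occupancy model. This is a sketch under simplifying assumptions, not a description of how either architecture schedules work: graphics work keeps a fixed fraction of the cores busy, and asynchronous compute can fill the idle remainder at no extra cost. The function, the utilization figures, and the timings are all hypothetical.

```python
def async_gain_percent(graphics_ms: float, compute_ms: float,
                       utilization: float) -> float:
    """Toy estimate of the speed-up from overlapping compute with graphics.

    graphics_ms -- time the graphics work takes per frame
    compute_ms  -- time the compute work would take with the whole chip to itself
    utilization -- fraction of cores kept busy by graphics work (0..1)
    """
    serial_ms = graphics_ms + compute_ms           # compute runs after graphics
    idle_capacity = graphics_ms * (1.0 - utilization)
    leftover = max(0.0, compute_ms - idle_capacity)  # compute that did not fit
    overlapped_ms = graphics_ms + leftover
    return (serial_ms / overlapped_ms - 1.0) * 100.0


# Hypothetical frame: 10 ms of graphics work plus 2 ms of compute work.
for label, utilization in (("many idle cores", 0.75), ("few idle cores", 0.95)):
    gain = async_gain_percent(10.0, 2.0, utilization)
    print(f"{label}: about {gain:.1f}% faster with asynchronous compute")
```

Under these assumptions, the chip that leaves a quarter of its cores idle gains about 20%, while the chip that already keeps 95% of them busy gains only a few percent, which is consistent with the argument that a better-balanced architecture has less to gain from Asynchronous Shaders.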

The following chart shows the performance gain of AMD and Nvidia in 3DMark Time Spy with Asynchronous Shaders: