Battlefield 4 Mantle Performance Preview
by Ryan Smith on February 1, 2014 12:40 PM EST
After a false start or two, AMD is finally getting the first beta of Mantle out the door. With EA DICE having shipped their Mantle patch for Battlefield 4 and developer Oxide having released their Star Swarm technical demo, the first Mantle-enabled applications have landed. Meanwhile AMD for their part is still hammering out an installation issue on their new Mantle-enabled Catalyst drivers, which has led to them missing their previously scheduled January release date.
In the interim, AMD has released a slightly finickier set of drivers to the press for us to play around with ahead of the public Mantle driver release. These drivers should be identical to the public drivers in both function and performance; they just have an outstanding installation bug that requires a workaround, something that AMD doesn’t want in the shipping version. AMD hasn’t provided a public release date for these drivers – at this point it’s in their best interest to avoid providing release dates they don’t know if they can keep – but given that this is the sole showstopper issue in our press drivers, we certainly don’t expect they’ll take much longer.
In any case, we’re hard at work at the moment putting together our full evaluation of this first version of Mantle. That article won’t be ready until next week, but in the meantime given the immense interest in Mantle, we wanted to quickly publish our first batch of numbers for Battlefield 4. We will have a much wider selection of benchmarks for our full article, including many more video cards and results for Star Swarm, but we wanted to quickly bring you what’s almost certainly going to be the most interesting set of data: Mantle performance with a high-end video card.
For that we’re turning to AMD’s Radeon R9 290X, testing the performance of that card under both Direct3D and Mantle in EA’s Battlefield 4. Battlefield 4 is Mantle’s showcase title and accordingly the first real world use case for AMD’s new API, making it the best place to start. As an application retrofitted with Mantle support we don’t expect Battlefield 4 to tap the complete potential of Mantle right out of the door – certainly not when the Mantle SDK and driver stack itself is still in development – but it can give us an idea of what kind of performance gains we can expect if developers chase the low-hanging fruit offered by Mantle.
What is that low-hanging fruit? For the most part that is going to be CPU bottlenecks, specifically bottlenecking in issuing draw calls. Of all of the bottlenecks that can impact a high performance GPU, keeping it fed can be the biggest bottleneck, and in turn bottlenecking in the draw call submission phase can be the biggest culprit. In the long term Mantle will also benefit GPU performance more directly by optimizing workflows within a GPU, and we already see a small bit of that today in Battlefield 4, but the bulk of the optimizations for these earliest titles have been made around the draw call bottleneck.
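To make the draw call argument concrete, here is a minimal sketch of a toy model in Python. It assumes the CPU and GPU work in parallel, so the slower of the two sets the frame time. The function names and every number in it (draw calls per frame, CPU cost per draw call, GPU render time) are hypothetical and chosen purely for illustration; none of them are measurements from Battlefield 4.

```python
# Toy model: a frame is finished only when both the CPU (submitting draw
# calls) and the GPU (rendering them) are finished, so the slower side
# sets the pace. All numbers are hypothetical, not measured.

def frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms):
    """Frame time when CPU submission and GPU rendering overlap."""
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0
    return max(cpu_ms, gpu_ms)

def gain_percent(before_ms, after_ms):
    """Framerate gain in percent when frame time drops from before to after."""
    return (before_ms / after_ms - 1.0) * 100.0

DRAW_CALLS = 5000   # hypothetical draw calls per frame
OLD_COST_US = 4.0   # hypothetical CPU cost per draw call, higher-overhead API
NEW_COST_US = 2.0   # hypothetical CPU cost per draw call, lower-overhead API

# GPU-bound case: the GPU needs 25 ms, so halving CPU cost changes nothing.
gpu_bound_gain = gain_percent(
    frame_time_ms(DRAW_CALLS, OLD_COST_US, 25.0),
    frame_time_ms(DRAW_CALLS, NEW_COST_US, 25.0),
)

# CPU-bound case: the GPU needs only 16 ms, so the CPU was the limit.
cpu_bound_gain = gain_percent(
    frame_time_ms(DRAW_CALLS, OLD_COST_US, 16.0),
    frame_time_ms(DRAW_CALLS, NEW_COST_US, 16.0),
)

print(gpu_bound_gain, cpu_bound_gain)
```

In this model cheaper draw calls only pay off when the CPU was the limiting side to begin with, and even then the gain is capped once the GPU becomes the new bottleneck. The model deliberately ignores any GPU-side optimizations, so it will understate gains in purely GPU-bound scenarios.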
For our Mantle preview we’re taking a look at two sections of the Battlefield 4 single player game, the first being from the Tashgar mission and the second being from the South China Sea mission. As was the case with Battlefield 3 the use of single player is less than ideal, but as Battlefield 4 lacks a formal benchmark or for that matter the ability to record multiplayer matches, we’re left with single player if we want to have reasonably repeatable benchmarks. And we’ll definitely want a high degree of repeatability if we’re to be able to distinguish Mantle gains from variability in GPU bound scenarios.
Meanwhile to cover a wider spectrum of possibilities, we’re running our 290X against 3 CPU configurations on our GPU testbed. The first is our standard configuration, which is our i7-4960X with all cores and Hyper-Threading enabled (6C/12T), running at 4.2GHz. Our second configuration drops that down to 4C/4T at 2GHz, to test for the benefits of Mantle on a still relatively large core count at lower clockspeeds. Our final configuration takes the core count down further, to 2C/4T at 3GHz, so that we can see what performance is like for processors with fewer cores but higher clockspeeds.
Finally, on a quick note, for measuring Battlefield 4's performance we're using the game's newly built in PerfOverlay.FrameFileLogEnable feature, which replaces FRAPS in this game due to the fact that FRAPS only works with Direct3D and OpenGL. FrameFileLogEnable logs frame times for later analysis, and from this we can reconstruct the minimum and average framerates, and even the full frame pacing performance of the game (but only from the perspective of the game, not the video card). Today we'll be looking at just the average framerates, but be sure to come back next week for our full evaluation, where we'll have frame pacing data and minimum framerates ready to go.
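As a rough sketch of how framerates can be reconstructed from logged frame times, the Python below takes a plain list of per-frame times in milliseconds. The function name and the sample values are hypothetical, and the sketch deliberately does not parse the game's log file, since we are not describing its format here. It also assumes that "minimum framerate" means the instantaneous framerate of the single slowest frame, which is only one of several possible definitions.

```python
def summarize_frame_times(frame_times_ms):
    """Return (average_fps, minimum_fps) for a list of frame times in ms.

    Average framerate is frames rendered divided by total elapsed time.
    Minimum framerate is taken here as the rate implied by the slowest frame.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    total_ms = sum(frame_times_ms)
    average_fps = 1000.0 * len(frame_times_ms) / total_ms
    minimum_fps = 1000.0 / max(frame_times_ms)
    return average_fps, minimum_fps

# Hypothetical sample: three 10 ms frames and one 20 ms hitch.
sample = [10.0, 10.0, 20.0, 10.0]
average_fps, minimum_fps = summarize_frame_times(sample)
print(average_fps, minimum_fps)
```

Note that the average is computed from total elapsed time rather than by averaging per-frame framerates, which would overweight short frames.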
CPU: | Intel Core i7-4960X @ 4.2GHz |
Motherboard: | ASRock Fatal1ty X79 Professional |
Power Supply: | Corsair AX1200i |
Hard Disk: | Samsung SSD 840 EVO (750GB) |
Memory: | G.Skill RipjawZ DDR3-1866 4 x 8GB (9-10-9-26) |
Case: | NZXT Phantom 630 Windowed Edition |
Monitor: | Sharp PN-K321 |
Video Cards: | AMD Radeon R9 290X (Uber) |
Video Drivers: | AMD Catalyst 14.1 Beta |
OS: | Windows 8.1 Pro |
SP-Tashgar
Our first test comes from the Tashgar mission, and is the benchmark we will be using for day-to-day GPU benchmarking. This benchmark takes place immediately at the start of the mission, with our character driving out of the mountains and into the city of Tashgar. This benchmark has a limited CPU load and is GPU-bound in most situations, which potentially limits the benefits of Mantle in alleviating CPU bottlenecks, but gives us an idea of what kind of performance benefits we can expect in GPU-bound scenarios.
Battlefield 4 Tashgar: Mantle Performance Gains
CPU Configuration | Ultra | High | Low |
i7-4960X 6C/12T @ 4.2GHz | 8% | 10% | -14% |
i7-4960X 4C/4T @ 2GHz | 8% | 13% | 26% |
i7-4960X 2C/4T @ 3GHz | 8% | 13% | 28% |
Even at 1080p Ultra, where the Radeon R9 290X is clearly GPU-bound, we can see that switching to Mantle offers some performance improvements. With our i7-4960X fully powered up, this leads to an 8% performance increase, and we see similar performance increases even with other CPU configurations. Since we don’t appear to be CPU-bound in any appreciable way, this gives us a decent idea of what kind of GPU performance benefits Mantle can offer.
Meanwhile if we switch to High and Low settings, the higher framerates are able to tease out the CPU benefits of Mantle. With BF4’s High settings this is 10-13% depending on the CPU configuration, which indicates we’re still significantly GPU bound here.
Using Low quality settings on the other hand significantly widens the gap in both directions, with the minimum gain being -14%, and the maximum gain being 28%. In the case of our 6C/12T CPU configuration, Mantle actually has a detrimental impact on performance, bringing down our framerate from a positively absurd 216fps to a slightly less absurd 181fps. This was unexpected to say the least. We’re not particularly concerned about it, given that we have little reason to use this setting in day-to-day gaming, but it does point to a weakness in the current builds of BF4 and the Mantle drivers.
Otherwise if we move to our slower CPU configurations, the benefits are 26% and 28% for 4C/4T and 2C/4T respectively. Despite the fact that the 4C/4T setup has more real cores to work with, which under normal circumstances would be the stronger setup for a highly threaded application, it’s the 2C/4T setup that technically benefits the most. The difference is quite small, but it’s an interesting outcome none the less.
SP-South China Sea
Our second test comes from the South China Sea mission of Battlefield 4, where our character and his squad are on the quickly disintegrating USS Titan. Whereas our first test is rather uniformly GPU-bound, the breakup of the USS Titan offers us the chance to look at a more CPU-bound scenario. Even this scene isn’t exclusively CPU-bound, but with ship parts and other debris flying around everywhere, it’s going to be one of the more strenuous CPU workloads in the single player game.
Battlefield 4 South China Sea: Mantle Performance Gains
CPU Configuration | Ultra | High | Low |
i7-4960X 6C/12T @ 4.2GHz | 7% | 8% | 7% |
i7-4960X 4C/4T @ 2GHz | 10% | 26% | 17% |
i7-4960X 2C/4T @ 3GHz | 10% | 30% | 28% |
Starting once again at 1080p Ultra, even with the greater CPU workload presented by this test, we are unsurprisingly still GPU-bound on Ultra settings. The benefits aren’t as uniform as last time – they now range from 7% to 10% – but it’s safe to say that we’re once again seeing what are mostly the GPU performance benefits of Mantle.
However shifting to High quality shows much greater performance gains, indicating that we’re at least partially (if not fully) CPU-bound here. Once we reduce our CPU performance from 6C/12T to 4C/4T, the performance gains from using Mantle jump from 8% to 26%, and then to 30% when using our 2C/4T configuration. For a game that’s not immensely CPU bound in the first place and has been retrofitted for Mantle, this is towards the upper bound of what we would expect.
Finally, switching over to our Low quality settings causes our performance gains to actually taper off some. We’re still CPU-bound on our 4C/4T setup, leading to a 17% performance gain, but apparently we’re not as CPU-bound as we were at High quality settings. Meanwhile the performance gains for 2C/4T remain similar to last time, at 28%. Battlefield 4 has multiple CPU tasks going on here, not the least of which is the simulation itself, so in the case of our 4C/4T setup it’s likely we’ve stumbled onto a situation where the game is more strongly CPU-bound by the simulation and other aspects of the game than it is by the submission of draw calls.
First Thoughts
As this is only a brief preview of our results we don’t intend to read too much into this limited data set, but even just looking at the 290X does provide us with some interesting data. For the pure high-end scenario – a 290X or similar GPU with a high-end CPU – Mantle can still offer performance benefits from the GPU workflow optimizations it provides. A 7-10% performance increase is not a dramatic difference, but it is 7-10% better performance than AMD had yesterday.
Meanwhile it comes as little surprise that the greatest performance benefits in our limited BF4 testing come in the mixed performance scenarios, pairing up a high-end GPU with slower CPUs. Since the lowest hanging fruit for Mantle optimizations is going to be CPU draw call bottlenecks, it’s going to be the weaker CPUs that have the most to gain here. In this case we still need to go out of our way to create CPU-bound scenarios – the 290X is rarely held back by the CPU on Ultra quality settings – but when we do create them we can see some of the potential that Mantle can offer. At High and Low quality settings, and excluding our one Mantle performance regression, we see performance gains anywhere between 7% and 30%. This shows (if nothing else) that even a retrofit game with a highly optimized Direct3D rendering path, like Battlefield, can still be bottlenecked by draw call performance, and consequently that some of Mantle’s CPU overhead reduction capabilities do in fact pan out.
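For readers who want to check the ranges quoted in this section against the two results tables, the short Python sketch below transcribes the published percentages and pulls out the minimum and maximum for each group of settings. The dictionary layout and the helper name are our own choices for illustration; the numbers are the table values as printed.

```python
# Mantle gains in percent, transcribed from the two tables in this preview.
GAINS = {
    "Tashgar": {
        "6C/12T @ 4.2GHz": {"Ultra": 8, "High": 10, "Low": -14},
        "4C/4T @ 2GHz": {"Ultra": 8, "High": 13, "Low": 26},
        "2C/4T @ 3GHz": {"Ultra": 8, "High": 13, "Low": 28},
    },
    "South China Sea": {
        "6C/12T @ 4.2GHz": {"Ultra": 7, "High": 8, "Low": 7},
        "4C/4T @ 2GHz": {"Ultra": 10, "High": 26, "Low": 17},
        "2C/4T @ 3GHz": {"Ultra": 10, "High": 30, "Low": 28},
    },
}

def gains_for(settings, skip_regressions=False):
    """Collect gains for the given quality settings across both tests."""
    return [
        gain
        for per_cpu in GAINS.values()
        for per_setting in per_cpu.values()
        for setting, gain in per_setting.items()
        if setting in settings and not (skip_regressions and gain < 0)
    ]

ultra = gains_for({"Ultra"})
high_low = gains_for({"High", "Low"}, skip_regressions=True)

print("Ultra range:", min(ultra), max(ultra))
print("High/Low range, regression excluded:", min(high_low), max(high_low))
```

The Ultra results span 7% to 10%, and the High and Low results span 7% to 30% once the single regression is set aside, matching the figures quoted above.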
Whether all of this is worth the costs and tradeoffs of Mantle, from both a consumer perspective and a developer perspective, is a longer discussion that we’ll be having next week, alongside our expanded benchmark results. But at first glance it looks like AMD has cleared the first hurdle, which is showcasing that there are tangible benefits to having a low-level graphics API. Now AMD just needs to further hammer out their Mantle drivers and get them into a publicly consumable state, so that the wider community of end-users can test and evaluate AMD’s Mantle offering. Outside of the known installation issue we have not encountered any issues with Mantle thus far – this being despite the fact that AMD is being very explicit about the beta nature of the Mantle stack – so hopefully this is a good omen for the company after the delays leading up to this point.
AMD's Official Performance Data
Finally, we’ll quickly close with some of AMD’s performance numbers, which they’ve published in their reviewer’s guide. We feel that vendor-provided numbers should always be taken with a grain of salt, but they do serve their purpose, especially for getting an idea of what performance is like under a best case scenario. To that end we can quickly see that AMD was able to top out at a 41% performance improvement on a 290X paired with an A10-7700K. This is a greater performance gain than the peak gain of 30% we’ve seen in our own results, but not immensely so. More importantly it can give us a good idea of what to reasonably expect for performance under Battlefield 4. If AMD’s results are accurate, then a 40% performance improvement is the most we should be expecting out of Battlefield 4’s Mantle renderer.
135 Comments
chizow - Sunday, February 2, 2014 - link
Not really, 7-10% is what one might expect from a big driver update or optimization, not a 2 year project to build an API from scratch for a small % of vendor-specific hardware.
If AMD feels this is how to best use their resources to the benefit of their customers, more power to them. I think as an Nvidia user I am much more interested in Nvidia's efforts to work within existing APIs like DX and OpenGL and focusing their close-to-metal efforts in hardware, ie. their embedded Denver core in upcoming Maxwell. Another example would be G-Sync, which focuses on improving frame quality especially at low FPS, which would provide a benefit in every game instead of just a handful that require specific hardware, API, or dev support of that API.
I certainly don't want the industry to move to multiple vendor specific APIs, that does no one any good and would certainly be step backwards to the days every game had to support multiple vendor-specific APIs and codepaths.
mikato - Wednesday, February 5, 2014 - link
Would Nvidia users discourage Nvidia from developing a new API for a 30% gain in CPU limited situations? Hell no. Would Nvidia do it? Yes probably since that's a pretty big performance advantage. However they don't make CPUs so they don't have the demand on both sides like AMD does.
You just cut out a small slice of the picture (of course the minimum performance advantage), and used that to argue against it, and even then I don't agree.
chizow - Wednesday, February 5, 2014 - link
Would I discourage Nvidia from developing an API for larger gains in unlikely scenarios??? Absolutely! It's a big performance gain in scenarios that are unlikely to occur (slow CPU + fast GPU) or unlikely to majorly benefit in the real world (low resolution/settings or multi-GPU already at high framerates).
I would certainly prefer the alternative that Nvidia took, developing new technologies like G-Sync that can improve frame and image quality at ANY FPS in ANY game that uses Nvidia hardware.
There's also growing evidence that Nvidia took the steps to improve their drivers in existing APIs that AMD did not, mainly with support of Deferred Contexts and Command Lists in their DX11 Multi-threaded Rendering implementation. It has been known for some time AMD does not support these features while Nvidia does and the evidence is clear, even with Mantle, comparable Nvidia hardware running DX11.2 MTR (Win8.1) is faster.
http://techreport.com/review/25995/first-look-amd-...
Gasaraki88 - Thursday, February 6, 2014 - link
Hate fanboys like you. So far it seems like Mantle works best in certain scenarios. If I have to run my games in low settings to get a 25%+ increase in FPS, who cares. 180FPS+ to 210FPS+ is nothing anyone should care about. What matters is good cpu with good card and if they can increase performance by 25+% then that's good. 7-10% is not good enough to have the developers just spend time to make a Mantle API. Developers are not even using DX11 fully yet, it will take them forever to use Mantle. If AMD can't make Mantle API also run on nVidia hardware then it's useless.
Slomo4shO - Saturday, February 1, 2014 - link
Thank you Ryan, the Battlefield 4 Tashgar benchmarks at 4.2GHz seem to be an anomaly... Did you transpose the numbers?
Ryan Smith - Saturday, February 1, 2014 - link
I'm assuming you're referring to the performance regression with low quality settings. No, that number has been double checked.
Slomo4shO - Saturday, February 1, 2014 - link
Yes, I was referring to the low quality settings. Hmm those are odd results...
rtsurfer - Saturday, February 1, 2014 - link
Can we expect a proper Review with AMD FX line CPUs & APUs & Nvidia 780 & 780 Ti sometime later this week..??
bj_murphy - Saturday, February 1, 2014 - link
Can't do a 780/780Ti Mantle review, it doesn't run on Nvidia cards. However, I agree with you on hoping for some AMD FX series CPUs as well as APUs compared here.
Fetzie - Saturday, February 1, 2014 - link
You can, however, say "nVidia cards got x result in this benchmark. AMD cards with DX11 got y result in this benchmark. AMD cards with Mantle got z result in this benchmark". So long as the benchmarks remain the same, you can still compare performance numbers just the same as comparisons up until now between AMD and nVidia cards.