The Snapdragon 865 Performance Preview: Setting the Stage for Flagship Android 2020
by Andrei Frumusanu on December 16, 2019 7:30 AM EST
CPU Performance & Efficiency: SPEC2006
We’re moving on to SPEC2006, analysing the single-threaded performance of the new Cortex-A77 cores. As the new CPU is running at the same clock as the A76-derived design of the Snapdragon 855, any improvements we’ll be seeing today are likely due to the IPC improvements of the core, the doubled L3 cache, as well as the enhancements to the memory controllers and memory subsystem of the chip.
Disclaimer About Power Figures Today:
The power figures presented today were captured using the same methodology we generally use on commercial devices; however, this year we’ve noted a large discrepancy between the figures reported by the QRD865’s fuel-gauge and the actual power consumption of the device. Generally, we’ve noted a discrepancy factor of roughly 3x. We’ve reached out to Qualcomm and they confirmed in a quick test that there’s a discrepancy of >2.5x. Furthermore, the QRD865 phones this year again suffered from excessive idle power figures of >1.3W.
I’ve attempted to compensate for this in the data as best I could; however, the figures published today are merely preliminary and of lower confidence than usual. For what it’s worth, last year the QRD855 data was within 5% of the commercial phones’ measurements. Naturally, we’ll be re-testing everything once we get our hands on final commercial devices.
In the SPECint2006 suite, we’re seeing noticeable performance improvements across most of the suite, with some benchmarks posting larger than expected increases. The biggest improvements are seen in the memory-intensive workloads. 429.mcf is DRAM latency bound and sees a massive improvement of up to 46% compared to the Snapdragon 855.
What’s interesting to see is that some execution-bound benchmarks such as 456.hmmer are seeing a 28% uplift. The A77 has an added 4th ALU, which represents a 33% throughput increase in simple integer operations, and I don’t doubt that this is a major reason for the improvements seen here.
The improvements aren’t uniform, with 400.perlbench in particular even seeing a slight degradation for some reason. 403.gcc also saw a smaller 12% increase; it’s likely these benchmarks are bound by other aspects of the microarchitecture.
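As a rough sanity check on the ALU argument, the ideal uplift from going from three to four ALUs can be set against the measured 456.hmmer gain. This is only a back-of-the-envelope sketch: it assumes perfectly ALU-bound code that scales linearly with ALU count, and the function name is hypothetical.

```python
def ideal_alu_uplift(old_alus: int, new_alus: int) -> float:
    """Peak throughput gain if simple integer work scaled linearly with ALU count."""
    return new_alus / old_alus - 1.0


# Figures from the text: a 4th ALU added (so 3 before), 28% measured on 456.hmmer.
peak = ideal_alu_uplift(3, 4)
measured = 0.28

print(f"ideal uplift:    {peak:.1%}")
print(f"measured uplift: {measured:.1%}")
print(f"share of ideal:  {measured / peak:.0%}")
```

Real workloads are never purely ALU-bound, so landing somewhat below the ideal figure is what one would expect.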
The power consumption and energy efficiency, if the numbers are correct, roughly match our expectations of the microarchitecture. Power has gone up with performance, but because of the higher performance and shorter runtime of the workloads, energy usage has remained roughly flat. In fact, in several tests efficiency has actually improved compared to the Snapdragon 855, but we’ll have to wait for commercial devices in order to draw definitive conclusions here.
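The relationship between power, runtime and energy can be made concrete with a small sketch. Energy for a fixed workload is average power multiplied by runtime, and runtime shrinks in proportion to the performance gain. The power-gain inputs below are hypothetical illustration values, not measurements from this article, and the function name is an assumption.

```python
def relative_energy(perf_gain: float, power_gain: float) -> float:
    """Energy of the new chip relative to the old one for the same workload.

    Energy = average power * runtime. Runtime scales as 1 / (1 + perf_gain),
    average power scales as (1 + power_gain).
    """
    return (1.0 + power_gain) / (1.0 + perf_gain)


# Hypothetical case 1: power rises exactly as much as performance, energy is flat.
print(relative_energy(perf_gain=0.25, power_gain=0.25))

# Hypothetical case 2: power rises less than performance, energy per workload drops.
print(round(relative_energy(perf_gain=0.25, power_gain=0.20), 3))
```

A result of 1.0 means energy use is unchanged, while anything below 1.0 means the faster chip finishes the same work on less energy despite drawing more power.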
In the SPECfp2006 suite, we’re also seeing some very varied improvements. The biggest change happened to 470.lbm, which has a very big hot loop and is memory bandwidth hungry. I think the A77’s new MOP-cache helps a lot here with regard to instruction throughput, and the improved memory subsystem makes the massive 65% performance jump possible.
Arm had actually advertised IPC improvements of ~25% and ~35% for the int and FP suites of SPEC2006. On the int side, we’re indeed hitting 25% on the Snapdragon 865 compared to the S855; however, on the FP side we’re a bit short, as the increase falls in at around 29%. The performance increases here strongly depend on the SoC and particularly on the memory subsystem. Compared to the Kirin 990’s A76 implementation, the increases here are only 20% and 24%, but HiSilicon’s chip also has a stronger memory subsystem, which allows it to gain quite a bit more performance than the A76 cores in the S855.
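The two sets of percentages also imply how far the Kirin 990’s A76 cores sit ahead of those in the Snapdragon 855. A minimal sketch of that arithmetic, using only the gains quoted above (the function name is hypothetical):

```python
def implied_ratio(gain_vs_a: float, gain_vs_b: float) -> float:
    """If chip X is gain_vs_a faster than A and gain_vs_b faster than B,
    return B's performance relative to A."""
    return (1.0 + gain_vs_a) / (1.0 + gain_vs_b)


# Snapdragon 865 gains quoted in the text: vs. S855, then vs. Kirin 990.
int_ratio = implied_ratio(0.25, 0.20)
fp_ratio = implied_ratio(0.29, 0.24)

print(f"Kirin 990 vs S855, SPECint2006: {int_ratio:.3f}x")
print(f"Kirin 990 vs S855, SPECfp2006:  {fp_ratio:.3f}x")
```

Both ratios come out at roughly 1.04, i.e. the quoted figures are consistent with the Kirin 990 being about 4% ahead of the S855 in both suites.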
The overall results for SPEC2006 are very good for the Snapdragon 865. Performance is exactly where Qualcomm advertised it would land, and we’re seeing a 25% increase in SPECint2006 and a 29% increase in SPECfp2006. On the integer side, the A77 still trails Apple’s Monsoon cores in the A11, but the new Arm design has now been able to trounce it in the FP suite. We’re still a bit far away from the microarchitectures catching up to Apple’s latest designs, but if Arm keeps up this 25-30% yearly improvement rate, we should be getting there in a few more iterations.
The power and energy efficiency figures, again taken with a grain of salt, are also very much in line with expectations. Power has slightly increased with performance this generation; however, due to the performance increase, energy efficiency has remained relatively flat, or has even seen a slight improvement.
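To see how a 25-30% yearly rate compounds over "a few more iterations", here is a minimal sketch. The size of the gap is a hypothetical input chosen purely for illustration, since no figure is given here, and the calculation treats the target as fixed even though Apple’s designs will keep improving as well.

```python
import math


def iterations_to_close(gap: float, yearly_gain: float) -> int:
    """Generations needed for compounded gains to cover a fixed performance gap.

    gap: target performance divided by current performance (hypothetical input).
    yearly_gain: per-generation improvement, e.g. 0.25 for 25%.
    """
    return math.ceil(math.log(gap) / math.log(1.0 + yearly_gain))


hypothetical_gap = 2.0  # assumption for illustration only, not a measured figure
for rate in (0.25, 0.30):
    n = iterations_to_close(hypothetical_gap, rate)
    print(f"{rate:.0%} per generation: {n} generations")
```

Against a moving target the real number of generations would be higher, which is why the estimate should be read as a lower bound under these assumptions.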
178 Comments
joms_us - Tuesday, December 17, 2019 - link
Ah poorman's attempt to hide the truth. I feel sorry for those buying a phone (even replacing a desktop) because they see it flying with colors in SPEC.
Andrei Frumusanu - Tuesday, December 17, 2019 - link
You're just a blabbering idiot. You keep pulling things out your ass, nobody ever said A9 is faster than Ryzen or Skylake, I dare you find a quote or data that says that. The A13 was the first to *match* them.
The test you quote isn't ST like the SPEC results, and it's not even a full CPU test as it has API components.
joms_us - Tuesday, December 17, 2019 - link
Ahh the irony... Let's see who is the blabbering !d!ot here.
You reminded us on who the IPC gorila is...
https://twitter.com/andreif7/status/11569659188089...
There it shows A13 and even A9 stomping the latest and greatest Ryzen and Skylake processors
But then when you compare the A13 versus the Android SoC in various apps and websites, it is the complete opposite.
I respect you because you have an excellent knowledge in what you do but it comes down to the toilet drain once your critical thinking is subpar and you are shadowed with your ego that you think yours and only yours speak the truth. I would not hesitate to hire you as my design engineer really but you have to back your claims with facts. When you state one is the fastest (especially by huge margin), it has to reflect in any test that you throw at it.
I would rest my case if you can convince Lisa or Bob that their processors are mediocre compared to Apple's latest SoC LOL.
Andrei Frumusanu - Tuesday, December 17, 2019 - link
That tweet is about IPC of the microarchitectures, not absolute performance.
You literally have absolutely not a single whim of understanding of what's going on here and keep making a complete utter fool of yourself repeating lies, all you see is a bar graph being bigger than the other and suddenly that's your whole basis on the truth of the world.
The actual engineers and architects in the industry very well know where they lie in relation to what Apple's doing; I don't need to convince anybody.
joms_us - Tuesday, December 17, 2019 - link
No, you just told the whole world, that the fastest chip on the planet is the Apple SoC. A chip with great IPC will give great performance result, right? Your graph is telling us, a 1Ghz A12x core is equivalent to a 2Ghz Ryzen core which is utter BS. When AMD or Intel announce that their next processor has 20% IPC improvement, it does show in any tool/benchmark or app you throw at it not the opposite.
Your tests methodology/tools are completely flawed and outdated as they don't translate to real world results. They are great though if you are comparing two similar platforms.
Andrei Frumusanu - Tuesday, December 17, 2019 - link
> No, you just told the whole world, that the fastest chip on the planet is the Apple SoC
I did not. High IPC doesn't just mean it's the fastest overall. AMD and Intel still have a slight lead in overall performance.
> A chip with great IPC will give great performance result, right?
As long as the clock-rate also is high enough, yes.
> Your graph is telling us, a 1Ghz A12x core is equivalent to a 2Ghz Ryzen core
That's exactly correct. Apple currently has the highest IPC microarchitecture in the industry by a large margin.
> which is utter BS.
The difference between you and me is that I actually have a plethora of data to back this up, actual instruction counter data from the performance counters, actual tests that show that Apple's µarch is in fact 50% wider than anything else out there.
You are doing absolutely nothing than spewing rubbish comments with absolutely zero understanding of the matter. You have absolutely nothing to back up your claims about flawed and outdated methodologies, while I have the actual companies who design these chips agreeing with the data I present.
arsjum - Wednesday, December 18, 2019 - link
Andrei,
As a member of Anandtech staff, you should be better than this. This is not an XDA forum.
Come on.
LordConrad - Tuesday, December 17, 2019 - link
Now if Samsung could just increase the anemic L2 cache. I want 1MB per A7x core and 512KB per A5x core.
yankeeDDL - Tuesday, December 17, 2019 - link
It is truly disappointing that Android HW needs to run on SoC with the performance of the iPhone 3-4 generations older.
I really don't understand with all the demand there is, why nobody comes up with something at least within the range of Apple's SoC.
Wilco1 - Tuesday, December 17, 2019 - link
You mean 2 generations behind at most on SPEC. And while interesting technically, it remains debatable how much that actually matters in actual phone use (where having fast SSD, download speeds and a lot of memory can help more). As well as having ~20% better power efficiency of course.
It would be relatively easy to quadruple L2 to 1MB, L3 to 8MB and system cache to 16MB and get ~20% performance gain on SPEC. The area would be much larger and hence the cost of the SoC which would add to the cost of phones. QC's competitors would be happy to increase their market share with far cheaper SoCs which are equally fast in real-world usage.