CPU Benchmarks

The dynamics of CPU Turbo modes, both Intel and AMD, can cause concern in environments with variably threaded workloads. There is also the added issue of keeping results consistent across motherboards, depending on how each motherboard manufacturer layers its own boosting technologies over the ones that Intel would prefer they used. In order to remain consistent, we implement a unique OS-level high performance mode on all the CPUs we test, which should override any motherboard manufacturer performance mode.

HandBrake v0.9.9

For HandBrake, we take two videos (a 2h20 640x266 DVD rip and a 10min double UHD 3840x4320 animation short) and convert them to H.264 (via the x264 encoder) in an MP4 container. Results are given in frames per second processed, and HandBrake uses as many threads as possible.
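As a rough illustration of how a frames-per-second figure like this can be derived, the sketch below divides the total frame count by the wall-clock encode time. The function name, the 30 fps source rate, and the 400 second encode time are all assumptions made up for the example; they are not measurements from this review.

```python
def frames_per_second(total_frames: int, elapsed_seconds: float) -> float:
    """Average encode throughput over a whole conversion."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_frames / elapsed_seconds


# Hypothetical run: a 10-minute clip at an assumed 30 fps source rate,
# encoded in an assumed 400 seconds of wall-clock time.
clip_frames = 10 * 60 * 30
print(round(frames_per_second(clip_frames, 400.0), 2))
```

Averaging over the whole run smooths out scenes that encode faster or slower than others, which is why a single number per video is usable for comparison.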

HandBrake v0.9.9 LQ Film

Low quality conversion favors faster individual cores, so the W processor does well here thanks to its higher full-load frequency. Nonetheless, the fast consumer-grade processors win by a large margin.

HandBrake v0.9.9 2x4K

In full double-4K mode, the balance of cores, frequency and architecture upgrade puts the E5-2687W v3 above the 12-core E5-2697 v2.

Agisoft PhotoScan – 2D to 3D Image Manipulation

Agisoft PhotoScan creates 3D models from 2D images, a process which is very computationally expensive. The algorithm is split into four distinct phases, and different phases of the model reconstruction require either fast memory, high IPC, more cores, or even OpenCL compute devices to hand. Agisoft supplied us with a special version of the software to script the process, where we take 50 images of a stately home and convert them into a medium quality model. This benchmark typically takes around 15-20 minutes on a high end PC on the CPU alone, with GPUs reducing the time.

Agisoft PhotoScan Benchmark - Total Time

Dolphin Benchmark

Many emulators are bound by single thread CPU performance, and general reports have tended to suggest that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that raytraces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy for the speed of Dolphin CPU emulation, which is an intensive single core task using most aspects of a CPU. Results are given in minutes, where the Wii itself scores 17.53 minutes.
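To put a result in context against the 17.53 minute Wii reference, one option is to express it as a speedup factor. The sketch below assumes the figures are decimal minutes; the helper name and the sample time are hypothetical and not taken from the charts.

```python
WII_MINUTES = 17.53  # reference time for the real console, quoted in the text


def speedup_over_wii(minutes: float) -> float:
    """How many times faster than the Wii a given run completed."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return WII_MINUTES / minutes


# Hypothetical result: a CPU that finishes in half the Wii's time.
sample_minutes = WII_MINUTES / 2
print(f"{speedup_over_wii(sample_minutes):.2f}x the speed of the Wii")
```

Because lower times are better here, the reference goes in the numerator so that a larger factor means a faster CPU.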

Dolphin Emulation Benchmark

A single emulation instance benefits from a fast single core.

WinRAR 5.01

WinRAR 5.01, 2867 files, 1.52 GB

WinRAR seems to enjoy Haswell-EP over Ivy-EP, although it still needs a high frequency to achieve top speeds.

PCMark8 v2 OpenCL

A new addition to our CPU testing suite is PCMark8 v2, where we test the Work 2.0 suite in OpenCL mode. 

PCMark8 v2 Work 2.0 OpenCL with R7 240 DDR3

Hybrid x265

Hybrid is a new benchmark, where we take a 1500-frame 4K video and convert it to HEVC with the x265 encoder, without audio. Results are given in frames per second.

Hybrid x265, 4K Video

Hybrid also takes advantage of the new architecture, giving a 5% advantage to the E5-2687W v3 despite two fewer cores.
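A quick back-of-the-envelope check shows what a 5% lead with two fewer cores implies per core. The sketch assumes the comparison is against the 12-core E5-2697 v2 mentioned in the HandBrake results, which makes the E5-2687W v3 a 10-core part; the function name is hypothetical, and the result folds frequency and architecture gains together rather than separating them.

```python
def per_core_ratio(throughput_ratio: float, cores_a: int, cores_b: int) -> float:
    """Per-core throughput of chip A relative to chip B.

    throughput_ratio is (fps of A) / (fps of B) for the whole chip.
    """
    if cores_a <= 0 or cores_b <= 0:
        raise ValueError("core counts must be positive")
    return throughput_ratio * cores_b / cores_a


# 5% faster overall with 10 cores against 12 (assumed pairing, see above).
ratio = per_core_ratio(1.05, cores_a=10, cores_b=12)
print(f"per-core throughput ratio: {ratio:.2f}")
```

This treats the workload as scaling evenly across cores, which is a simplification; any threading overhead would shift the number.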

Cinebench R15

Cinebench R15 - Single Threaded

Cinebench R15 - Multi-Threaded

3D Particle Movement

3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC win in the single thread version, whereas the multithread version has to handle the threads and loves more cores.
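For readers unfamiliar with this kind of workload, the following is a minimal sketch of a 3D random-walk step of the sort used in Brownian motion simulations. It is not the 3DPM code, whose algorithms are not given here; the function names, step length, and particle counts are all assumptions chosen for illustration.

```python
import math
import random


def random_unit_step(rng):
    """Pick a direction uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)              # cosine of the polar angle
    phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuth
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def walk(steps, seed=0):
    """Move one particle `steps` unit-length steps from the origin."""
    rng = random.Random(seed)
    x = y = z = 0.0
    for _ in range(steps):
        dx, dy, dz = random_unit_step(rng)
        x, y, z = x + dx, y + dy, z + dz
    return (x, y, z)


def simulate(particles, steps):
    """Single-threaded run. Particles are independent of each other, so a
    multithreaded version could split this loop across workers."""
    return [walk(steps, seed=p) for p in range(particles)]


final = simulate(particles=100, steps=1000)
mean_dist = sum(math.dist((0.0, 0.0, 0.0), p) for p in final) / len(final)
print(f"mean distance from origin: {mean_dist:.1f}")
```

The inner loop is dominated by floating point math (square roots and trigonometry), which fits the observation that single-thread results track floating point performance, clock speed and IPC.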

3D Particle Movement: Single Threaded

3D Particle Movement: MultiThreaded

FastStone Image Viewer 4.9

FastStone is the program I use to perform quick or bulk actions on images, such as resizing, adjusting for color and cropping. In our test we take a series of 170 images in various sizes and formats and convert them all into 640x480 .gif files, maintaining the aspect ratio. FastStone does not use multithreading for this test, and results are given in seconds.
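The aspect-ratio-preserving resize described above comes down to choosing one scale factor that fits both dimensions inside the 640x480 box. This sketch shows that arithmetic only; it is not FastStone's implementation, the function name is hypothetical, and it assumes smaller images are scaled up to fit, which the text does not specify.

```python
def fit_within(width, height, max_w=640, max_h=480):
    """Scale (width, height) to fit inside max_w x max_h, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_w / width, max_h / height)
    return (max(1, round(width * scale)), max(1, round(height * scale)))


# A 16:9 landscape image is limited by width, a 3:4 portrait by height.
print(fit_within(1920, 1080))
print(fit_within(3000, 4000))
```

Taking the minimum of the two ratios guarantees that neither output dimension exceeds the box, so most outputs end up smaller than 640x480 in one direction.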

FastStone Image Viewer 4.9

Web Benchmarks

General usability is a big factor in the user experience, especially as we move into the HTML5 era of web browsing. For our web benchmarks, we take four well known tests with Chrome 35 as a consistent browser.

Sunspider 1.0.2

Sunspider 1.0.2

Mozilla Kraken 1.1

Kraken 1.1

WebXPRT

WebXPRT

Google Octane v2

Google Octane v2


27 Comments


  • TiGr1982 - Wednesday, October 15, 2014

    This new workstation CPU, Xeon E5-2687W v3, as we see, is intended for multithreaded software.

    There are actually workstation CPUs better fitting for singlethreaded software: these are Xeon E3, e.g., Xeon E3-1286 v3 (3.7/4.1 GHz) and slower and cheaper models below it.
    These are essentially "professionalized" Core i7s for LGA1150.
    Being the same silicon as Core i7s for LGA1150, these E3s have their own downsides, however: only 32 GB of RAM and only 8 MB of L3 cache.

    And the really fastest in single threaded tasks is Core i7-4790K at 4.0/4.4 GHz, but it lacks ECC memory support.
  • hrrmph - Tuesday, October 14, 2014

    I would like to encourage Ian and AT in general to continue to split the coverage (as they have been doing recently) for dual-socketed platforms into the "low-end" enthusiast / workstation segment, and the "high-end" more heavy-duty server / enterprise segment.

    Ian's recent articles hitting this from the "low-end" enthusiast / workstation angle have been really helpful to me, even though I've already been part-time "playing" with dual-socketed systems for some time, both as an educational exercise and a personal curiosity endeavor.

    In particular, the effects of NUMA aware software on dual-socketed system performance are of great interest.

    I've also noticed a lot of negative feedback to Ian's articles that I think is unwarranted. It's mostly from folks who want Ian to do more complex testing of more complex tasks that are primarily enterprise related. That's all well and good, but as I understand it, that is the job of the "other half" of AT to do.

    Ian and AT doing dual-socketed articles on "low-end" Windows builds is exactly what we need to help people know whether or not they would like to "step-up" from X99-E. It also is helpful so that folks know what they are really getting into if they go the dual-socketed route. As Ian pointed out in recent articles there are still some things that X99-E will do better and going into dual-socketed computing all "starry-eyed" isn't necessarily the best way to approach it.

    If there is anything that AT could use, it's actually even more comparative testing of X99 Haswell-E versus the C6xx Haswell-EP from a Windows workstation user's perspective. It would be great to see which taskings favored which platform in actual testing.

    Everyone has an opinion, but actually doing it is the best way to demonstrate what works and what doesn't.
  • mapesdhs - Thursday, October 16, 2014


    Entirely agree! Good summary.

    Btw, disappointing to see the threaded CB R15 result for the 2687W is only 30% better
    than an oc'd 3930K (mine @ 4.7 gives 1221). Does confirm that to really best a 1-socket
    oc'd i7, one really has to move to a multi-socket platform, and then of course it boils down
    to whether the sw is written to match (eg. is Handbrake written as well as it could?)

    Ian.

    PS. I hasten to add, I'm a different Ian. :D
  • SanX - Tuesday, October 14, 2014

    "And remember this rule Pinnochio for the rest of your life -- two processors with the factor of 1.5 difference are equal"
  • colonelclaw - Wednesday, October 15, 2014

    Any chance you could include V-Ray in future benchmarks? It's multi-application and multi-platform and very popular in the CGI world.
  • mapesdhs - Thursday, October 16, 2014

    And of course c-ray, which scales extremely well with multiple cores.

    Ian.
  • otherwise - Monday, November 17, 2014

    In the future, is there any chance you can add a benchmark that stresses single-threaded integer performance? I'd love to see how much Int performance has changed from generation to generation, but most sites (including this one) seem to focus on FP performance.
