Intel's newest Quad Xeon MP versus HP's DL585 Quad Opteron
by Johan De Gelas on November 10, 2006 12:00 PM EST
Posted in IT Computing
Analyses: the Xeon MP and Opteron Server
A CPU is only one aspect of choosing a server; at the end of the day, it is the server you can afford that decides which platform you pick. The 4U Intel SR4850HW4 isn't very different from the SR6850HW4, so we can compare our Xeon MP test machine to the HP Opteron server.
Server Feature Comparison
Feature | SR6850HW4 (Intel SR4850HW4) | HP DL585 Model 2006 |
Hardware | | |
CPU | 4x Intel Xeon 70xx and 71xx | Opteron 8xx |
Fastest CPU | Xeon MP 3.4 GHz / 16 MB L3 | Opteron 885 2.6 GHz |
Max Mem Capacity | 64 GB DDR2-400 FB-DIMMs (16 x 4 GB) | 128 GB DDR266 / 32 GB DDR400 |
Mem Type | ECC DDR2-400 | DDR400/333/266 |
Chipset | E8501 | AMD 8000 chipset |
RAS | | |
ECC Memory | Yes | Yes |
Memory RAID | Yes | No |
Hot plug memory | Yes | No |
Memory Sparing | Yes | No |
Memory Mirroring | Yes | No |
Hotswappable PCI | Yes on PCI-X 133 and PCIe | No |
Hotswappable Fans | 6 (4) | 8 |
Hotswappable PSU | Yes, 1+1 | Yes, 1+1 |
Integrated Onboard | | |
Video Chip | ATI RADEON 7000 VGA PCI | ATI Rage XL |
Video RAM | 16 MB SDRAM | 8 MB SDRAM |
Max. Resolution | 1600x1200 | 1280x1024 |
PCIe x16/x8 | 0/1 | 0/0 |
PCIe x4/x1 | 4/0 | 0/0 |
PCI-X (133/100) | 1/2 | 2/6 |
PCI | 0 | 0 |
USB Front | 3 | 0 |
USB Rear | 2 | 2 |
LAN | Intel Dual Gigabit | NC7782 Dual PCI-X Gigabit |
Server management | Intel Server management | HP iLO |
Serial Ports | 1 | 1 |
Storage | | |
Controller | LSI Logic LSI53C1030 | HP Smart Array 5i Plus Ultra 3 |
Cache | Optional | 64 MB BBU |
Interface | Dual-Channel Ultra320 SCSI SCA | Dual-Channel Ultra320 SCSI SCA |
Disks | 10 (5) | 4 |
RAID | 0, 1, 1E | 0, 1, 1+0, 5 |
5.25" bays | 2 | 1 |
Dimensions & Power | | |
Form Factor | 6U (4U) | 4U |
Weight (kg) | 60 (40) | 30 |
Power Supply | 2x 1570W | 2x 870W |
URL | SR6850HW4 | HP DL585 2006 |
The Xeon MP offers much more in the way of RAS features than the Opteron machine. The HP DL585 also has a few shortcomings: it does not offer any PCIe expansion slots, the SCSI controller is an old SCSI 160 model, and there are no USB ports on the front of the machine. Being able to quickly load some network drivers from a USB stick is very convenient compared to tinkering in the back of your rack.
However, the HP is the winner for memory-intensive HPC applications: it can use DDR1-400 DIMMs, which are quite a bit faster than the DDR2-400 FB-DIMMs Intel uses. We were disappointed that neither 4U design offers more than 4-5 disk bays. If you are a medium-sized enterprise with only one or a few heavy-duty database applications, you can save a lot of money if you don't have to buy an external storage rack. With a RAID-1 setup for the operating system and programs, you only have two disks left to install your database on a second RAID-1 array. Both the HP DL585 and the Intel SR4850HW4 basically force you to invest in an external storage rack in this case. Some 3U solutions like Supermicro's offer 16(!) disk bays and might be a better fit for a compute-intensive transactional database. The HP and Intel machines are better suited as HPC machines or as hosts for a SAN storage rack housing a massive database/ERP system.
To make a fair comparison between the Xeon MP and AMD Opteron 8xx platforms, we decided to compare the costs of similar HP Xeon MP and HP Opteron machines, configuring them as similarly as possible.
Price Comparison
Server | HP ProLiant DL580 G4 3.20GHz | HP ProLiant DL585 G2 2.4GHz - Rack Server |
CPUs | 4x Intel Xeon MP 7130M | 4x AMD Opteron 8216 DC |
Memory | 4 memory boards x 2 x 1 GB DDR2-400 | 8x 1 GB DDR2-667 |
Storage | HP Smart Array P400/256 PCIe Controller | HP Smart Array P400/512 Controller with battery |
NIC | HP Dual embedded NC371i Gigabit | HP Dual embedded NC371i Gigabit |
PSU | Dual 910/1300W power supplies | Dual 910/1300W hot plug power supplies |
DVD | SlimLine DVD-ROM Drive (8x/24x) Option Kit | SlimLine DVD-ROM Drive (8x/24x) Option Kit |
Price | $15,343 | $13,184 |
The price disadvantage of the Xeon MP is more than $2000, which is not huge but still tangible. It comes from paying an extra $400 per Xeon CPU and $1000 for two extra memory boards. It is possible to save that $1000 by getting only two memory boards, but that is not advisable. As 4GB DIMMs are extremely expensive, it means that you limit your server to 16GB (8x2GB) and that you cannot use the more advanced RAS features such as memory mirroring.
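To make the price arithmetic explicit, here is a minimal sketch that recomputes the gap from the two list prices in the table and compares it with the per-component premiums quoted in the text. The variable and function names are ours and purely illustrative. Note that the quoted premiums add up to more than the measured gap, so other line items in the two configurations presumably offset part of the difference.

```python
# List prices taken from the price comparison table (USD).
XEON_MP_SYSTEM = 15_343   # HP ProLiant DL580 G4, 4x Xeon MP 7130M
OPTERON_SYSTEM = 13_184   # HP ProLiant DL585 G2, 4x Opteron 8216

# Per-component premiums quoted in the text.
CPU_PREMIUM = 400             # extra cost per Xeon CPU
CPU_COUNT = 4
MEMORY_BOARD_PREMIUM = 1_000  # two extra memory boards


def price_gap(xeon: int, opteron: int) -> tuple[int, float]:
    """Return the gap in dollars and as a percentage of the Opteron price."""
    gap = xeon - opteron
    return gap, 100.0 * gap / opteron


gap, gap_pct = price_gap(XEON_MP_SYSTEM, OPTERON_SYSTEM)
component_premium = CPU_PREMIUM * CPU_COUNT + MEMORY_BOARD_PREMIUM

print(f"Gap between list prices: ${gap} ({gap_pct:.1f}%)")
print(f"Sum of quoted CPU and memory board premiums: ${component_premium}")
```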
Power
How much power can we save by choosing the 95W TDP Opteron over the 150W TDP Xeon MP? We tested all machines with only one power supply running. DBS and PowerNow! were not enabled.
Power Requirements
System | Configuration | Max / Idle Power Usage (100% / <1% CPU load, W) |
HP DL585 | 4 CPUs - 16 GB RAM | 657 / 520 |
Intel Xeon MP 7130M | 4 CPUs - 16 GB RAM | 885 / 460 (620 with all fans running) |
Both machines use huge, fast-turning fans which consume a lot of energy. To give you an idea of what this means: while idling, the power consumption of the Xeon MP machine fluctuated between 460W and 620W. The 620W figure was measured when all the fans were turned on, while the 460W result was measured when the fans were silent. The HP DL585 did not use this on/off fan system, and consumed 520W while running idle. At 100% load, running SPECjbb2005, the Xeon MP consumed about 230W more than the Opteron machine (885W versus 657W). For your information, our Supermicro system consumed 310W with 4 GB and about 360W with 12 GB of RAM.
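As a rough illustration of what these wall-power readings mean, the sketch below computes the load and idle differences between the two machines and converts the load difference into a yearly energy cost. The electricity price is an assumption picked only for the example, it ignores cooling overhead, and the names are ours; the wattages come from the power table.

```python
# Wall-power readings from the power table, in watts.
OPTERON_LOAD, OPTERON_IDLE = 657, 520
XEON_LOAD = 885
XEON_IDLE_FANS_QUIET, XEON_IDLE_FANS_ON = 460, 620

HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.10  # ASSUMPTION: illustrative price in USD, not from the article


def annual_cost(delta_watts: float, price_per_kwh: float = PRICE_PER_KWH) -> float:
    """Yearly cost in USD of drawing delta_watts more, 24 hours a day."""
    return delta_watts / 1000 * HOURS_PER_YEAR * price_per_kwh


# Positive numbers mean the Xeon MP machine draws more than the DL585.
load_delta = XEON_LOAD - OPTERON_LOAD
idle_delta_fans_quiet = XEON_IDLE_FANS_QUIET - OPTERON_IDLE
idle_delta_fans_on = XEON_IDLE_FANS_ON - OPTERON_IDLE

print(f"Load delta: {load_delta} W")
print(f"Idle delta: {idle_delta_fans_quiet} W (fans quiet) to {idle_delta_fans_on} W (fans on)")
print(f"Extra cost at constant full load: ${annual_cost(load_delta):.2f} per year")
```

At idle the picture depends on the fan state: the Xeon MP machine can draw less than the DL585 when its fans are quiet and more when they are all running.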
Conclusion so far
Yes, our testing is not done. We still have to test other databases, and we are running benchmarks with BEA's JVM while you are reading this. Those benchmarks will be presented in our review of Clovertown, Intel's new quad-core server CPU. In this review we focused a little more on the actual servers. So what can we conclude so far?
The Xeon 7140MP "Tulsa" is nothing less than a massive improvement over the previous Xeon 7041: it consumes less power, performs a lot better (see the SPEC int/fp numbers) and is much less expensive. The new Xeon MP needs fewer optimizations than the Opteron to perform well in Java applications. And if we look at our preliminary BEA JRockit numbers, with a highly optimized JVM it performs better than the quad Opteron in applications with a big memory footprint (like SPECjbb2005), thanks to its massive L3 cache. In applications where the large L3 cache doesn't play a big role, the relatively poor server performance of the "NetBurst" architecture becomes visible again: our MySQL benchmark runs a lot better on the AMD Opteron and on Intel's newest Core architecture Xeons. Power consumption is still rather high though: the HP Opteron server consumed about 230W less at full load (657W versus 885W).
In a nutshell, the new Xeon MP will have a hard time convincing people who are leaning towards an Opteron server or want the best performance/watt. On the other hand, the decent performance and superior RAS features will keep customers who desire high availability in the Intel camp, whereas the previous Xeon MP was such a poor performer that many people had no other choice than the AMD Opteron in the quad socket market.
When "High-end RAS" is less important, the excellent performance of the Xeon 5160-based Supermicro 6015 server shows how much potential the Xeon DP "Clovertown" has. Clovertown is nothing more than two Xeon DP 51xx dies in one package, but it could give our quad monsters a hard time. You will find out more very soon....
88 Comments
duploxxx - Monday, November 13, 2006 - link
It's nice to say that the new Intel systems have the RAS support and the AMD one not, however keep in mind that you are using an old Opteron socket (you can say you have the latest revision 2006).
AMD's Opteron 800/200-series (1207-pin, Socket F). The 1207-pin Socket F "Santa Rosa" core AMD Opteron CPU features DDR-2 memory support and Virtualization technology, in addition to Memory RAS security.
Slappi - Sunday, November 12, 2006 - link
Please sell your Intel stock and then rewrite the article please.
Thank you,
Slappi
LuxFestinus - Tuesday, November 14, 2006 - link
Taken from Scientia's post here: http://www.amdzone.com/index.php?name=PNphpBB2&... (AMD Forum Board)
Kiijibari - Sunday, November 12, 2006 - link
They are misleading as it is unclear what you mean with "mem bandwidth".
Is it FSB bandwidth? System memory bandwidth? CPU bandwidth ...?
It is correct that Intel can deliver 21 GB/s from the memory, however one CPU so far can "just" handle ~11 GB/s. So why should 1 Xeon DP have a memory bandwidth of 21 GB/s? That statement is not valid if you limit it to one CPU.
Obviously, you meant the System memory bandwidth, but then I really wonder about your Opteron Socket-F numbers ...
First it would be only fair to write the system bandwidth for a 2P System (or whatever compares to the Intel configuration), too. This would then be ~21 GB/s, too, for a 2P Opteron System, 42 GB/s for a Quad System.
Then I wonder how you calculate that 8.5 GB/s mentioned with the Socket-F Opterons.
As far as I know, these chips support DDR2-667 and that means 10.6 GB/s, not 8.5. Please be fair and correct at least that obvious error ...
cheers
Kiijibari
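The figures in the comment above follow from the usual theoretical peak formula: transfer rate times bus width times number of channels. The sketch below reproduces them, assuming a dual-channel, 64-bit (8-byte) memory interface per Opteron socket and decimal gigabytes; the function name is ours. As a purely arithmetic observation, 8.5 GB/s is what the same formula gives for a 533 MT/s transfer rate, which says nothing about what the article intended.

```python
def peak_bandwidth_gbs(transfers_mt_s: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    ASSUMPTION: dual-channel, 64-bit (8-byte) interface per socket by default.
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9


per_socket_667 = peak_bandwidth_gbs(667)   # DDR2-667, one socket
per_socket_533 = peak_bandwidth_gbs(533)   # what a 533 MT/s rate would give

print(f"DDR2-667, one socket:   {per_socket_667:.1f} GB/s")
print(f"DDR2-667, two sockets:  {2 * per_socket_667:.1f} GB/s")
print(f"DDR2-667, four sockets: {4 * per_socket_667:.1f} GB/s")
print(f"533 MT/s, one socket:   {per_socket_533:.1f} GB/s")
```

These are theoretical peaks; sustained bandwidth measured by a benchmark will be lower.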
spaceoddity - Saturday, November 11, 2006 - link
Hi Johan,
Thank you very much for doing some Linux benchmarks. They are not easy to come by. There are virtually no Linux benchmarks for desktops (perhaps understandable, but frustrating for us Linux users), but for servers, which is where Linux has a sizeable presence, they are always welcome. I hope Anandtech continues to provide good Linux/UNIX benchmarks, and doesn't abandon them for more windoze benchmarks, which are everywhere anyway.
Cheers!
JohanAnandtech - Sunday, November 12, 2006 - link
Well I firmly believe the marketshare of linux servers can only grow and that therefore linux benchmarking will only get more important. A colleague of mine pointed out that Novell has launched e-directory: a very solid alternative to MS Small business server with the same functionality and ease of use, but much cheaper per connection, and with the ability to grow with the enterprise.
It is just yet another reason why Linux on servers is so attractive, besides much lower cost and much more control over your own IT infrastructure.
Justin Case - Saturday, November 11, 2006 - link
In the OpenSSL 1024-bit signs, the quad Opteron has an almost 40% advantage over the Xeon when using 8 threads (in fact, that advantage rises to more than 90% when using optimized binaries), and is still the best of the bunch at 16 threads (32 wasn't tested), and yet the article text completely fails to mention this.
It mentions the point where the (more expensive and more power-hungry) Xeon has its biggest advantage (4 threads, with a whopping 9% advantage over the Opteron), and the point where the Sun server (even more expensive) has its biggest advantage (32 threads, but the Opteron wins if using optimized binaries), but completely ignores the Opteron's trouncing of all the competition at 8 and 16 threads, and the fact that the Xeon 5160 cannot scale past its 4-thread performance at all.
http://images.anandtech.com/reviews/it/2006/tulsa-...
So the fact is the Opteron can handle a load of 10000 signs per second (over 12000 with optimized binaries), while the Xeon can't even reach 6000 (6200 with optimized binaries).
http://images.anandtech.com/reviews/it/2006/woodcr...
And yet, according to the article, "the Opteron no longer beats the Xeon". Huh? A 40% advantage isn't enough to win? Who compared the scores, Diebold?
So what if the Xeon performs better when you cripple the Opteron by reducing the number of threads? In any real-world situation, the server admin is going to use the number of threads that delivers the best performance (and is going to use the optimized binaries, of course, if he's competent). Just because the Xeon tops out at 4 threads doesn't mean the (better) results delivered by the Opteron should be discarded.
If this was a "normal" AnandTech article, I wouldn't be surprised by the bias and "selective reporting", but I never expected Johan to "toe the party line" like this.
JohanAnandtech - Sunday, November 12, 2006 - link
From the article:
Yeah, I am really doing Intel a favor here, pointing out one of the weaknesses of their core architecture and showing yet another very weak point of Netburst.
Again from the article:
More than one thread per core doesn't give any performance advantage (unless you have a multithreaded CPU), so of course a Dual Xeon 5160 doesn't scale beyond 4 threads, just like a Dual Opteron. As OpenSSL scales almost perfectly, the important thing here is performance/core, as you don't want to pay for a multi-socket machine if you don't have to.
You should definitely read more carefully. "Selective reporting" would not include the MySQL, Power consumption or even the NUMA specjbb results as they are favorable for the Opteron.
Justin Case - Monday, November 13, 2006 - link
The fact is, the quad Opteron box reviewed (DL585) _can_ sustain higher performance than the Xeon 5160 (close to 90% higher, using optimized binaries), correct? So, unless the Opteron box costs twice as much as the 5160 box (identically supported and configured, apart from the CPUs / MB), it delivers more bang for the buck.
Is this a server test or a CPU core test? It's filed under "IT / Computing", not under "CPU / Chipset", so I have to assume it's supposed to be the former.
So what if one server has twice (or 100 times) as many cores as the other? You might as well argue that the servers must be compared at the same clock speed, with the same amount of on-die cache, or with the same type of memory. All those things might be relevant when comparing CPU architectures (then again...), but not when you're comparing complete systems. The whole point of a server comparison is to see what kind of performance you get for the price. If one server is 70% more expensive but 80% faster, it's still a better deal for people who need the extra performance. That extra performance can be due to a higher clock speed, more CPUs, more cores per CPU, better memory bandwidth, a dedicated coprocessor, magic imps, whatever. But it doesn't make any sense to "compensate" for those variables (or for one of those variables) and ignore the fact that server X can and does deliver better performance than server Y when both make full use of their resources.
At 4 threads, the 5160 is the fastest system of those tested. So what if it has a 20% clock speed advantage? It's still the fastest, right? You're not going to artificially cripple its clock speed to match the others; doing that wouldn't make any sense (because, in the real world, no buyer / server admin would do that). So why cripple the other systems by limiting the number of threads they are running? In that test (with unoptimized binaries), the Sun box reaches the highest performance, period. With optimized binaries, the Opteron box manages to pull slightly ahead. Of course, then you have to take price into account, and maybe for a lot of people the 5160-based server will be the better deal, but you can't say it performs better when, objectively, it does not.
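The "bang for the buck" argument in this comment boils down to a performance-per-price ratio. A minimal sketch, using only the hypothetical percentages the commenter gives (they are not measured prices) and a function name of our own:

```python
def perf_per_price_ratio(speedup: float, price_ratio: float) -> float:
    """Performance per dollar of server X relative to server Y.

    speedup:     X's throughput divided by Y's (1.8 means 80% faster)
    price_ratio: X's price divided by Y's (1.7 means 70% more expensive)
    A result above 1.0 means X delivers more performance per dollar.
    """
    return speedup / price_ratio


# "70% more expensive but 80% faster" from the comment.
example = perf_per_price_ratio(1.80, 1.70)

# "close to 90% higher" performance at "twice" the price, the break-even
# case the commenter mentions for the Opteron box versus the 5160 box.
at_double_price = perf_per_price_ratio(1.90, 2.00)

print(f"80% faster at 70% higher price: {example:.3f}")
print(f"90% faster at double the price: {at_double_price:.3f}")
```

A ratio only slightly above 1.0 means the faster box wins on performance per dollar by a narrow margin; whether that matters depends on whether the extra absolute performance is needed.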
nah - Saturday, November 11, 2006 - link
Great job Johan---as always