Intel's newest Quad Xeon MP versus HP's DL585 Quad Opteron
by Johan De Gelas on November 10, 2006 12:00 PM EST
Posted in IT Computing
MySQL Configuration
As our loyal readers know from our previous MySQL adventures, MySQL is a highly tweakable database that scales rather poorly. Most workloads scale well from one to two cores, but scaling from two to four cores is mediocre, and in the "SELECT intensive" workload that we benchmark it is even negative. This has surprised quite a few people, but the InnoDB team is well aware of the issue, and it will be resolved in one of the next releases of InnoDB. Until then, we compiled version 5.0.26 with Peter Zaitsev's mutex patch, which gives much better scaling and performance: scaling is no longer negative, and we saw a 20% to 40% increase going from two to four cores. However, our workload still doesn't scale beyond four cores, so we tested every platform with two CPUs, or four cores in total. That way we at least get an impression of how the different server CPUs compare.
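To put the 20% to 40% gain in perspective, it helps to compare it with ideal linear scaling, where doubling the cores would double the throughput. The following is only a small illustrative sketch; the function name `scaling_efficiency` is our own and not part of any benchmark tool.

```python
def scaling_efficiency(speedup: float, core_ratio: float) -> float:
    """Return the fraction of ideal (linear) scaling that was achieved.

    speedup    -- measured throughput ratio, e.g. 1.2 for a 20% gain
    core_ratio -- ratio of core counts, e.g. 4 / 2 = 2.0
    """
    return speedup / core_ratio


# Gains reported in the text when going from two to four cores.
for gain in (0.20, 0.40):
    efficiency = scaling_efficiency(1.0 + gain, 4 / 2)
    print(f"+{gain:.0%} throughput -> {efficiency:.0%} of linear scaling")
```

In other words, even with the patch the four-core setup delivers roughly 60% to 70% of what perfect scaling would give.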
All testing was thus done with InnoDB as our storage engine in MySQL 5.0.26. We optimized for a server with 4GB of RAM. Here is our MySQL configuration:
MySQL Configuration | |
default-storage-engine | InnoDB |
skip-external-locking | |
skip-locking | |
key_buffer | 256M |
. | |
table_cache | 64 |
max_allowed_packet | 1M |
thread_stack | 128K |
. | |
sort_buffer_size | 2M |
read_buffer_size | 2M |
innodb_buffer_pool_size | 1G |
. | |
thread_concurrency | 16 |
innodb_thread_concurrency | 16 |
innodb_additional_mem_pool_size | 8MB |
read_rnd_buffer_size | 8MB |
thread_cache | 64 |
max_heap_table | 256MB |
tmp_table | 128MB |
. | |
innodb_log_file_size | 250MB |
innodb_table_locks | 0 |
innodb_flush_log_at_trx_commit | 0 |
max_user_connections | 2000 |
max_connections | 2000 |
The "query cache" was off, as we wanted to test worst case performance. Our test database is still the same ~1GB database. More than 90% of the queries are selects, so this is a mostly "read intensive" workload.
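For readers who want to reproduce this kind of measurement, the sketch below shows one way to count queries per second at a given number of concurrent clients. It is not the harness used for this article: `measure_qps`, `fake_query` and the concurrency levels are assumptions made for illustration, and `fake_query` is only a placeholder for a real SELECT against the test database.

```python
import threading
import time
from typing import Callable


def measure_qps(run_query: Callable[[], None], concurrency: int,
                duration_s: float = 1.0) -> float:
    """Call run_query from `concurrency` threads for about duration_s
    seconds and return the total number of queries per second."""
    counts = [0] * concurrency
    start_gate = threading.Event()
    shared = {}

    def worker(idx: int) -> None:
        start_gate.wait()              # all workers start together
        deadline = shared["deadline"]
        done = 0
        while time.perf_counter() < deadline:
            run_query()
            done += 1
        counts[idx] = done

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(concurrency)]
    for t in threads:
        t.start()
    begin = time.perf_counter()
    shared["deadline"] = begin + duration_s
    start_gate.set()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - begin
    return sum(counts) / elapsed


def fake_query() -> None:
    """Placeholder for a real SELECT; it only sleeps for about 1 ms."""
    time.sleep(0.001)


if __name__ == "__main__":
    # Hypothetical concurrency levels, not the ones used in the charts.
    for clients in (1, 5, 10, 25):
        print(clients, round(measure_qps(fake_query, clients, 0.2)))
```

A real test would replace `fake_query` with a function that sends a query from the workload over its own database connection.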
MySQL results
All numbers are expressed in queries per second (Y-axis), and the X-axis shows the number of concurrent accesses.
On average, the Xeon DP 5160 is about 22% faster than the Opteron. That means that the Opteron is, clock for clock, about as fast as the Xeon 5160, which is not bad news for AMD at all, although Woodcrest currently has the raw clock speed advantage. Considering that the HP DL585 can only use DDR-333 with the Opteron 880, the picture might get even better with the DL885, which can use DDR-400.
There is little doubt that MySQL is not the favorite application of the Xeon MP: the Opteron 880 beats the Xeon MP by 20% to 30%. We have seen this before, as the Opteron has always outrun "NetBurst" based CPUs in MySQL. The good news for Intel is that the new Core architecture is no less than 52% faster in MySQL when we compare the 3 GHz Xeon DP with the 3.2 GHz Xeon MP.
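The clock-for-clock reasoning in the two comparisons above can be checked with a little arithmetic. This is only a sketch: the helper `per_clock_ratio` is our own name, and the 2.4 GHz clock of the Opteron 880 is not stated in the text here, so treat it as an assumption.

```python
def per_clock_ratio(speedup: float, clock_a_ghz: float,
                    clock_b_ghz: float) -> float:
    """Performance of CPU A relative to CPU B after normalizing for clock.

    speedup is A's measured throughput divided by B's.
    """
    return speedup * clock_b_ghz / clock_a_ghz


# Xeon DP 5160 (3.0 GHz) vs. Opteron 880 (2.4 GHz, assumed): 22% faster.
xeon_vs_opteron = per_clock_ratio(1.22, 3.0, 2.4)

# 3.0 GHz Xeon DP vs. 3.2 GHz Xeon MP: 52% faster.
core_vs_netburst = per_clock_ratio(1.52, 3.0, 3.2)

print(f"Xeon 5160 per clock vs. Opteron 880: {xeon_vs_opteron:.2f}x")
print(f"Xeon DP per clock vs. Xeon MP:       {core_vs_netburst:.2f}x")
```

A ratio close to 1.0 in the first case is what "clock for clock as fast" means, while the second ratio shows that the Core architecture's lead over the Xeon MP grows once the lower clock speed is taken into account.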
We also noted something strange: the Xeon MP performs better with hardware prefetch disabled. Below you can see our findings. All numbers are expressed in queries per second served by the server (Y-axis), and the X-axis shows the number of concurrent accesses.
Hardware prefetch lowers performance by about 1% to 4%, while Hyper-Threading allows the Xeon MP to make better use of its potential and increases performance by 7% to 9% at the higher concurrencies.
JohanAnandtech - Saturday, November 11, 2006 - link
Well, we did mention it in our price comparison. From a performance point of view, the G2 is within 2% of the DL585 given a similar configuration.
Getting a server in the lab is not like getting a video chip for review. The machines are much more expensive, and you need much more time to review them properly. So OEMs are less likely to send you the necessary hardware. For a video card they send out a $500 item that can be reviewed in a few weeks, maybe even a few days. For servers like these, they have to send out a $20000 machine and be able to miss it for a month or two at the least.
Viditor - Saturday, November 11, 2006 - link
I can certainly understand and empathise with the situation...and I did enjoy the article, Johan!
The reason I mentioned it is that line in your conclusion...
I thought that (considering the circumstances) it was a bit unfair and misleading...
JohanAnandtech - Saturday, November 11, 2006 - link
I just pointed out that it is a bit weird that a newer revision of the DL585 (it was thé HP Opteron machine just a few months ago) used SCSI 160. There is no reason at all why HP could not replace this: they revised the server anyway.
I should have mentioned that this was solved in the G2, but still it is a missed chance... even though I reported it a bit too late :-)
photoguy99 - Friday, November 10, 2006 - link
yes, bring it on!
finalfan - Friday, November 10, 2006 - link
On page The Official SPEC Numbers, in second table SPEC FP 2000 Performance, the positions of (4/8) HP Opteron AM2 and (8/8) Hitachi Itanium 2 should be switched. No Itanium runs at 3.4G and no way a 4way 1.6G AM2 can sit in second place.
JohanAnandtech - Friday, November 10, 2006 - link
Corrected. It is weird, the accurate numbers were in the original document. The generation of the table went wrong. I have double checked and now the FP numbers should all be accurate.
JarredWalton - Friday, November 10, 2006 - link
Probably my fault. I think when it got put into Excel that the various x/y numbers were converted to dates. I thought I fixed all of those, but probably missed one or two. Sorry.
icarus4586 - Friday, November 10, 2006 - link
This report brought to you by the department of redundancy department.
bwmccann - Friday, November 10, 2006 - link
When are you guys going to start benchmarking server CPUs using applications that are widely used in organizations on a daily basis?
Most companies have a very high percentage of servers running Windows. With that I would love to see some tests on SQL, Oracle, Exchange, and other core components of enterprises today.
Also it would be nice to see a closer comparison of the servers. For example you tested a DL585. A DL580 (Intel Woodcrest) would have been better suited since some of the components would be the same.
JohanAnandtech - Friday, November 10, 2006 - link
http://www.anandtech.com/IT/showdoc.aspx?i=2793
Most of the time Jason does the Windows benchmarking, me and my team do the Linux benchmarking.
Java, MySQL and SSL are also core components of many enterprise apps.
We are working on Oracle and got access to a real-world Oracle database a few weeks ago (for the first time), but it takes time to really understand what your benchmark is telling you and how you must configure your db. And Oracle is... very stubborn, even patching to a slightly higher version can lead to big trouble.
The DL585 is a direct competitor (quad socket) in this space, more so than the DL580 (dual socket).