S@H: 4x vs 8x Mac Pro Performance
Senior User
Join Date: Jun 2006
Location: Dublin, CA
1) E.T from Tellus, dual quad-core (@ 2.33 GHz) + 2 GB RAM + alexkan's "Core 2-optimized v8-prerelease-nographics". RAC ~7200:
http://www.boincstats.com/stats/boin...amp;id=2019771
2) Bad to the bone, dual dual-core (@ 3.0 GHz, I assume) + 2 GB RAM + alexkan's "Intel, Core 2-optimized v8-prerelease-nographics". RAC ~5000:
http://www.boincstats.com/stats/boin...amp;id=1720553
Everything is identical except that #1 has double the cores and #2 has the faster clock. So what's the deal here? I obviously did not expect 2x from #1, but I did expect better than only a 44% increase.
If I understand correctly, most of the work stays between the CPU and the L2. If so, I am wondering whether the lack of CPU affinity is making things worse, especially on the 8-way. Each pair of cores shares an L2. So with #1, you have only a 1 in 4 chance of having the data in the right L2, but with #2, a 1 in 2 chance.
Or do I just have overinflated expectations here?
Administrator 
Join Date: May 2000
Location: California
If you correct for the clock-speed difference, the picture looks much more normal.
1) E.T. from Tellus: (8-core) 2.33 GHz = 7200 RAC. Scaled to 3.0 GHz, that would be about 9270 RAC.
2) Bad to the bone: (4-core) 3.0 GHz = 5000 RAC.
That is an 85% improvement once the clock speed is normalized.
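For anyone who wants to rerun the numbers, here is a minimal sketch of that normalization in Python. It assumes RAC scales linearly with clock speed, which is only a rough approximation, and the function and variable names are made up for illustration.

```python
def normalize_rac(rac, clock_ghz, target_ghz):
    """Scale a RAC figure to a different clock speed.

    Assumes RAC is directly proportional to clock speed (a simplification).
    """
    return rac * target_ghz / clock_ghz


# Figures quoted in this thread
eight_core_rac, eight_core_ghz = 7200, 2.33   # E.T. from Tellus
four_core_rac, four_core_ghz = 5000, 3.0      # Bad to the bone

# Raw comparison, ignoring the clock difference
raw_gain = eight_core_rac / four_core_rac - 1

# Comparison with the 8-core machine scaled up to 3.0 GHz
normalized_rac = normalize_rac(eight_core_rac, eight_core_ghz, four_core_ghz)
normalized_gain = normalized_rac / four_core_rac - 1

# Fraction of the ideal 2x speedup actually achieved after normalizing
scaling_efficiency = (normalized_rac / four_core_rac) / 2

print(f"raw gain:            {raw_gain:.0%}")
print(f"normalized RAC:      {normalized_rac:.0f}")
print(f"normalized gain:     {normalized_gain:.0%}")
print(f"scaling efficiency:  {scaling_efficiency:.0%}")
```

Under that linear assumption, the doubled core count delivers a bit over nine tenths of the ideal 2x, so the shortfall from perfect scaling is much smaller than the raw 44% figure suggests.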
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Originally Posted by zombie67
Everything is identical except that #1 has double the cores and #2 has the faster clock. So what's the deal here? I obviously did not expect 2x from #1, but I did expect better than only a 44% increase.
If I understand correctly, most of the work stays between the CPU and the L2. If so, I am wondering whether the lack of CPU affinity is making things worse, especially on the 8-way. Each pair of cores shares an L2. So with #1, you have only a 1 in 4 chance of having the data in the right L2, but with #2, a 1 in 2 chance.
Or do I just have overinflated expectations here?
While a good portion of the work that SETI does on a single WU fits well in the L2 cache, a sizable portion of the WU does not, and that portion tends to bottleneck on the RAM's ability to keep the cores fed with data. These parts of the computation tend to be responsible for SETI's less-than-perfect scaling with the number of cores, and for the poor performance of Mac Pros with only two installed DIMMs.