Got my new G5. Dnetc is slow???
Dedicated MacNNer
Join Date: May 2001
Location: Front of my Intel iMac 20"
Status:
Offline
|
|
Got my dual 2.0 GHz second-generation G5. I am running the latest RC5-72 client on it and getting a max of 576 blocks in 24 hours.
With my previous dual 800 MHz G4 I used to get 380 in 24 hours. Shouldn't I be getting more than 1,000 blocks a day? What are others getting? Is the client not optimised or something?
Sorry if this was posted before. I tried to search but could not find anything.
Thanks
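For reference, here is a rough sketch of the arithmetic behind that expectation. It uses only the block counts and clock speeds quoted above; the idea that throughput scales linearly with clock speed is an assumption made purely for illustration, and the function names are made up.

```python
# Figures quoted in the post above. Linear scaling with clock speed is an
# assumption for illustration only; real CPUs rarely behave this way.
OLD_MHZ = 800                # dual 800 MHz G4
NEW_MHZ = 2000               # dual 2.0 GHz G5
OLD_BLOCKS_PER_DAY = 380
NEW_BLOCKS_PER_DAY = 576


def expected_blocks(old_blocks, old_mhz, new_mhz):
    """Blocks per day if throughput scaled purely with clock speed."""
    return old_blocks * new_mhz / old_mhz


def per_clock_ratio(old_blocks, old_mhz, new_blocks, new_mhz):
    """Work done per MHz on the new machine relative to the old one."""
    return (new_blocks / new_mhz) / (old_blocks / old_mhz)


expected = expected_blocks(OLD_BLOCKS_PER_DAY, OLD_MHZ, NEW_MHZ)
ratio = per_clock_ratio(OLD_BLOCKS_PER_DAY, OLD_MHZ, NEW_BLOCKS_PER_DAY, NEW_MHZ)

print(f"expected at linear scaling: {expected:.0f} blocks/day")
print(f"observed: {NEW_BLOCKS_PER_DAY} blocks/day")
print(f"per-clock throughput vs. the G4: {ratio:.2f}x")
```

In other words, the G5 is doing roughly 0.6 times as much RC5-72 work per clock cycle as the old G4 did, which is the gap the rest of the thread tries to explain.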
|
|
iMac Intel Core Duo 2.0 Ghz 20", 1.5 GB RAM, 250GB
iMac G5 2.0 Ghz 17", 512 MB RAM, 160GB
iPod Video 5G 60GB White
Mighty Mouse sucks - "Bought the Logitech 518 Gaming mouse"
USB 2.0 Hard Drive Sucked - "Bought a Firewire Hard Disk"
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: Sep 2001
Location: Arizona
Status:
Offline
|
|
DNETC is largely optimized for AltiVec, and the G4's AltiVec engine is a good bit faster than the G5's.
My Dual 1.40GHz G4 did around 530 per day, FYI.
|
|
I like chicken
I like liver
Meow Mix, Meow Mix
Please de-liv-er
|
| |
|
|
|
 |
|
 |
|
Dedicated MacNNer
Join Date: May 2001
Location: Front of my Intel iMac 20"
Status:
Offline
|
|
OK, that's fine. But can they optimise it for the G5 for a better result?
Just for information.
|
|
|
| |
|
|
|
 |
|
 |
|
Mac Elite
Join Date: Jul 2002
Location: Syracuse
Status:
Offline
|
|
Lat, what happened to your 1.47 G4?
|

Imac Core Duo 1.83/1.5 GB/20 inch cinema, ibook G4 1 ghz
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: Sep 2001
Location: Arizona
Status:
Offline
|
|
Originally posted by addiecool:
OK, that's fine. But can they optimise it for the G5 for a better result?
Just for information.
There's not really anything to optimize for.
|
|
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: Sep 2001
Location: Arizona
Status:
Offline
|
|
Originally posted by Weezer:
Lat, what happened to your 1.47 G4?
It's still here. For the moment, I think I have decided to upgrade the Cube. But I may change my mind back to upgrading the Digital Audio by the time the upgrades ship.
|
|
|
| |
|
|
|
 |
|
 |
|
Senior User
Join Date: Feb 2001
Location: Orange County, California
Status:
Offline
|
|
Originally posted by addiecool:
OK, that's fine. But can they optimise it for the G5 for a better result?
I've talked with kakace, the author of a good number of the PowerPC cores for dnetc, including the G4 AltiVec core. Basically, he said that RC5 is an algorithm optimized for 32-bit processors, and that to effectively "convert" the data to a native 64-bit digestible form it would take more processor cycles than it would be worth, resulting in slower numbers than the G4 AltiVec core running on the G5 as it is.
This is a very dumbed down explanation, as I am not an assembly programmer. I believe that that explanation is the gist of it.
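To make the "more cycles than it would be worth" point a bit more concrete, here is a minimal sketch in Python. It is not dnetc code and not kakace's analysis; it only assumes the well-known fact that RC5 is built from 32-bit additions and rotations, and all function names are made up. It packs two 32-bit values into one 64-bit word and shows that a plain 64-bit add or rotate gives the wrong per-lane answer, while a lane-correct add needs extra masking operations.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
LANE_TOP_BITS = 0x8000000080000000   # top bit of each 32-bit lane
LANE_LOW_BITS = 0x7FFFFFFF7FFFFFFF   # everything except those top bits


def pack(hi, lo):
    """Put two 32-bit values side by side in one 64-bit word."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def unpack(word):
    """Split a 64-bit word back into its (hi, lo) 32-bit lanes."""
    return (word >> 32) & MASK32, word & MASK32


def add_naive(a, b):
    """One plain 64-bit add: a carry out of the low lane leaks into the high lane."""
    return (a + b) & MASK64


def add_per_lane(a, b):
    """Lane-correct add using only 64-bit ops; note the extra masking work."""
    partial = (a & LANE_LOW_BITS) + (b & LANE_LOW_BITS)  # carries stay inside each lane
    return partial ^ ((a ^ b) & LANE_TOP_BITS)           # patch each lane's top bit back in


def rotl32(x, n):
    """32-bit rotate left, the operation RC5 actually needs."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def rotl64(x, n):
    """64-bit rotate left, the operation a 64-bit integer unit offers."""
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64


a = pack(1, 0xFFFFFFFF)
b = pack(2, 1)
print("naive 64-bit add:", unpack(add_naive(a, b)))     # low-lane carry bumps the high lane
print("per-lane add    :", unpack(add_per_lane(a, b)))

w = pack(0x80000000, 0x00000001)
print("rotl32 per lane :", (rotl32(0x80000000, 1), rotl32(0x00000001, 1)))
print("rotl64 on pair  :", unpack(rotl64(w, 1)))        # bits wander across the lane boundary
```

The add can be repaired at the cost of several extra operations per addition. The rotate is worse, because RC5's rotation amounts depend on the data, so the two lanes would usually need different rotation counts at the same time.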
|
|
The Bighead
- MacBook Pro 15" Matte non-unibody 2.6 GHz, 4GB RAM, 120/SSD & 1TB/5400
- PM G4 Dual 1.25 GHz, 2 GB RAM, 1x1TB Boot - 1x2TB TM Backup - 2x3TB Archive/Backup
|
| |
|
|
|
 |
|
 |
|
Mac Elite
Join Date: Aug 2001
Status:
Offline
|
|
Originally posted by bighead:
I've talked with kakace, the author of a good number of the PowerPC cores for dnetc, including the G4 AltiVec core. Basically, he said that RC5 is an algorithm optimized for 32-bit processors, and that to effectively "convert" the data to a native 64-bit digestible form it would take more processor cycles than it would be worth, resulting in slower numbers than the G4 AltiVec core running on the G5 as it is.
This is a very dumbed down explanation, as I am not an assembly programmer. I believe that that explanation is the gist of it.
That doesn't make much sense to me. Altivec, even on the G5, can't handle 64 bit numbers, so I can't see why one would convert the data to 64 bit. If I had to guess, one of two things is happening: a) the dispatch restrictions on the G5's altivec unit (identical to an original G4) are slowing it down, or b) it uses G4-specific streaming instructions that can cause slowdowns on the G5. In general, the Altivec unit on the G4 is slightly superior, but the G5 has a higher clock frequency and much higher bandwidth, so it should win anyway. From what I've read (mostly on ars), Altivec code is typically memory bandwidth bound, although this may be an exception.
|
|
|
| |
|
|
|
 |
|
 |
|
Senior User
Join Date: Feb 2001
Location: Orange County, California
Status:
Offline
|
|
Originally posted by Catfish_Man:
That doesn't make much sense to me. Altivec, even on the G5, can't handle 64 bit numbers, so I can't see why one would convert the data to 64 bit. If I had to guess, one of two things is happening: a) the dispatch restrictions on the G5's altivec unit (identical to an original G4) are slowing it down, or b) it uses G4-specific streaming instructions that can cause slowdowns on the G5. In general, the Altivec unit on the G4 is slightly superior, but the G5 has a higher clock frequency and much higher bandwidth, so it should win anyway. From what I've read (mostly on ars), Altivec code is typically memory bandwidth bound, although this may be an exception.
Here's a convo I had with him:
Code:
ItsIllak suggests a double speed core by using 2x32bit numbers side by side in a 64. But
I suck at asm/maths, so that's probably just dumb ;)
kakace: lol
bighead: i was thinking the same thing, but i figured someone would tell me to sit in the
corner with a dunce cap
kakace: ItsIllak: You're quite close to the solution
ertyu-: that requires hardware support, or lots of extra ops which make it useless
ItsIllak: I'd assume overflows would be a problem though...
kakace: but it's much better to write a 3-stages pipelined core using 2 Altivec stages and
1 integer stages interleaved
ItsIllak: kakace: sorry, I know your not new here, maybe you just don't know me very well
though, I'm never close to anything useful or intelligent. It's kinda like a policy of mine ;)
ItsIllak: kakace: see? What you just said makes no sense to me, and since I know nothing
about an altivec (or it's stage), there's no helping me. :)
bighead: i think he said that basically there will be multiple keys being tested at various
points in the cpu while others are waiting or doing something else
bighead: but i'm having difficulty parsing as well
ItsIllak gave up ASM programming as soon as he passed the university module. I'm not
sure pipelines contained anything other than oil in those days.
kakace: bighead: that might help if I tell you G4 cores are 2-stages pipelined cores : the
integer units process 4 keys upto a point, then Altivec finish the work on the next iteration
kakace: said otherwise, the main loop processes 4 keys using the integer units and 4 keys
using the Altivec units
kakace: the G5 needs an extra Altivec stage to interleave vector instructions in order to
work around the 2 cycles instructions latency that cause the current core to be so slow
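kakace's last point, that interleaving independent work hides a 2-cycle instruction latency, can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not the dnetc core and not a model of the real G5. It assumes a made-up in-order unit that issues one operation per cycle, where each operation must wait `latency` cycles for the previous operation in its own dependency chain. The function name and parameters are hypothetical.

```python
def cycles_to_finish(chains, ops_per_chain, latency):
    """Cycles needed by a toy in-order unit that issues one op per cycle.

    The unit works round-robin over `chains` independent dependency chains.
    Each op can only issue once the previous op in the same chain has
    produced its result, which takes `latency` cycles.
    """
    ready_at = [0] * chains            # earliest cycle each chain's next op may issue
    cycle = 0
    for _ in range(ops_per_chain):
        for c in range(chains):
            cycle = max(cycle, ready_at[c])   # stall until the operand is ready
            ready_at[c] = cycle + latency     # result is available `latency` cycles later
            cycle += 1                        # one issue slot per cycle
    return cycle


OPS = 100
for chains, latency in [(1, 1), (1, 2), (2, 2)]:
    total_ops = chains * OPS
    cycles = cycles_to_finish(chains, OPS, latency)
    print(f"{chains} chain(s), latency {latency}: "
          f"{cycles} cycles for {total_ops} ops "
          f"({cycles / total_ops:.2f} cycles/op)")
```

In this model a single chain with 2-cycle latency spends about half its time stalled, while two interleaved chains keep the unit busy every cycle. That is the general idea behind adding another interleaved stage, although the real core also has to juggle registers and dispatch rules that this sketch ignores.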
|
|
|
| |
|
|
|
 |
|
 |
|
Clinically Insane
Join Date: Oct 2000
Location: Los Angeles
Status:
Offline
|
|
I have to agree in general with Catfish. The improved Altivec unit in the newer 74xxs was not that much faster than the original one, so that difference cannot account for why a certain piece of code would run poorly on much higher clocked G5s.
|

"The natural progress of things is for liberty to yield and government to gain ground." TJ
|
| |
|
|
|
 |
|
 |
|
Senior User
Join Date: Feb 2001
Location: Orange County, California
Status:
Offline
|
|
Originally posted by Big Mac:
I have to agree in general with Catfish. The improved Altivec unit in the newer 74xxs was not that much faster than the original one, so that difference cannot account for why a certain piece of code would run poorly on much higher clocked G5s.
If you read the convo, you'd see it is because of the latency in the G5 that isn't present in the G4. The workaround would require a hardware addition to the G5.
That said, look at the speed comparison of the 1.5 GHz G4 (in the PowerBooks) and the 2.0 GHz G5 on the distributed.net speeds page. The 1.5 GHz G4 clocks at about 15.95Mkeys/sec in RC5-72, while the 2.0 GHz G5 gets about 15.06Mkeys/sec in the same contest.
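For anyone who wants to relate those Mkeys/sec figures to the blocks-per-day numbers earlier in the thread, here is a rough conversion sketch. It assumes one RC5-72 block is 2^32 keys; that block size is my assumption and is not stated anywhere in this thread, so treat the output as a ballpark only. The function names are made up.

```python
KEYS_PER_BLOCK = 2 ** 32           # ASSUMPTION: one RC5-72 block = 2^32 keys
SECONDS_PER_DAY = 24 * 60 * 60


def blocks_per_day(rate_mkeys, cpus=1):
    """Blocks per day for `cpus` processors each running at `rate_mkeys` Mkeys/sec."""
    return rate_mkeys * 1e6 * cpus * SECONDS_PER_DAY / KEYS_PER_BLOCK


def mkeys_per_sec(blocks, cpus=1):
    """Per-CPU rate in Mkeys/sec implied by `blocks` completed in one day."""
    return blocks * KEYS_PER_BLOCK / SECONDS_PER_DAY / cpus / 1e6


# Rates quoted from the distributed.net speeds page in the post above.
print(f"1.5 GHz G4, one CPU : {blocks_per_day(15.95):.0f} blocks/day")
print(f"2.0 GHz G5, one CPU : {blocks_per_day(15.06):.0f} blocks/day")
print(f"2.0 GHz G5, two CPUs: {blocks_per_day(15.06, cpus=2):.0f} blocks/day")

# The 576 blocks/day reported for the dual 2.0 GHz G5 at the top of the thread.
print(f"implied by 576 blocks/day on two CPUs: {mkeys_per_sec(576, cpus=2):.2f} Mkeys/sec per CPU")
```

Under that assumption the reported 576 blocks a day is in the same range as the published per-CPU rate, so the machine in question looks typical for a G5 rather than misconfigured.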
If you want to look at the code for the un-enabled G5 core for RC5-72, it is available in the source tree at distributed.net's website. Also, if you want to quiz kakace about the G5 limitations, stop by #distributed on irc.distributed.net. He'll be more than happy to tell you that a G5 core isn't going to happen anytime soon.
EDIT:
Sorry, I didn't see that the comment was posted 1 minute after I posted my entry. Sorry about that.
|
|
|
| |
|
|
|
 |
|
 |
|
Junior Member
Join Date: Oct 2004
Location: Cincinnati
Status:
Offline
|
|
Why not just recompile the code?
The code is at: http://www.distributed.net/source/
I do not have a G5 yet and don't have access to one. Still using a TiBook.
When you recompile, use the -mcpu=G5 -mtune=G5 -fast -faltivec
flags. See http://developer.apple.com/documenta...e-Options.html
To compile, there should be a makefile, and you can go in and add these flags inside the configure file (it should be obvious where to put them). I have compiled this code before for D.Net. Team MacNN (see forum group) had some heavily optimized clients you could try.
I haven't rebuilt the client in about a year, but it's a straightforward build process from what I can remember. You should see a much bigger improvement by specifying the advanced G5 compiler flags.
|
|
|
| |
|
|
|
 |
|
 |
|
Junior Member
Join Date: Oct 2004
Location: Cincinnati
Status:
Offline
|
|
Sorry, I noticed that this will not work for a full client, but it can still be done for benchmarking purposes. Distributed.net has a disclaimer on their source page that not all the source code is available to the public.
|
|
|
| |
|
|
|
 |
|
 |
|
Registered User
Join Date: Jan 2003
Location: California
Status:
Offline
|
|
Originally posted by Catfish_Man:
That doesn't make much sense to me. Altivec, even on the G5, can't handle 64 bit numbers, so I can't see why one would convert the data to 64 bit. If I had to guess, one of two things is happening: a) the dispatch restrictions on the G5's altivec unit (identical to an original G4) are slowing it down, or b) it uses G4-specific streaming instructions that can cause slowdowns on the G5. In general, the Altivec unit on the G4 is slightly superior, but the G5 has a higher clock frequency and much higher bandwidth, so it should win anyway. From what I've read (mostly on ars), Altivec code is typically memory bandwidth bound, although this may be an exception.
AltiVec does not work in terms of "numbers". It works with vectors, which are always 128-bit, whether you are on the G4 or the G5.
AltiVec on the G5 is not worse than AltiVec on G4! Where did you get that? There are differences, but due to the vast difference between the G4 and the G5 architectures. Even if you were to compare in terms of "better" or "worse", AltiVec on the G5 is better, or in some cases, the same as that on the G4 (I say same because there are some limits that are similar on both -- like what kind of instructions you can send each cycle, etc.)
There are some things on the G5 which are worse than the G4 but only when taken in isolation. Its memory latency is worse than the G4, but that's because the processor is so much faster than the memory now! Memory latency is in terms of processor cycles, so now the G5 has to wait longer, so to speak. There are some other very subtle "inferiorities" -- the G5 does some floating-point operations that are as precise as the floating-point standard requires, but no more. On the G4, these operations were over-precise.
Let me say it again: the G5's AltiVec is not a "whole lot worse" (as was said in some other post in this thread) than the G4's.
|
|
|
| |
|
|
|
 |
|
 |
|
Moderator Emeritus 
Join Date: Dec 2000
Location: College Park, MD
Status:
Offline
|
|
Originally posted by iohead:
AltiVec does not work in terms of "numbers". It works with vectors, which are always 128-bit, whether you are on the G4 or the G5.
AltiVec on the G5 is not worse than AltiVec on G4! Where did you get that? There are differences, but due to the vast difference between the G4 and the G5 architectures. Even if you were to compare in terms of "better" or "worse", AltiVec on the G5 is better, or in some cases, the same as that on the G4 (I say same because there are some limits that are similar on both -- like what kind of instructions you can send each cycle, etc.)
There are some things on the G5 which are worse than the G4 but only when taken in isolation. Its memory latency is worse than the G4, but that's because the processor is so much faster than the memory now! Memory latency is in terms of processor cycles, so now the G5 has to wait longer, so to speak. There are some other very subtle "inferiorities" -- the G5 does some floating-point operations that are as precise as the floating-point standard requires, but no more. On the G4, these operations were over-precise.
Let me say it again: the G5's AltiVec is not a "whole lot worse" (as was said in some other post in this thread) than the G4's.
The G5 altivec implementation is that of the 7400/7410. It does not contain the improvements that are in the 7450/7455, etc.
It is in no way superior to that of the current shipping G4s.
|
|
|
| |
|
|
|
 |
|
 |
|
Mac Elite
Join Date: Aug 2001
Status:
Offline
|
|
Originally posted by iohead:
AltiVec does not work in terms of "numbers". It works with vectors, which are always 128-bit, whether you are on the G4 or the G5.
AltiVec on the G5 is not worse than AltiVec on G4! Where did you get that? There are differences, but due to the vast difference between the G4 and the G5 architectures. Even if you were to compare in terms of "better" or "worse", AltiVec on the G5 is better, or in some cases, the same as that on the G4 (I say same because there are some limits that are similar on both -- like what kind of instructions you can send each cycle, etc.)
There are some things on the G5 which are worse than the G4 but only when taken in isolation. Its memory latency is worse than the G4, but that's because the processor is so much faster than the memory now! Memory latency is in terms of processor cycles, so now the G5 has to wait longer, so to speak. There are some other very subtle "inferiorities" -- the G5 does some floating-point operations that are as precise as the floating-point standard requires, but no more. On the G4, these operations were over-precise.
Let me say it again: the G5's AltiVec is not a "whole lot worse" (as was said in some other post in this thread) than the G4's.
Strangely enough, vectors contain numbers! Altivec CANNOT work with 64 bit numbers. If it could, it would work with a vector of two 64 bit values, in the same way that SSE2 does.
You are incorrect about the capabilities of the two chips being the same. iirc the 7400/7410 style vector unit can dispatch 2 instructions per cycle, one vperm and one other. The 7440/7450 style vector unit can dispatch 2 instructions per cycle, with no restrictions on where they go (vperm, vciu, vsiu, vfpu). The G5 also has longer instruction latencies, due to a number of design decisions that I can't remember right now. The arstechnica articles about the G5 detail the instruction latencies, iirc.
Now in other areas, the G5's implementation is superior. It has vastly better memory bandwidth, more rename registers, better LSU bandwidth, supports out of order execution for the vector unit, and of course has higher clock frequencies.
There are also areas that are merely different. Several of the G4's memory prefetching instructions (dcbt? I can't recall right now) slow down the G5 fairly significantly. However, the G5 has hardware prefetch support, which activates automatically.
Overall, I would agree that the G5 is not "a whole lot worse" as a previous poster said. For this particular case, it appears that the G5 loses though, apparently due to some instruction latencies that couldn't be scheduled around.
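As a small illustration of "vectors contain numbers", here is a Python model of a 128-bit vector treated as four independent 32-bit lanes, with a lane-wise modular add in the spirit of AltiVec's vadduwm instruction. It is a sketch for intuition only, not real AltiVec code, and the helper names are made up.

```python
import struct

MASK32 = 0xFFFFFFFF


def vec_add_u32(a, b):
    """Lane-wise modular add of two 4 x 32-bit vectors; carries never cross lanes."""
    return tuple((x + y) & MASK32 for x, y in zip(a, b))


def vec_bytes(v):
    """The 16-byte (128-bit) big-endian image of a 4 x 32-bit vector."""
    return struct.pack(">4I", *v)


a = (0xFFFFFFFF, 1, 2, 3)
b = (1, 1, 1, 1)
total = vec_add_u32(a, b)

print("vector width :", len(vec_bytes(a)) * 8, "bits")
print("lane-wise sum:", total)   # the first lane wraps around without touching its neighbour
```

The register is always 128 bits wide, but every integer operation treats it as lanes of 8, 16 or 32 bits. With 32-bit lanes, that is four RC5 keys' worth of 32-bit arithmetic per instruction.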
|
|
|
| |
|
|
|
 |
|
 |
|
Junior Member
Join Date: Oct 2004
Location: Cincinnati
Status:
Offline
|
|
I watched a WWDC 2004 session about 32-bit vs. 64-bit on Tiger and the G5 architecture on the ADC DVD. Some of the points that the Apple engineers presented:
- The PowerPC architecture has always been 64-bit, and starting with Panther (OS X 10.3) on the G5 this extra memory can be addressed with the 64-bit registers in the G5.
- The PowerPC architecture does not have separate instructions for 32-bit and 64-bit. There is just a controlling flag that can be set to enable looking at the extra 32 bits. It is not a mode, just a way to mark whether the process is a 32-bit or 64-bit process. Apple is going to enforce that you do not have 32-bit and 64-bit code running together in the same process. This is done by using this flag. This is also different from using the long long type to do 64-bit math. This flag is for all the code.
- Apple does not endorse compiling for 64-bit unless you need the larger memory space (> 4 GB).
- Apple has said that 32-bit applications run faster than equivalent 64-bit programs because L1/L2 cache misses go up: compiled 64-bit programs are physically larger and will not fit in cache as well. All pointers are 8 bytes when compiled for 64-bit. Also, PowerPC does not use different instructions for 64-bit. It just does not ignore the high 32 bits.
- AltiVec was not mentioned in the engineering presentation I watched.
- In Xcode, there is a new ppc64 architecture type you use when you compile to make a full 64-bit app.
- The kernel is still going to be 32-bit when Tiger ships, because 32-bit code runs faster than the equivalent 64-bit code, and that way all Apple computers can run the same kernel and there will be only one version of Tiger.
- "Fat binaries" can put both 32-bit and 64-bit binaries in the same file, and the system can use the right binary based on whether you have a G5 (or later).
- There are other source-code-level pitfalls you need to be aware of and change, such as in printf, where you have to be more careful and not use %d, make better use of casting, etc.
That's about all I can say without breaking NDA, but most of my points are already documented in the PowerPC architecture spec.
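Two of the points above (8-byte pointers making data bigger, and the printf %d pitfall) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list node layout is hypothetical and ignores alignment padding, and the truncation helper only models "keep the low 32 bits", not what any particular C compiler or printf implementation actually does.

```python
import struct


def node_bytes(pointer_bytes, payload_bytes=4, pointers_per_node=2):
    """Unpadded size of a hypothetical list node: one payload plus prev/next pointers."""
    return payload_bytes + pointers_per_node * pointer_bytes


def to_int32(value):
    """Keep only the low 32 bits of a value and read them as a signed int."""
    low = value & 0xFFFFFFFF
    return low - 0x100000000 if low & 0x80000000 else low


print("pointer size in this interpreter:", struct.calcsize("P"), "bytes")
print("node with 4-byte pointers:", node_bytes(4), "bytes")
print("node with 8-byte pointers:", node_bytes(8), "bytes")

big = 5_000_000_000                       # does not fit in 32 bits
print("64-bit value       :", big)
print("read back as 32-bit:", to_int32(big))
```

The same pointer-heavy data structure takes noticeably more room with 8-byte pointers, which is where the extra cache misses come from. A 64-bit quantity pushed through a 32-bit format quietly loses its upper half.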
|
|
|
| |
|
|
|
 |
|
 |
|
Dedicated MacNNer
Join Date: May 2001
Location: Front of my Intel iMac 20"
Status:
Offline
|
|
WOW!!! All this makes things a little clear...
does it?
Still, I think there should be some way to increase performance. Even in terms of raw MHz, my dual 2.0 G5 is more than twice my dual 800 G4.
Oh well.. let the crunching go on...
|
|
|
| |
|
|
|
 |
|
 |
|
Junior Member
Join Date: Oct 2004
Location: Cincinnati
Status:
Offline
|
|
We have fallen victim to being our own worst enemy. We have taken the DNet client to such a low level that we are not letting the compiler perform its own optimizations. Apple and the FSF add new features to gcc all the time that our hand-coded client cannot see, because it would need to be recompiled.
DNet is a waste of time (flames here). I used to work on the new client, and from a computer science standpoint it is going to take years to find the key. Big deal. It's just not worth the stress of having DNet get in your way, or worrying about which 0.00002 percent of the work DNet collectively finished today. Do the math. It will take years to solve this.
You have a dual G5. You should be happy and tearing that **** up, having the time of your life, not worrying about some contest that will not be finished by the time the G6 comes out.
SETI@Home is the same. I heard that we have never found any evidence of anything and that it is a big waste of taxpayers' money to run the labs.
Run stuff with your dual G5 that kicks ass, like Motion or Doom 3. Who cares about cracking keys? Set up a cheap Linux Athlon cluster if you want to do that.
|
|
|
| |
|
|
|
 |
|
 |
|
Dedicated MacNNer
Join Date: May 2001
Location: Front of my Intel iMac 20"
Status:
Offline
|
|
Was thinking the same, CincyGamer. I never looked at the progress bar of Dnetc. I did yesterday. Sheesh, what a waste of time...
I will let my G5 do something good from now on...
|
|
|
| |
|
|
|
 |
 |
|
 |
|
|
|
|
|

|
|
 |