Enhanced Optimized (Page 2)
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
May 22, 2006, 09:05 AM
 
OK, here's some baseline data. My procedure: using the reference WU and the stock SETI 5.13 worker, I run the following command:

time ./setiathome_5.13_powerpc-apple-darwin -nographics

The time command at the start of the line reports, once the task has finished, how much processing time it took. On machines with multiple CPUs I run only one instance, which does not necessarily reflect real-world performance under BOINC, but it's close enough. No other processor-intensive tasks run in parallel.
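
For anyone who'd rather script the measurement, here is a minimal Python sketch of what the time command does: run a command to completion, then report wall-clock (real), user, and system time. It assumes a Unix-like system such as Mac OS X, where os.times() reports the CPU time of finished child processes; the function name time_command is made up for this illustration and is not part of SETI or BOINC.

```python
import os
import subprocess
import sys


def time_command(argv):
    """Run argv to completion and return (real, user, sys) in seconds."""
    before = os.times()
    subprocess.run(argv, check=True)
    after = os.times()
    # elapsed is wall-clock time; children_* cover child processes that have exited.
    real = after.elapsed - before.elapsed
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


# Harmless demonstration: time a Python interpreter that does nothing.
real, user, system = time_command([sys.executable, "-c", "pass"])
print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

Calling time_command(["./setiathome_5.13_powerpc-apple-darwin", "-nographics"]) from the folder that holds the worker and the reference WU should give figures comparable to those from the time command.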

Power Mac G5 Quad 2.5GHz w/ 4GB PC2-4200E-444 RAM, 10.4.6:
real 202m41.208s (=12161.208s)
user 202m13.581s
sys 0m55.723s (all three from time command)
wu_cpu_time: 12147.410551 (from init_data.xml)

Power Mac G5 DP 2.5GHz w/ 5GB PC3200U-30330 RAM, 10.4.6:
real 242m26.379s (=14546.379s)
user 240m4.987s
sys 1m6.640s
wu_cpu_time: n/a (was reported as 0 in init_data.xml - don't know why, have to look into this)

Mac mini G4 1.42GHz w/ 1GB PC2700U-25330 RAM, 10.4.6:
real 537m31.119s (=32251.119s)
user 528m39.193s
sys 2m12.853s
wu_cpu_time: n/a

Power Mac G4 Digital Audio DP 533MHz w/ 768MB PC133-333 RAM, 10.4.5:
real 986m35.925s (=59195.925s)
user 979m6.251s
sys 2m57.985s
wu_cpu_time: n/a

NB: It appears that the wu_cpu_time reported in init_data.xml is now far more accurate than in pre-enhanced times.
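
To pull wu_cpu_time out of init_data.xml without opening the file by hand, something like this Python sketch should do. It assumes the file parses as well-formed XML and holds a <wu_cpu_time> element somewhere inside it; the helper name and the trimmed sample document are invented for illustration, so check them against a real init_data.xml.

```python
import xml.etree.ElementTree as ET


def read_wu_cpu_time(xml_text):
    """Return the first <wu_cpu_time> value found in the XML text, or None."""
    root = ET.fromstring(xml_text)
    for element in root.iter("wu_cpu_time"):
        return float(element.text)
    return None


# Hypothetical, heavily trimmed stand-in for the contents of init_data.xml.
SAMPLE = "<app_init_data><wu_cpu_time>12147.410551</wu_cpu_time></app_init_data>"
print(read_wu_cpu_time(SAMPLE))
```

For a real run, read the file from the folder the worker was started in and pass its text to read_wu_cpu_time.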

HTH,

Ron
( Last edited by halimedia; May 23, 2006 at 04:39 AM. )
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
May 22, 2006, 09:32 AM
 
Checked this morning - the Pentium M XP machine, the Athlon box, and the G4 all still
have over 100 old-style blocks to crunch before I can switch them.

I already have Crunch3r's client ready to go when I switch.

The G5 has been switched. It showed around 1 hr 40 min remaining right before I left the
house, and the thing was already at 30-40%. More later.
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
May 22, 2006, 11:03 PM
 
Update: Very, very weird. I've got a bunch of blocks at 100% but they WILL NOT
upload. They just sit there while more blocks get crunched. I've tried forcing
communications, but nothing happens.
     
lepetitmartien
Junior Member
Join Date: Feb 2006
Location: Paris, France, Europe, Earth, Sol
Status: Offline
May 23, 2006, 04:43 AM
 
On my side, a few more WUs done, no problems at all.
MacMusic.Org says "Hi all!" :)
G5 desktop 1.8, 900 MHz frontbus (2003 model)
Latest wisdom file for it on demand, just PM me :)
     
Thanar
Junior Member
Join Date: Feb 2002
Location: Kozani, Greece, EU
Status: Offline
May 23, 2006, 05:58 AM
 
If communication has been deferred for a particular amount of time, there is no way to force a connection to the servers; you'll have to wait... Actually, there is a way, by modifying some XML files, but it's a bit tricky...
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
May 23, 2006, 08:16 AM
 
I set it so I only had 5 days of work and it finally took them.

On a 2.5 dual, we're getting around 11,2xx-11,4xx seconds per
block as seen here:

http://setiathome.berkeley.edu/resul...6496&offset=40

Not optimal, around three hours, but certainly better than eight to twelve
hours, which is just ridiculous in my mind.

Thanks, Boog, for your assistance in making life a bit easier for us here!
( Last edited by Todd Madson; May 23, 2006 at 08:23 AM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
May 23, 2006, 03:01 PM
 
First comparative results - ref-wu w/ stock 5.13 worker vs. boog's b5 on G5 Quad:

stock 5.13 worker:
real 202m41.208s (=12161.208s)
user 202m13.581s
sys 0m55.723s
wu_cpu_time: 12147.410551s

boog's b5:
real 184m40.884s (=11080.884s)
user 183m51.573s
sys 0m21.703s
wu_cpu_time: 11011.876272s

That makes b5 1080.324s, roughly 18 minutes or 9% faster than the stock worker! Great job, boog!
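
The arithmetic behind those figures, as a small Python sketch: it converts the real times printed by the time command to seconds and expresses the difference as a share of the stock run time. The helper names are made up for this example; the two time values are the ones quoted above.

```python
import re


def to_seconds(stamp):
    """Convert a time-style value such as '202m41.208s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def percent_faster(baseline, candidate):
    """Percentage of the baseline run time saved by the candidate."""
    return (baseline - candidate) / baseline * 100


stock = to_seconds("202m41.208s")  # stock 5.13 worker, real time
b5 = to_seconds("184m40.884s")     # boog's b5, real time
print(f"saved {stock - b5:.3f}s ({percent_faster(stock, b5):.1f}% of the stock run time)")
```

Note that "faster" here means run time saved relative to the stock worker; dividing stock by b5 instead gives a speed ratio of about 1.10.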

NB: The result files are not identical, but I'm not knowledgeable enough to judge whether the differences are within the realm of what can be considered normal. If anyone has more experience in this regard, please PM me and I'll gladly mail the results.
( Last edited by halimedia; May 23, 2006 at 03:47 PM. )
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 23, 2006, 05:55 PM
 
Originally Posted by halimedia
First comparative results - ref-wu w/ stock 5.13 worker vs. boog's b5 on G5 Quad:

stock 5.13 worker:
real 202m41.208s (=12161.208s)
user 202m13.581s
sys 0m55.723s
wu_cpu_time: 12147.410551s

boog's b5:
real 184m40.884s (=11080.884s)
user 183m51.573s
sys 0m21.703s
wu_cpu_time: 11011.876272s

That makes b5 1080.324s, roughly 18 minutes or 9% faster than the stock worker! Great job, boog!

NB: The result files are not identical, but I'm not knowledgeable enough to judge whether the differences are within the realm of what can be considered normal. If anyone has more experience in this regard, please PM me and I'll gladly mail the results.
Sweet! I'm going to run a test on another build I did.

I spent a lot of time trying to get everything to compile with gcc 4.0 instead of 3.3, and I'm still having issues with it.

Then I noticed that the HDD on my mini was almost full, so I yanked my DVD burner out of its FireWire case to add a 100 GB drive, and I have spent a lot of time moving everything and getting it all set up again.

I just started my latest test. We can compare it to the 1.42 mini you have above, and to the stock worker here on mine, which finished in 31109.065713 seconds.

My other build finished in 29799.417300, which is still only 9% faster on my mini, so it looks like the best increase I can achieve for now.
( Last edited by boog; May 24, 2006 at 06:02 AM. )
     
beadman
Dedicated MacNNer
Join Date: Nov 2004
Location: Virginia
Status: Offline
May 23, 2006, 08:05 PM
 
Perhaps one of you can tell me where my thinking is wrong. I have a MacBook Pro with the following info:
Machine Name: MacBook Pro 15" with OSX 10.4.6
Machine Model: MacBookPro1,1
CPU Type: Intel Core Duo
Number Of Cores: 2
CPU Speed: 2 GHz
L2 Cache (shared): 2 MB
Memory: 2 GB
Bus Speed: 667 MHz
Boot ROM Version: MBP11.0055.B02

I'm running the standard BOINC client as BOINC Manager with the standard SETI client, setiathome_5.13_i686-apple-darwin. Here's an example of the results: http://setiathome.berkeley.edu/worku...?wuid=78593724 I'm the 2414031 user.

Both of the PCs running this WU are Intels with dual processors and 3 GHz or faster CPUs. My processor is only a 2.0 GHz dual-core, yet I have processed the same unit more than three times faster than one of the PCs and in two-thirds of the time of the other. My question is: why and how? I've checked several other WUs, and I'm consistently faster than the PCs even though they have higher clock speeds. Don't get me wrong, I'm not complaining, and I'm not intending to brag either; I honestly want to know how this machine is able to do the same WU in so much less time using the standard BOINC and SETI client. I've been running Rick and Alex's SETI client on my G4 PowerBook and my G4 iBook, and really enjoyed the speed differential there, but this machine is using the standard versions and seems to go so much faster than the PCs.

Any info appreciated.
Claude
     
jedimstr
Junior Member
Join Date: Nov 2003
Status: Offline
May 23, 2006, 11:33 PM
 
Originally Posted by beadman
Perhaps one of you can tell me where my thinking is wrong. I have a MacBook Pro with the following info:
Machine Name: MacBook Pro 15" with OSX 10.4.6
Machine Model: MacBookPro1,1
CPU Type: Intel Core Duo
Number Of Cores: 2
CPU Speed: 2 GHz
L2 Cache (shared): 2 MB
Memory: 2 GB
Bus Speed: 667 MHz
Boot ROM Version: MBP11.0055.B02

I'm running the standard BOINC client as BOINC Manager with the standard SETI client, setiathome_5.13_i686-apple-darwin. Here's an example of the results: http://setiathome.berkeley.edu/worku...?wuid=78593724 I'm the 2414031 user.

Both of the PCs running this WU are Intels with dual processors and 3 GHz or faster CPUs. My processor is only a 2.0 GHz dual-core, yet I have processed the same unit more than three times faster than one of the PCs and in two-thirds of the time of the other. My question is: why and how? I've checked several other WUs, and I'm consistently faster than the PCs even though they have higher clock speeds. Don't get me wrong, I'm not complaining, and I'm not intending to brag either; I honestly want to know how this machine is able to do the same WU in so much less time using the standard BOINC and SETI client. I've been running Rick and Alex's SETI client on my G4 PowerBook and my G4 iBook, and really enjoyed the speed differential there, but this machine is using the standard versions and seems to go so much faster than the PCs.

Any info appreciated.
Claude

The Pentium 4 and the Pentium D are not as efficient with SSE2/SSE3 instructions as the Core Duo. Also, the memory bandwidth and L1/L2 cache efficiency of the Core Duo are much greater than the Pentiums'. Since even the new standard SETI Enhanced apps are already optimized to use SSE, and since SETI has always been hungry for more memory and cache speed, the Core Duo will beat a P4 or PD even at lower clock speeds.

Another, albeit smaller, factor is that the Core Duo (based on the Pentium M and P3 design) can do much more per cycle than any NetBurst-based (P4/PD) processor, whose original purpose was to win the GHz war even at the expense of doing less per clock tick.
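
A rough back-of-the-envelope check of the per-clock argument, as a Python sketch. It uses the rounded figures from Claude's post (a 2.0 GHz Core Duo, PCs at 3 GHz, one taking about three times as long and the other about one and a half times as long); per_clock_ratio is a made-up helper, and the result lumps SSE efficiency, cache, and memory effects into a single number rather than separating them.

```python
def per_clock_ratio(time_ratio, clock_a_ghz, clock_b_ghz):
    """How much more work machine A does per clock tick than machine B.

    time_ratio is B's run time divided by A's run time for the same work unit.
    """
    return time_ratio * clock_b_ghz / clock_a_ghz


# 2.0 GHz Core Duo vs. a 3.0 GHz PC that took about three times as long.
print(per_clock_ratio(3.0, 2.0, 3.0))
# Same Core Duo vs. a 3.0 GHz PC that took about 1.5 times as long.
print(per_clock_ratio(1.5, 2.0, 3.0))
```

Since the PCs ran at 3 GHz or faster, these ratios are, if anything, on the low side.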
----------------------------------------------------
Jedi's Lair: Reviews, Tips, and the RickyCam
----------------------------------------------------
Jedi's Photos: Living life one shutter click at a time...
     
Gecko_r7
Forum Regular
Join Date: Oct 2005
Location: Las Vegas, NV
Status: Offline
May 24, 2006, 08:13 PM
 
Hi Boog,

Here are the results of the reference unit on my G4:

Machine Name: Power Mac G4
Machine Model: PowerMac3,6
CPU Type: PowerPC G4 (3.3)
Number Of CPUs: 2
CPU Speed: 1.33 GHz
L2 Cache (per CPU): 256 KB
L3 Cache (per CPU): 2 MB
Memory: 1.5 GB CL 2.5
Bus Speed: 167 MHz


Stock 5.13 Worker
real 499m12.054s ( = 29952.054s)
user 498m11.342s
sys 0m57.878s
wu_cpu_time = 29915.352618


Boog's b5
real 475m24.735s ( = 28524.735s)
user 474m26.235s
sys 0m53.655s
wu_cpu_time = 58388.782092 (???)

Looks like 4.75% faster on the G4, which is better than a
I'm going to surmise that my small 256K L2 cache, higher L3 latency, slow FSB, and architectural differences vs. those big and speedy G5s account for the difference. Got to believe the G5 can make more efficient use of Boog's improvements. Gone are the days when G4s ruled Mac-earth....
The wheels are turning. Let's see how fast we can rev this up!
Count on me for more help.
( Last edited by Gecko_r7; May 24, 2006 at 08:43 PM. )
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 24, 2006, 08:52 PM
 
Originally Posted by Gecko_r7
Hi Boog,

Here are the results of the reference unit on my G4:

Machine Name: Power Mac G4
Machine Model: PowerMac3,6
CPU Type: PowerPC G4 (3.3)
Number Of CPUs: 2
CPU Speed: 1.33 GHz
L2 Cache (per CPU): 256 KB
L3 Cache (per CPU): 2 MB
Memory: 1.5 GB CL 2.5
Bus Speed: 167 MHz


Stock 5.13 Worker
real 499m12.054s ( = 29952.054s)
user 498m11.342s
sys 0m57.878s
wu_cpu_time = 29915.352618


Boog's b5
real 475m24.735s ( = 28524.735s)
user 474m26.235s
sys 0m53.655s
wu_cpu_time = 58388.782092 (???)

Looks like 4.75% faster on the G4, which is better than a
I'm going to surmise that my small 256K L2 cache, higher L3 latency, slow FSB, and architectural differences vs. those big and speedy G5s account for the difference. Got to believe the G5 can make more efficient use of Boog's improvements. Gone are the days when G4s ruled Mac-earth....
The wheels are turning. Let's see how fast we can rev this up!
Count on me for more help.
I think the L2 cache is the bottleneck; I have heard that it is what makes the faster CPUs crunch faster. My mini is:

Machine Name: Mac mini
Machine Model: PowerMac10,1
CPU Type: PowerPC G4 (1.1)
Number Of CPUs: 1
CPU Speed: 1.42 GHz
L2 Cache (per CPU): 512 KB
Memory: 1 GB
Bus Speed: 167 MHz

This was originally a 1.25 GHz model.
     
beadman
Dedicated MacNNer
Join Date: Nov 2004
Location: Virginia
Status: Offline
May 24, 2006, 10:43 PM
 
Originally Posted by jedimstr
The Pentium 4 and the Pentium D are not as efficient with SSE2/SSE3 instructions as the Core Duo. Also, the memory bandwidth and L1/L2 cache efficiency of the Core Duo are much greater than the Pentiums'. Since even the new standard SETI Enhanced apps are already optimized to use SSE, and since SETI has always been hungry for more memory and cache speed, the Core Duo will beat a P4 or PD even at lower clock speeds.

Another, albeit smaller, factor is that the Core Duo (based on the Pentium M and P3 design) can do much more per cycle than any NetBurst-based (P4/PD) processor, whose original purpose was to win the GHz war even at the expense of doing less per clock tick.
Makes sense - thanks for the info, jedimstr!
Claude
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 25, 2006, 08:26 PM
 
Originally Posted by boog
I'm going to run a test on another build I did.

I spent a lot of time trying to get everything to compile with gcc 4.0 instead of 3.3, and I'm still having issues with it.

Then I noticed that the HDD on my mini was almost full, so I yanked my DVD burner out of its FireWire case to add a 100 GB drive, and I have spent a lot of time moving everything and getting it all set up again.

I just started my latest test. We can compare it to the 1.42 mini you have above, and to the stock worker here on mine, which finished in 31109.065713 seconds.

My other build finished in 29799.417300, which is still only 9% faster on my mini, so it looks like the best increase I can achieve for now.
Here is the latest build that I did. It actually was 9.5% faster on my mini, but I think I am stuck at this point and may have to wait on Alex and Rick to give us more.

http://boog.is-a-geek.org/seti/seti_enhanced_g4_b7.tgz
http://boog.is-a-geek.org/seti/seti_enhanced_g5_b7.tgz
     
arkayn
Dedicated MacNNer
Join Date: Aug 2005
Location: Golden Valley, AZ
Status: Offline
May 25, 2006, 09:25 PM
 
I will get it going on my eMac then.
     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
May 25, 2006, 09:43 PM
 
ok both my Single 1.8 G5 and my Dual 2.0 G5 ran out of work a few days ago.

how fast are those enhanced clients compared to the optimized PC clients?
I mean, with Rick/Alex's clients my Dual G5 achieved about 1200 RAC, outperforming a 3.8 GHz HT P4 (on a not-so-good mainboard, though). I'd like to carry on at that level so as not to disappoint the team

and, more important, who could give me some short seti -> seti enhanced transition instructions for my G5s?
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 25, 2006, 09:57 PM
 
Originally Posted by TiloProbst
ok both my Single 1.8 G5 and my Dual 2.0 G5 ran out of work a few days ago.

how fast are those enhanced clients compared to the optimized PC clients?
I mean, with Rick/Alex's clients my Dual G5 achieved about 1200 RAC, outperforming a 3.8 GHz HT P4 (on a not-so-good mainboard, though). I'd like to carry on at that level so as not to disappoint the team

and, more important, who could give me some short seti -> seti enhanced transition instructions for my G5s?

I doubt these will give you those kinds of numbers, but they should be better than stock. Hopefully Alex and Rick will come out with something more impressive!

All you should need to do is unzip the tgz file, put the files in your startup disk > Library > Application Support > BOINC Data > projects > setiathome.berkeley.edu folder, and fire BOINC back up!
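
If you'd rather script that copy step, here is a hedged Python sketch. It assumes BOINC is not running, that the archive only holds files meant for the project folder (they are copied flat, ignoring any sub-folders inside the archive), and that the data folder is the default /Library/Application Support/BOINC Data; install_build is a made-up name, not a BOINC tool.

```python
import shutil
import tarfile
from pathlib import Path

# Default BOINC project folder on Mac OS X, as named elsewhere in this thread.
PROJECT_DIR = Path("/Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu")


def install_build(archive, project_dir=PROJECT_DIR):
    """Copy every regular file in a .tgz build into the project folder."""
    project_dir = Path(project_dir)
    copied = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip folders, links, and other special entries
            target = project_dir / Path(member.name).name
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            # Keep the permission bits so the worker stays executable.
            target.chmod(member.mode & 0o777)
            copied.append(target.name)
    return copied
```

For example, install_build("seti_enhanced_g5_b7.tgz") would unpack a downloaded G5 build into the default folder; pass a second argument if your BOINC data lives elsewhere.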
     
Gecko_r7
Forum Regular
Join Date: Oct 2005
Location: Las Vegas, NV
Status: Offline
May 25, 2006, 10:02 PM
 
Boog, thanks for posting b7. I've got it on the G4 now and will run it w/ the ref WU tonight.
Hope that you, Rick and Alex will be able to collaborate soon.
Regards,
Ian
     
arkayn
Dedicated MacNNer
Join Date: Aug 2005
Location: Golden Valley, AZ
Status: Offline
May 26, 2006, 06:40 AM
 
b7 gave me nothing but unrecoverable results.
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 26, 2006, 06:45 AM
 
Originally Posted by arkayn
b7 gave me nothing but unrecoverable results.

Odd, what were the errors?

I'm using b7 right now and I haven't had any issues.
What hardware?
     
Knightrider
Dedicated MacNNer
Join Date: Sep 2004
Location: London
Status: Offline
May 26, 2006, 06:52 AM
 
Originally Posted by arkayn
b7 gave me nothing but unrecoverable results.
Did you run the SAH WUs dry first? Before installing a new worker it's best to finish the existing WUs with the current worker. Just select 'No new tasks' in the Projects tab. Then, when you have no more WUs, exit BOINC and install the new worker. Restart BOINC and allow new work.

K.
     
rhettmaxwell
Fresh-Faced Recruit
Join Date: Jun 2006
Status: Offline
May 26, 2006, 07:51 AM
 
Originally Posted by arkayn
b7 gave me nothing but unrecoverable results.
I also got a problem:

http://setiathome.berkeley.edu/resul...ltid=332120708
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
May 26, 2006, 08:00 AM
 
Preliminary results with B5_A7 are in the 2 hrs 58 minutes area for work on a G5 2.5 dual.
     
lepetitmartien
Junior Member
Join Date: Feb 2006
Location: Paris, France, Europe, Earth, Sol
Status: Offline
May 26, 2006, 08:55 AM
 
Going to try g5_b7 once the current WU is done (in about one hour)
MacMusic.Org says "Hi all!" :)
G5 desktop 1.8, 900 MHz frontbus (2003 model)
Latest wisdom file for it on demand, just PM me :)
     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
May 26, 2006, 10:12 AM
 
Originally Posted by boog
I doubt these will give you those kinds of numbers, but they should be better than stock. Hopefully Alex and Rick will come out with something more impressive!

All you should need to do is unzip the tgz file, put the files in your startup disk > Library > Application Support > BOINC Data > projects > setiathome.berkeley.edu folder, and fire BOINC back up!
ok, is the file path really relevant?
currently my Boinc path is Mac HD/Applications/Boinc5/

I just replace the info.xml file and the worker, and then start the worker as usual with
/Applications/Boinc5/your_application -dir /Applications/Boinc5/ or what?

I use the Terminal to start up Boinc, no Manager involved, is that a problem?
how do I make the worker ..
.. reset?
.. update?
.. refuse new WUs?
.. force Benchmark?
.. display all possible commands? (old worker was: -help)
     
Gecko_r7
Forum Regular
Join Date: Oct 2005
Location: Las Vegas, NV
Status: Offline
May 26, 2006, 01:40 PM
 
No problems w/ b7 on G4. Crunching and validating fine. No errors.
     
arkayn
Dedicated MacNNer
Join Date: Aug 2005
Location: Golden Valley, AZ
Status: Offline
May 26, 2006, 04:47 PM
 
This was on the eMac.
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 26, 2006, 05:24 PM
 
Originally Posted by TiloProbst
ok, is the file path really relevant?
currently my Boinc path is Mac HD/Applications/Boinc5/

I just replace the info.xml file and the worker, and then start the worker as usual with
/Applications/Boinc5/your_application -dir /Applications/Boinc5/ or what?

Looks like all you will need to do is put the app_info.xml and the new worker in your projects/seti directory and run it like normal.
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 26, 2006, 05:27 PM
 
Originally Posted by arkayn
This was on the eMac.
I wonder if I went too CPU-specific? Is the eMac G4 CPU a 7450 like the mini and such? I could fix that for you if needed by compiling a new version that's a bit more generic (it really shouldn't make much of a difference in speed).
     
arkayn
Dedicated MacNNer
Join Date: Aug 2005
Location: Golden Valley, AZ
Status: Offline
May 26, 2006, 05:58 PM
 
It says that it is a 7457B processor, the same that is in the Mac Mini.
     
arkayn
Dedicated MacNNer
Join Date: Aug 2005
Location: Golden Valley, AZ
Status: Offline
May 26, 2006, 05:59 PM
 
I will see what happens later today when I am allowed to download some more work on the eMac. Luckily the other 2 computers are still cranking out the units.
     
lepetitmartien
Junior Member
Join Date: Feb 2006
Location: Paris, France, Europe, Earth, Sol
Status: Offline
May 26, 2006, 09:12 PM
 
First WU with the G5_b7, did it perfectly.
MacMusic.Org says "Hi all!" :)
G5 desktop 1.8, 900 MHz frontbus (2003 model)
Latest wisdom file for it on demand, just PM me :)
     
alexkan
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
May 27, 2006, 03:31 AM
 
Alright, alright, you've waited long enough--it's not like I haven't been working on this, so here's a taste of what I've got so far. This is far from being a finished product (work-wise and speed-wise), but I have tested the optimizations I've put in, and thus far have no reason to believe that I've done anything that will break validation.

http://tbp.berkeley.edu/~alexkan/set...ced-ppc-v2.zip

Why v2, you ask? These version numbers are more for me than they are for you, since now I'm doing a better job of keeping track of which optimizations are in which versions. For reference, I vectorized pulse-finding in v1, and partially vectorized Gaussian-finding in v2. As far as optimizations go (excluding one part of pulse-finding), this doesn't cover any significant new ground compared to our older optimized workers, so they're mostly low-hanging fruit.

The source is available at http://tbp.berkeley.edu/~alexkan/seti/src-v2.tar.bz2. If you can compile the nightly tarballs, you can probably compile these as well. (Look at pulsefind.cpp and gaussfit.cpp if you want to see what I've done.) I haven't modified the compile flags other than to enable Altivec support, so you're welcome to try some tweaks, but I will probably be tuning my Altivec functions by hand to make them faster if I run out of other optimizations to make, and this may or may not interact well with heavy compiler optimization.

Also, if you enjoyed generating wisdom last time, or if you want the absolute best performance with any current Mac worker (stock, boog's, mine, or anyone else's), run fft_test3 and dump the resulting wisdom.sah file in the same folder with everything else. SETI has probably generated one on its own, but it uses less thorough planning, which means slower FFTs, so overwrite it if you're asked.

Rick will have a GUI wisdom generator out at some point, for those of you less inclined to play with the CLI.
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
May 27, 2006, 04:43 AM
 
Excellent news, Alex! Thanks a million for your efforts!! Let's show the world again what PPC can do!!

Already crunching the ref-wu on the quad with new wisdom and your v2. Can't wait to see the results!!

More later...
     
liebsmaschine
Registered User
Join Date: Jul 2006
Status: Offline
May 27, 2006, 05:56 AM
 
Originally Posted by alexkan
Also, if you enjoyed generating wisdom last time, or if you want the absolute best performance with any current Mac worker (stock, boog's, mine, or anyone else's), run fft_test3 and dump the resulting wisdom.sah file in the same folder with everything else. SETI has probably generated one on its own, but it uses less thorough planning, which means slower FFTs, so overwrite it if you're asked.
There wasn't a wisdom.sah file in the setiathome directory, even though I've been running Enhanced for at least a week now. I did a search, and there *are* two wisdom.sah files--one in [boinc directory]/slots/0/wisdom.sah and [boinc directory]/slots/3/wisdom.sah. Should I replace those with the wisdom file generated from fft_test3, or should I put it in the [bd]/projects/setiathome/ directory with the worker app? Does your compile look for the wisdom file in a different place than the stock worker?

Also, how much of a difference will it make to generate the wisdom file in single user mode (or at least with an absolute minimum of processes running, perhaps after a fresh restart) versus just doing it casually while using the computer? I know in the other thread that many were generating it from single-user mode, as it supposedly helped the benchmarking be more accurate. Is it worth that?
( Last edited by jackal; May 27, 2006 at 06:03 AM. )
     
arkayn
Dedicated MacNNer
Join Date: Aug 2005
Location: Golden Valley, AZ
Status: Offline
May 27, 2006, 06:20 AM
 
B7 gave me another 10 dead units last night.

I will now test alexkan's when I am able to get more units for SETI again, about midnight my time.
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 27, 2006, 06:52 AM
 
Originally Posted by alexkan
Alright, alright, you've waited long enough--it's not like I haven't been working on this, so here's a taste of what I've got so far. This is far from being a finished product (work-wise and speed-wise), but I have tested the optimizations I've put in, and thus far have no reason to believe that I've done anything that will break validation.

Awesome! I'm definitely going to play with these, and check out what you have changed to see what I can learn! Thanks, man!

/me is excited....goes to play....
     
sdubz  (op)
Junior Member
Join Date: Jun 2006
Status: Offline
May 27, 2006, 06:55 AM
 
Originally Posted by arkayn
B7 gave me another 10 dead units last night.

I will now test alexkan's when I am able to get more units for SETI again, about midnight my time.

Sorry, arkayn. I'm truly not sure why it isn't working on your machine. I have access to a slightly older G4 at work that I will try to test on as well when I get a chance (I'm out in the field mostly).

But with alex bringing something that I'm sure will be better than mine, you should be good to go, or revert to my version b5 on that machine.
     
gulliver
Fresh-Faced Recruit
Join Date: Nov 2005
Location: Europe
Status: Offline
May 27, 2006, 07:07 AM
 
Originally Posted by jackal
There wasn't a wisdom.sah file in the setiathome directory, even though I've been running Enhanced for at least a week now. I did a search, and there *are* two wisdom.sah files--one in [boinc directory]/slots/0/wisdom.sah and [boinc directory]/slots/3/wisdom.sah. Should I replace those with the wisdom file generated from fft_test3, or should I put it in the [bd]/projects/setiathome/ directory with the worker app? Does your compile look for the wisdom file in a different place than the stock worker?

Also, how much of a difference will it make to generate the wisdom file in single user mode (or at least with an absolute minimum of processes running, perhaps after a fresh restart) versus just doing it casually while using the computer? I know in the other thread that many were generating it from single-user mode, as it supposedly helped the benchmarking be more accurate. Is it worth that?
OK, so I created a file called "wisdom.sah" as well. What do I do with it? Should I rename it to "bigfft_wisdom", and where do I put it? Same place as the old one, in /Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu?

BTW: What do these wisdom files do, and what are they good for?

Thanks!
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
May 27, 2006, 07:21 AM
 
Alex's v2 results crunching the ref-wu on a quad-G5:
real 185m5.111s
user 184m56.059s
sys 0m9.116s
wu_cpu_time: 11102.131903

Same ballpark as boog's b5 - roughly 9% faster than the stock worker. Keep 'em coming!
( Last edited by halimedia; May 27, 2006 at 08:46 AM. )
     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
May 27, 2006, 09:20 AM
 
concerning the renaming, Alex's quote from the Optimized thread:
"SETI Enhanced does something similar to what I did with alpha-5.2 and alpha-5.3, meaning that it loads and stores wisdom (in a file called wisdom.sah, rather than bigfft_wisdom), but it does this on-the-fly, and with less thorough (and therefore much less optimal) planning. Don't try anything like renaming your copies of bigfft_wisdom, as tempting as it might sound [...]"

Originally Posted by alexkan
Also, if you enjoyed generating wisdom last time, or if you want the absolute best performance with any current Mac worker (stock, boog's, mine, or anyone else's), run fft_test3 and dump the resulting wisdom.sah file in the same folder with everything else.
thanks for your efforts, can't wait to kick some x86 butt. Again.

following good traditions from the past, I will try to make the wisdom files I generated for my two Macs available:
2x2.0 GHz G5 Mac OSX 10.4.6 1GB RAM (non dual core)
1x1.8 GHz G5 Mac OSX 10.4.2 512 MB RAM 900 MHz FSB (2003)

EDIT: ok, generation on the single 1.8 took ~30 minutes, which is way faster than fftw_test2

what version of BOINC am I supposed to use with your worker? still 5.2.13 just like before enhanced?

how do I make YOUR worker ..
.. reset?
.. update?
.. refuse new WUs?
.. force Benchmark?
.. display all possible commands? (old worker was: -help)
( Last edited by TiloProbst; May 27, 2006 at 09:56 AM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
May 27, 2006, 10:43 AM
 
Tilo, you seem to be confusing the SETI worker with the BOINC client. BTW, there's a Superbench build of the 5.4.9 CLI BOINC client available here (beta): http://members.dslextreme.com/~reade...boincbeta.html

Edit: Superbench no longer seems all that important when crunching SETI. The new FLOP-based credit calculation scheme employed in SETI Enhanced appears to claim identical credits w/ or w/o the Superbench client. So if you're crunching only SETI, you may just as well download the client of your choice from the BOINC site. 5.4.9 is worth having (a number of new features and many bugfixes), but 5.2.13 is still supported.


HTH,

Ron
( Last edited by halimedia; May 27, 2006 at 11:37 AM. )
     
gulliver
Fresh-Faced Recruit
Join Date: Nov 2005
Location: Europe
Status: Offline
May 27, 2006, 11:51 AM
 
Originally Posted by TiloProbst
concerning the renaming, Alex's quote from the Optimized thread:
"SETI Enhanced does something similar to what I did with alpha-5.2 and alpha-5.3, meaning that it loads and stores wisdom (in a file called wisdom.sah, rather than bigfft_wisdom), but it does this on-the-fly, and with less thorough (and therefore much less optimal) planning. Don't try anything like renaming your copies of bigfft_wisdom, as tempting as it might sound [...]"
and
Originally Posted by alexkan
Also, if you enjoyed generating wisdom last time, or if you want the absolute best performance with any current Mac worker (stock, boog's, mine, or anyone else's), run fft_test3 (http://tbp.berkeley.edu/~alexkan/seti/fft_test3.zip) and dump the resulting wisdom.sah file in the same folder with everything else.
Hi Tilo,

Unfortunately, this information does not help. I have a bigfft_wisdom file in /Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu and several wisdom.sah files in different slots under /Library/Application Support/BOINC Data/slots. So where do I put the newly generated wisdom.sah?
     
alexkan
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
May 27, 2006, 12:02 PM
 
gulliver:
You should put wisdom.sah in /Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu. The multiple wisdom.sah files in /Library/Application Support/BOINC Data/slots are the ones that SETI generates on the fly.
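For anyone who prefers to script that step, here is a minimal Python sketch of the copy. The directory is the one named above; the function name `install_wisdom` and the `.bak` backup of an existing file are assumptions made for this example, not something the worker or BOINC requires. Writing into /Library may need administrator rights.

```python
import shutil
from pathlib import Path

# Project directory named in the post above.
PROJECT_DIR = Path(
    "/Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu"
)


def install_wisdom(src, project_dir=PROJECT_DIR):
    """Copy a generated wisdom file into the project directory as wisdom.sah.

    Hypothetical helper: keeps a copy of any existing wisdom.sah as
    wisdom.sah.bak before overwriting it.
    """
    src = Path(src)
    project_dir = Path(project_dir)
    if not src.is_file():
        raise FileNotFoundError(f"no wisdom file at {src}")
    if not project_dir.is_dir():
        raise NotADirectoryError(f"project directory not found: {project_dir}")

    dest = project_dir / "wisdom.sah"
    if dest.exists():
        # Back up the previous file so it can be restored by hand.
        shutil.copy2(dest, dest.with_name("wisdom.sah.bak"))
    shutil.copy2(src, dest)
    return dest
```

Usage would be something like `install_wisdom("wisdom.sah")`, run from the folder where fft_test3 wrote its output. The per-task copies under the slots directory are left untouched.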

TiloProbst:
Use whatever BOINC client supports FLOP counting, since that's the fairest and most consistent way to assign credit. As for the other things you asked about, that sounds more like the domain of the BOINC client than of the SETI worker, so I can't help you there.

--

Having just finished profiling the worker on the reference work unit, I can tell you that there is a lot of room left for improvement on G4, and probably even more on G5. However, I don't know how long I'll have access to G5s now that I've graduated, so at some point I will (again) be calling on you to help me profile the worker so I know what to tweak.
     
gulliver
Fresh-Faced Recruit
Join Date: Nov 2005
Location: Europe
Status: Offline
May 27, 2006, 12:29 PM
 
Originally Posted by alexkan
gulliver:
You should put wisdom.sah in /Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu. The multiple wisdom.sah files in /Library/Application Support/BOINC Data/slots are the ones that SETI generates on the fly.

TiloProbst:
Use whatever BOINC client supports FLOP counting, since that's the fairest and most consistent way to assign credit. As for the other things you asked about, that sounds more like the domain of the BOINC client than of the SETI worker, so I can't help you there.

--

Having just finished profiling the worker on the reference work unit, I can tell you that there is a lot of room left for improvement on G4, and probably even more on G5. However, I don't know how long I'll have access to G5s now that I've graduated, so at some point I will (again) be calling on you to help me profile the worker so I know what to tweak.
alex,

Thanks! Congratulations on your graduation! I hope you can still optimize the G5 client.
     
Gecko_r7
Forum Regular
Join Date: Oct 2005
Location: Las Vegas, NV
Status: Offline
May 27, 2006, 03:15 PM
 
Originally Posted by alexkan
Having just finished profiling the worker on the reference work unit, I can tell you that there is a lot of room left for improvement on G4, and probably even more on G5. However, I don't know how long I'll have access to G5s now that I've graduated, so at some point I will (again) be calling on you to help me profile the worker so I know what to tweak.
Hi Alex: Congratulations on graduating! What is your degree in, if you don't mind me asking?
BTW, I installed the new wisdom.sah and am running V2 on the G4. Just for grins and giggles, I will run it against the reference WU. Excited to hear that you see lots of headroom for improvement with the G4.
It's been tough to see my Pentium M 1.6 (Banias) laptop crunch 2 units consecutively FASTER than my dual 1.33 can do 1 on each CPU.
     
alexkan
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
May 27, 2006, 04:50 PM
 
Originally Posted by Gecko_r7
Hi Alex: Congratulations on graduating! What is your degree in, if you don't mind me asking?
Electrical Engineering and Computer Science (it's really only one major). I spent more time on the CS side of things, though.

Originally Posted by Gecko_r7
BTW, I installed the new wisdom.sah and am running V2 on the G4. Just for grins and giggles, I will run it against the reference WU. Excited to hear that you see lots of headroom for improvement with the G4.
It's been tough to see my Pentium M 1.6 (Banias) laptop crunch 2 units consecutively FASTER than my dual 1.33 can do 1 on each CPU.
I'm looking forward to seeing how the benchmark numbers come out. It definitely saves me the trouble of timing the client myself, since that kind of slows down the pace of development. There are definitely more exciting things in the pipeline--hopefully you'll be able to see the results soon.
     
Knightrider
Dedicated MacNNer
Join Date: Sep 2004
Location: London
Status: Offline
May 27, 2006, 06:48 PM
 
A couple of initial validations HERE and HERE

Way hey, we're on our way.

Thanks guys.

K.
     
liebsmaschine
Registered User
Join Date: Jul 2006
Status: Offline
May 27, 2006, 07:11 PM
 
Originally Posted by alexkan
gulliver:
You should put wisdom.sah in /Library/Application Support/BOINC Data/projects/setiathome.berkeley.edu. The multiple wisdom.sah files in /Library/Application Support/BOINC Data/slots are the ones that SETI generates on the fly.
That answered my question, too. Thanks.

Originally Posted by jackal
Also, how much of a difference will it make to generate the wisdom file in single user mode (or at least with an absolute minimum of processes running, perhaps after a fresh restart) versus just doing it casually while using the computer? I know in the other thread that many were generating it from single-user mode, as it supposedly helped the benchmarking be more accurate. Is it worth that?
I finally got fft_test3 to produce a wisdom file. I could not get it to work in single-user mode (yes, I had made the drive mountable): it kept freezing after the 32768 step, even after I left it running all night. I then ran it in regular multiple-user mode after a fresh restart, logging directly into >console so the GUI didn't take any processing power or memory. That worked fine and took about 45 minutes on my 1.67 PowerBook 17". As for whether single-user or multiple-user mode makes a difference, here's what one step of the standard output from fft_test3 showed under the two conditions:

single-user mode: 32768 1123.474286 1233.618824 901.193539
multiple-user mode: 32768 1454.671908 2001.258370 1194.107900

I'm not exactly sure what these numbers mean, but assuming they're some measure of time for the fft test, single-user mode does make a difference for this test (although perhaps it doesn't make any difference for the SETI@home worker in the end). Unfortunately, I can't get fft_test3 to finish the test under single-user mode.

In any case, I'll post the wisdom.sah file I created. This was made on a "low-resolution" (second-to-latest revision) PowerBook 17" with 1GB of RAM and a 1.67GHz processor. I'll eventually come up with one for my older PowerMac G4 "Yikes" model with a 350MHz processor and post it here.

Wisdom file for PowerBook G4
     
alexkan
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
May 27, 2006, 07:42 PM
 
Originally Posted by jackal
I finally got fft_test3 to produce a wisdom file. I could not get it to work under single-user mode (yes, I had made the drive mountable)--it kept freezing after the 32768 step (even after leaving it running all night). I did it under regular multiple-user mode after a fresh restart and by logging directly into >console (so the GUI didn't take any processing power or memory). It worked fine (took about 45 minutes or so on my 1.67 PowerBook 17") In regards to whether single-user or multiple-user modes make a difference, here's what one step from the standard output from fft_test3 showed under the different conditions:

single-user mode: 32768 1123.474286 1233.618824 901.193539
multiple-user mode: 32768 1454.671908 2001.258370 1194.107900

I'm not exactly sure what these numbers mean, but assuming they're some measure of time for the fft test, single-user mode does make a difference for this test (although perhaps it doesn't make any difference for the SETI@home worker in the end). Unfortunately, I can't get fft_test3 to finish the test under single-user mode.
The numbers that fft_test3 prints are indications of speed, not time. (Specifically, they're hypothetical measures of megaFLOPS, given the 5N log N convention for the number of floating-point operations for FFTs with N elements.) In that regard, it looks like you should be running in multiple-user mode.
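As a rough illustration of that convention, here is a minimal Python sketch. It assumes the logarithm is base 2 (the usual FFT benchmarking convention) and that each figure describes a single transform of length N. The function names are made up for this example, and since the thread does not say what each of the three columns measures, only the first column of each 32768 line is used.

```python
import math


def nominal_flops(n):
    """Nominal operation count for one length-n FFT: 5 * N * log2(N)."""
    return 5.0 * n * math.log2(n)


def fft_mflops(n, seconds):
    """Nominal megaFLOPS for one length-n FFT that took `seconds`."""
    return nominal_flops(n) / seconds / 1e6


def seconds_per_fft(n, mflops):
    """Invert the convention: time implied by a reported megaFLOPS figure."""
    return nominal_flops(n) / (mflops * 1e6)


if __name__ == "__main__":
    n = 32768
    # First column of the two 32768 lines quoted above.
    reported = {"single-user": 1123.474286, "multiple-user": 1454.671908}
    for mode, mflops in reported.items():
        micros = seconds_per_fft(n, mflops) * 1e6
        print(f"{mode}: {mflops:.1f} MFLOPS ~ {micros:.0f} microseconds per FFT")
```

Because the figure is a rate, a larger number means less time per transform, which is why the multiple-user result is the better of the two.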

Here's version 3. Like version 2, it is not a finished project, but this time, I did cover some new ground, so hopefully my discoveries will bear fruit. Timed runs of the reference work unit are always appreciated, so please post them if you do them.

http://tbp.berkeley.edu/~alexkan/set...ced-ppc-v3.zip

I'll upload the source when I go somewhere where my Internet is faster. It occurred to me that these compiles are all generated with -mtune=G4, so perhaps G5s aren't getting code that's scheduled as well as it could be. My future tweaking will aim to fix this, but once the source is up, people are welcome to try recompiling. Look in analyzeFuncs.cpp and analyzePoT.cpp to see what I've changed this time.
     
 
All times are GMT -4.
All contents of these forums © 1995-2017 MacNN. All rights reserved.