MacNN Forums (http://forums.macnn.com/)

 
alexkan Aug 12, 2005 03:02 AM
New Altivec-enhanced Seti worker in need of testing
Hi all,

Those of you who are subscribed to the boinc_opt mailing list may have seen a posting in late July about a new SETI@home worker for OS X that uses Apple's vDSP libraries. I've been helping the original author with some additional optimizations to the other "workhorse" functions, and we don't really have the resources (i.e. computing power) to do a lot of testing, so we'd like to see if people would be willing to try the new client as a replacement SETI worker and report back to us regarding whether or not results are validating.

The reason that this client needs more testing is that it incorporates more aggressive optimizations than other clients we've looked at. For one thing, we're using different FFT code (although some Windows compiles have replaced the standard Ooura FFT with code from the FFTW3 library), and we're using a different (and extremely fast) method to calculate chirped data. It seems to us that the amount of change introduced by our optimizations should be within the validation limits, but we need more real-world numbers to be sure.

For reference, I'm currently getting times of about 20500 seconds for the reference work unit on my Cube (G4 450). I've still got a couple tricks left to use either to speed up the code or improve its accuracy, and your feedback will help us to decide which of those ends we should be pursuing. Any help the members of this forum could offer would be greatly appreciated.

Thanks! :)


Alex

We have a website now!
http://writhe.org.uk/seti@home/

The most up-to-date version of the client for Tiger is: alpha-5/alpha-5.2. G5s have a newer optimized client than G4s, since the changes for alpha-5.2 actually hurt the G4s. Additionally, G5 owners will need to place a bigfft_wisdom file in the same directory as the worker to achieve optimal performance.

Note: the website is (currently) one version behind on the Tiger releases, at least until Rick updates it. The new clients can be found here:

G4: http://inst.eecs.berkeley.edu/~alexk...home-G4-a5.zip
G5: http://inst.eecs.berkeley.edu/~alexk...lpha-52-g5.tgz (also see E.T.'s post on page 15, as well as the surrounding posts, to find the bigfft_wisdom file you need)

The most up-to-date version of the client for Panther is: alpha-4. These clients can be found on the website.
 
beadman Aug 12, 2005 09:42 AM
Alexkan:

Thanks for the info. I'll give it a try when I get home tonight.

beadman
 
Todd Madson Aug 12, 2005 03:58 PM
What was the amount of seconds used to process the reference work unit prior to having this
version of the worker? Is it a significant improvement over the conventional worker?

Please keep me posted on the G5 optimization too. Thanks.
 
alexkan Aug 12, 2005 04:20 PM
Quote, Originally Posted by Todd Madson
What was the amount of seconds used to process the reference work unit prior to having this
version of the worker? Is it a significant improvement over the conventional worker?

Please keep me posted on the G5 optimization too. Thanks.
At an earlier stage in the optimization process on the same client, with only the vDSP FFT functions in place (i.e. no optimized chirp function), my Cube was taking 9+ hours of processing to finish the reference work unit. I've run javalizard's client for comparison purposes only once, but I seem to recall it taking even longer. I can't afford to burn that much CPU time just to do speed tests because my computer is so slow--I usually run the worker only to test the accuracy of a new optimization, or to search for the next bottleneck in the code to optimize away. With what information I have, though, it seems like if this worker is accurate enough to pass validation, it will be a significant improvement over the conventional worker and even over javalizard's, which seems to be the worker of choice these days.

I don't actually have access to a G5 at the moment, so whatever re-optimization I do there should probably be tested by actual machines at some point. If any G5 owners running Tiger have the developer tools and CHUD installed, it would be nice if I could get Shark traces of a couple benchmarks that I've been running so I can figure out how best to rewrite the functions for the G5.
 
Shaktai Aug 12, 2005 06:28 PM
Okay, here are some preliminary results.

iMac G5 1.6 ghz w/ OS 10.4.2 (Tiger)
Average with 12 unit sample running Java Lizards app. 12468 seconds or 3.46 hours.

Using G4 7450/7455 on G5 average time over first 4 units is 8100 or 2.25 hours.
No obvious problems but still waiting for validation.

I'm also going to try it with the G4 7400/7410 app. I'll update when I have something.

This app is fast even running the G4 version on G5. If it validates, it will be very sweet.
 
alexkan Aug 12, 2005 06:47 PM
Quote, Originally Posted by Shaktai
Okay, here are some preliminary results.

iMac G5 1.6 ghz w/ OS 10.4.2 (Tiger)
Average with 12 unit sample running Java Lizards app. 12468 seconds or 3.46 hours.

Using G4 7450/7455 on G5 average time over first 4 units is 8100 or 2.25 hours.
No obvious problems but still waiting for validation.

I'm also going to try it with the G4 7400/7410 app. I'll update when I have something.

This app is fast even running the G4 version on G5. If it validates, it will be very sweet.
I wouldn't be surprised if the 7400/7410 compile works a little better on G5 than the 7450/7455 compile, since the G5's Altivec scheduling is more like the original G4's than the G4e's. Either way, the prefetching in chirp will be totally borked on G5 (10-cycle stall on the vec_dst call, plus the dcbz's will be more frequent and less effective than they need to be), so I'll try to get the G5 version out for you guys as soon as possible.

Those numbers look pretty encouraging. Then again, making sure the results validate is the most important thing, so let us know how that goes ASAP.
 
beadman Aug 12, 2005 10:40 PM
I tried the new SETI optimization on my PowerBook G4 1 GHz 1 GB RAM, running BOINC 4.44 SuperBench and the new SETI as CLI. Unfortunately for me, after I installed it this morning, I wasn't able to check the computer until just now. Every time BOINC tried to run SETI, it bombed. An example Terminal log entry:

2005-08-12 17:42:08 [SETI@home] Starting result 11no03aa.3961.8673.1022148.156_2 using setiathome version 4.18
2005-08-12 17:42:10 [SETI@home] Unrecoverable error for result 11no03aa.3961.8673.1022148.156_2 (process got signal 5)
2005-08-12 17:42:10 [SETI@home] Unrecoverable error for result 11no03aa.3961.8673.1022148.156_2 (process got signal 5)
2005-08-12 17:42:10 [---] request_reschedule_cpus: process exited
2005-08-12 17:42:10 [SETI@home] Deferring communication with project for 59 seconds
2005-08-12 17:42:10 [SETI@home] Deferring communication with project for 59 seconds
2005-08-12 17:42:10 [SETI@home] Computation for result 11no03aa.3961.8673.1022148.156_2 finished

BOINC tried with seven WU and all failed. I quit BOINC, removed the new SETI, and put the Javalizard version back in.

If you want more info than this, please don't hesitate to ask.

beadman :(
 
Shaktai Aug 12, 2005 11:49 PM
Wow, Beadman. That is weird. Can you give a little more info on your setup and OS? Did you actually "replace" the app_info.xml file?

I ran and completed 4 units with the 7450 client without problem and that was on a G5. I have completed 1 with the 7400 client and one is nearing completion. So far no errors. The first 7400 unit was faster still, but that could have just been a fast work unit. I'll know more by morning. Oh, and they were all using mikkyo's optimized BOINC clients as well.

Of course, no validation yet either. The app is so fast that other machines aren't keeping up. JavaLizards was fast. These new apps are much faster still. The only question now is "will they validate".

If I understand Alexkan correctly, a G5 specific unit might be faster still. Kind of mind boggling. Let's hope these apps can validate. Speed is great, but science comes first.

Imagine, Team MacNN first in Predictor and top 10 in Einstein and SETI. With optimizations like these, it could happen.
 
alexkan Aug 13, 2005 12:27 AM
Quote, Originally Posted by beadman
BOINC tried with seven WU and all failed. I quit BOINC, removed the new SETI, and put the Javalizard version back in.

If you want more info than this, please don't hesitate to ask.

beadman :(
beadman, what version of OS X are you running? I know Apple updated the vDSP libraries with some additional functions when 10.4 came out, and the binaries are compiled with GCC 4, so the only platform that I can guarantee is working is Tiger. I know Rick (the other developer) wrote fallback functions for the new vDSP functions, and there ought to be a way to change the compile options so that older OS X versions can run these compiles as well, but I'm still getting the hang of the BOINC/SETI build system.

Incidentally, I'm not actually a BOINC participant at this time (hence part of the difficulty in verifying the accuracy of the client), so how long does it usually take for results to go through the validation process?
 
Shaktai Aug 13, 2005 05:12 AM
Quote, Originally Posted by alexkan
Incidentally, I'm not actually a BOINC participant at this time (hence part of the difficulty in verifying the accuracy of the client), so how long does it usually take for results to go through the validation process?
The deadline is 14 days. Work is downloaded in batches for a queue of up to 10 days, as set by the participant. Those on broadband tend to set small queues; those on dial-up may set a queue of several days. Each work unit is sent to 4 people, and 3 of the 4 must return results that agree (a quorum). The credit actually awarded is then the middle score of the first three to agree, and so on. Of my already completed units, about 70% have received back results from 1 other participant. Looks like we may know something before the weekend is out.

Short answer? From a few hours to several days but usually within 10 days if everyone returns their assigned work.
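
To make the quorum rule concrete, here is a rough Python sketch of the process as described above. It is not BOINC's actual validator: the function name, the representation of a result as an (answer, claimed credit) pair, and the use of plain equality to decide whether two answers "agree" are all assumptions made for illustration.

```python
from statistics import median

QUORUM = 3  # per the description above: 3 of the 4 copies must agree


def granted_credit(results):
    """Return the granted credit, or None if no quorum has been reached.

    `results` is a list of (answer, claimed_credit) pairs in the order they
    were returned. Comparing answers with == is a stand-in for the real
    validator's comparison, which is not described in this thread.
    """
    by_answer = {}
    for answer, claimed in results:
        agreeing = by_answer.setdefault(answer, [])
        agreeing.append(claimed)
        if len(agreeing) == QUORUM:
            # Middle claimed credit of the first three results to agree.
            return median(agreeing)
    return None


# Hypothetical work unit: three hosts agree on "A", one returns "B".
returned = [("A", 30.0), ("A", 22.5), ("B", 40.0), ("A", 25.0)]
print(granted_credit(returned))
```

With only two results back, the function returns None, which corresponds to the "pending" state people are seeing on their results pages.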

TIMES UPDATE: On iMac G5 with 10.4.2
7450 app: 7904 sec to 8108 sec (4 units)
7400 app: 6462 sec to 8011 sec. (4 units)
JavaLizard's optimized app: 9795 to 12874 sec, with most over 12,000 sec. As I recall, JavaLizard's app was about 40-50% faster than the unoptimized app.

Results page for my iMac G5. Click in the column for "Work Unit ID" to see how it compares against other work units that are returned.
http://setiathome.berkeley.edu/resul...hostid=1287902

I'll let this run exclusively on SETI for about 24 hours, then my computer will go back to sharing between SETI 25% and Einstein 75% (which has also developed a Mac-optimized app).
 
alexkan Aug 13, 2005 01:47 PM
G5 compiles are available!
OK, I've got a compile that's been made specifically for G5s now. The source for this one is slightly different from the other two compiles, so please let me know if there are any regressions.

My first post should now be edited to include this new compile for those who are new to this thread.
 
Knightrider Aug 13, 2005 03:40 PM
Quote, Originally Posted by alexkan
OK, I've got a compile that's been made specifically for G5s now. The source for this one is slightly different from the other two compiles, so please let me know if there are any regressions.

My first post should now be edited to include this new compile for those who are new to this thread.
Hi Alex,

I downloaded the file ok - but need to know how to run it please. Not an expert on Mac yet.

I am running Tiger (osx 10.4.2) on a G5 dual 2.0 ghz with 2.5 gig ddr sdram

Thanks - K
 
Shaktai Aug 13, 2005 04:10 PM
Quote, Originally Posted by Knightrider
Hi Alex,

I downloaded the file ok - but need to know how to run it please. Not an expert on Mac yet.

I am running Tiger (osx 10.4.2) on a G5 dual 2.0 ghz with 2.5 gig ddr sdram

Thanks - K

Thanks Alexkan. Trying the G5 version now.

Knightrider, installation is easy; the steps depend on whether you are using the GUI client or the CLI terminal client.

For the CLI, open your CLI directory folder and look for the projects folder.
Open projects folder
Open SETI@Home folder
Place both the app and the app_info.xml file in the Seti@Home folder. You can leave the current app version there, no need to remove it.

For the GUI clients
Go to Hard Drive > Library > Application Support > BOINC Data, then do the same as above.
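
For those who would rather script the copy step, here is a minimal Python sketch of it. The function name and all three parameters are hypothetical; pass in whatever your own download folder, SETI project folder, and worker file are actually called. It only copies the worker and app_info.xml into place and leaves any existing app version alone, as described above.

```python
import shutil
from pathlib import Path


def install_worker(download_dir, seti_project_dir, worker_name):
    """Copy the worker binary and app_info.xml into the SETI project folder.

    All three arguments are supplied by the caller; nothing here guesses
    where BOINC keeps its data on a particular machine.
    """
    download_dir = Path(download_dir)
    seti_project_dir = Path(seti_project_dir)
    if not seti_project_dir.is_dir():
        raise FileNotFoundError(f"no such project folder: {seti_project_dir}")
    copied = []
    for name in (worker_name, "app_info.xml"):
        target = seti_project_dir / name
        # copy2 preserves file permissions, including the executable bit.
        shutil.copy2(download_dir / name, target)
        copied.append(target)
    return copied
```

Quit BOINC before copying and restart it afterwards so it picks up the new app_info.xml.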
 
beadman Aug 13, 2005 06:08 PM
Quote, Originally Posted by beadman
I tried the new SETI optimization on my PowerBook G4 1 GHz 1 GB RAM, running BOINC 4.44 SuperBench and the new SETI as CLI. ...
Sorry, forgot about the OS. I'm running OS X 10.3.9 on the PowerBook. I also run Einstein and CPDN on that one. Ratios at the time were Einstein/Seti/CPDN 100/100/10, switching every hour, remain in memory, connect time .0000005 days (1/2 second - set this way as E@H .11 ran better).

[edit]
The URL for that computer's results is http://setiathome.berkeley.edu/results.php?hostid=280526
[/edit]

beadman
 
Shaktai Aug 13, 2005 06:46 PM
Beadman, looks like they all got the exact same error messages, so it's not working with 10.3.x yet. This should be helpful information.

core_client_version 4.44 /core_client_version
message process got signal 5

stderr_txt
dyld: setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 Undefined symbols:
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _statvfs expected to be defined in /usr/lib/libSystem.B.dylib
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_maxvi expected to be defined in Accelerate
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_sve expected to be defined in Accelerate
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_vmma expected to be defined in Accelerate
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_vsdiv expected to be defined in Accelerate

Well, it didn't work for you this time, but that information should help the developers. That is what they needed.
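
If anyone else hits this on an older OS, a quick way to summarize the stderr output is to group the missing symbols by the library dyld expected to find them in. The Python sketch below does that for the log above; the function name and the regular expression are my own, written against the message format shown in this post only.

```python
import re
from collections import defaultdict

# Matches lines like:
#   "... undefined reference to _vDSP_sve expected to be defined in Accelerate"
PATTERN = re.compile(
    r"undefined reference to (\S+) expected to be defined in (\S+)"
)


def missing_symbols(stderr_text):
    """Group undefined symbols by the library dyld expected them in."""
    grouped = defaultdict(list)
    for line in stderr_text.splitlines():
        match = PATTERN.search(line)
        if match:
            symbol, library = match.groups()
            grouped[library].append(symbol)
    return dict(grouped)


log = """\
dyld: setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 Undefined symbols:
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _statvfs expected to be defined in /usr/lib/libSystem.B.dylib
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_maxvi expected to be defined in Accelerate
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_sve expected to be defined in Accelerate
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_vmma expected to be defined in Accelerate
setiathome-4.18.ppc7450-apple-darwin-vdsp-alpha-1 undefined reference to _vDSP_vsdiv expected to be defined in Accelerate
"""

found = missing_symbols(log)
for library, symbols in found.items():
    print(library, symbols)
```

For this log it shows one symbol missing from libSystem and four vDSP functions missing from Accelerate.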
 
alexkan Aug 13, 2005 07:25 PM
Yep, just as I figured, issues with vDSP backwards compatibility. I'm pretty sure we have drop-in replacements (non-Altivec, unfortunately) for those functions, but I doubt they get compiled in with my current configuration.

The best way to get around this will probably be to compile your own version from the current source code. Our codebase is still a bit messy at this point, so I don't think we're ready to release it publicly. However, those who are interested can PM me for a copy, especially if they're willing to provide 10.3-compatible binaries for other testers.
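
For anyone curious what such drop-in replacements have to compute, here is a plain scalar sketch in Python of the four vDSP routines named in the error log. This is not the fallback code from our source tree; it reflects my reading of Apple's vDSP documentation, it assumes unit stride (the real routines take stride arguments), and the short Python function names are made up for illustration.

```python
def sve(a):
    """Sum of the elements, as vDSP_sve computes."""
    total = 0.0
    for x in a:
        total += x
    return total


def vsdiv(a, b):
    """Divide every element by the scalar b, as vDSP_vsdiv computes."""
    return [x / b for x in a]


def maxvi(a):
    """Maximum value and its index, as vDSP_maxvi computes.

    `a` must be non-empty. On ties this sketch keeps the first maximum;
    that tie-breaking choice is an assumption.
    """
    best = 0
    for i in range(1, len(a)):
        if a[i] > a[best]:
            best = i
    return a[best], best


def vmma(a, b, c, d):
    """Elementwise a*b + c*d, as vDSP_vmma computes."""
    return [w * x + y * z for w, x, y, z in zip(a, b, c, d)]
```

Scalar loops like these are useful mainly as a reference to check an optimized version against, which matters here because validation depends on the results staying within tolerance.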
 
Snake_doctor Aug 13, 2005 09:05 PM
Quote, Originally Posted by alexkan
... However, those who are interested can PM me for a copy, especially if they're willing to provide 10.3-compatible binaries for other testers.
Just sent you a PM on this. I am willing to help if you can help me get started.

Regards
Phil
 
Shaktai Aug 14, 2005 07:51 PM
Well, the SETI validators have been down all weekend. Out of the first 15, 7 have reached quorum but not confirmed yet as valid because the validators on the server side aren't working. Once they get the Validators working at Berkeley again, then we ought to know pretty quick if the client is working well or not.

Still doesn't play nice with Einstein though, but that is a libraries problem that is known and being worked on. (Fixed on the Einstein side, just needs to be fixed on the SETI side.) None of the Mac SETI apps play nice with Einstein or certain other projects, so it is not unique to your optimized compiles.

Have to wait a couple more days for results most likely. Average speed over 15 work units is 7422.5 seconds. That puts it around 5,000 seconds faster than JavaLizard's app on average. Should make it at least 3 times faster than the stock app on my G5. Sure hope they validate; that would be a very sweet improvement.
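
For anyone who wants to redo the arithmetic with their own numbers, here is a small Python sketch. It uses only two figures from this thread, the 12,468-second JavaLizard average from my first post and the 7,422.5-second average above; the helper names are made up.

```python
def hours(seconds):
    """Convert CPU seconds to hours."""
    return seconds / 3600.0


def speedup(old_seconds, new_seconds):
    """How many times faster the new time is than the old one."""
    return old_seconds / new_seconds


javalizard_avg = 12468.0  # average over 12 units with JavaLizard's app
alpha_avg = 7422.5        # average over 15 units with the new worker

saved = javalizard_avg - alpha_avg
ratio = speedup(javalizard_avg, alpha_avg)

print(f"saved per unit: {saved:.1f} s")
print(f"JavaLizard: {hours(javalizard_avg):.2f} h, alpha: {hours(alpha_avg):.2f} h")
print(f"speedup over JavaLizard: {ratio:.2f}x")
```

Keep in mind that work units vary in size, so averages over small samples can move around quite a bit.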
 
Snake_doctor Aug 15, 2005 12:08 AM
Just loaded the test app on a 2 GHz G5 iMac. The first WU is not done yet, but it looks like it is headed for about 2 hours for a single WU. On my dual 1.5 GHz G4, the MacNN optimized app runs in about 5 hours per WU. The stock SETI app takes a little over 6 hours. On the PowerBook the MacNN version takes about 7 hours, but the stock app takes about 8 1/2. So this test cuts the best existing time roughly in half.

The G5 is only running SETI, so by morning it should produce about 4 or 5 WUs. I like others here hope they validate.

Unfortunately both G4s I have access to are running 10.3.9, or I would load it on both of them. :brick:

Regards
Phil
 
alexkan Aug 15, 2005 01:50 AM
An update on backwards compatibility
Well, I may have to ask those of you with machines running 10.3.9 and earlier to be a little more patient. Rick has been looking at the build system, and it looks like it may be easier for us to get the worker compiling with Xcode 2.1 first, in order to really do the backwards compatibility thing right. I will still try to send out the source to those that request it, but I want to make sure you all know that I'm not sure the code even compiles on 10.3.9 as it stands right now.

On an unrelated note, I might be able to speed up the Gaussian-fitting portion of the code a bit, if what I'm investigating can really work. (Of course, this only matters if the current implementation still passes validation.) Stay tuned...
 
Knightrider Aug 15, 2005 03:53 AM
Hi all,

OK. I am running Tiger (osx 10.4.2) on a G5 dual 2.0 ghz with 2.5 gig ddr sdram. BOINC Manager gui v 4.43.

The new compile has settled down and seems to be running fine now, after early worries that the times were not flagging up correctly. It is now consistently doing a WU in 2 hrs 14 min 45 sec, +/- a few seconds.

@ Alex - I sent you a PM with info and have a screen shot available if you want it.

Validation ? sure it will be :cool:

K.
 
Karl Schimanek Aug 15, 2005 11:43 AM
Hi folks!

Here are new results of mine: BOINC SETI Worker compiles provided by javalizard

41,522.55 seconds = 11.5h
41,083.40 seconds = 11.4h

Alpha-Version from Alex:

18,981.84 seconds = 5.27h
18,808.42 seconds = 5.22h
10,650.85 seconds = 2.96h
12,466.55 seconds = 3.5h
18,783.11 seconds = 5.22h

The Alpha-Version is twice as fast :cool:
My config:
Power Mac G4 Quicksilver 733MHz (MPC7450), 256kb L2C, w/o L3C.
640MB RAM, 40GB HD, Mac OS X v10.4.2 (8C46).

I use the BOINC Manager 4.43 with CLI 4.44 superbench client
Measured floating point speed 1082.46 million ops/sec
Measured integer speed 2994.87 million ops/sec

Alex, I saw your post on the ArsForum. If you need some help, you should ask Hobold and maybe BadAndy.
And here a thread of interest: Altivec Optimizations and other PPC Performance Tips

Great work, thank you very much!
Karl
 
Snake_doctor Aug 15, 2005 04:45 PM
These are the CPU seconds reported for each of the 5 WU turned in so far from the 2Ghz G5 IMac -

10443.78
10924.79
9316.01
11135.24
9133.52

The best times I have seen on the 1.4Ghz G4 Dual (running 10.3.9) with the MacNN app are at least twice these times and usually more like 3 times as long. On the G4 it usually takes about 5 hours to complete 1 WU.

Unfortunately the Validation is currently looking at my outputs from 8/5/05, so it will be some time before I have any validation results. There are about 40 results to validate before it even gets to these. But I have seen no hangs or other bad behavior so far. It is running all by itself so there is not really much trouble it can get into.

the complete results list can be found here - http://setiathome.berkeley.edu/resul...hostid=1297955

Regards
Phil
 
mikkyo Aug 15, 2005 06:04 PM
Quote, Originally Posted by alexkan
Well, I may have to ask those of you with machines running 10.3.9 and earlier to be a little more patient. Rick has been looking at the build system, and it looks like it may be easier for us to get the worker compiling with Xcode 2.1 first, in order to really do the backwards compatibility thing right. I will still try to send out the source to those that request it, but I want to make sure you all know that I'm not sure the code even compiles on 10.3.9 as it stands right now.

On an unrelated note, I might be able to speed up the Gaussian-fitting portion of the code a bit, if what I'm investigating can really work. (Of course, this only matters if the current implementation still passes validation.) Stay tuned...
In order to run gcc 4.0 compiled binaries you need OS X 10.3.9 or 10.4.x.
In order for the binaries to run on 10.3.9, you need to link with the 10.3.9 SDK version of the dylibs.
That is probably all you need to do to get your workers running on 10.3.9.
 
alexkan Aug 15, 2005 10:44 PM
Quote, Originally Posted by mikkyo
In order to run gcc 4.0 compiled binaries you need OS X 10.3.9 or 10.4.x.
In order for the binaries to run on 10.3.9, you need to link with the 10.3.9 SDK version of the dylibs.
That is probably all you need to do to get your workers running on 10.3.9.
Well, Rick's sent me an Xcode project that I can use to compile binaries linked against the 10.3.9 SDK. I'll try to get some compiles up once Xcode 2.1 finishes downloading. Be warned, however, that the 10.3.9 compiles will be slower, since we probably can't write vDSP replacement functions that are as well-tweaked as the originals. :p

Shaktai, it seems like your times are a bit faster than the other G5 users' times, even though you're running a slightly slower processor. Is there something in particular that you're doing that the others might not have done?
 
Shaktai Aug 15, 2005 11:34 PM
Quote, Originally Posted by alexkan
Shaktai, it seems like your times are a bit faster than the other G5 users' times, even though you're running a slightly slower processor. Is there something in particular that you're doing that the others might not have done?
Could just be the luck of the draw, or maybe I'm not using it for other things at the same time as much. Here is a breakdown of high and low for each client though, if that helps. Might have just gotten some noisy work units.

7450 client - 4 units - 7904 to 8108 -- 8051 average.
7400 client - 4 units - 6462 to 8011 -- 7312 average
G5 client - 12 units - 4838 to 9430 with one oddball that was only 1699.
---- 6907 average if we drop out the oddball.

Guess we'll have to wait for validation to see if there is anything weird going on. The sample for the G4 clients is really too small to get a good idea. Need about 24 of each to get a good sample in order to take the noisy units out of the equation. Maybe once they get validation up and working, I'll give each app another try if we know they validate.
 
Snake_doctor Aug 16, 2005 10:30 AM
Quote, Originally Posted by Shaktai
...Need about 24 of each to get a good sample in order to take the noisy units out of the equation...
Shaktai,

Here are 14 more from a 2 Ghz G5 iMac running OS 10.4.2 to analyze.

http://setiathome.berkeley.edu/resul...hostid=1297955

I should be able to add about 6-8 more to this list by tonight. We got this computer as a gift for someone and we are burning it in and installing software, so the credits are not really an issue. We have been running this S@H beta for about 2 days full time on this system. To be fair to the system, we are also doing some other things at the same time, which may skew the times a bit, but they would represent the times that a normal user might see on a multi-use system.

We are adding a new reported WU to this list about every 3 hours or so.

Regards
Phil
 
Welnic Aug 16, 2005 10:50 AM
When the time listed is cpu time, then it doesn't matter if you were doing anything else with the computer. If it says 3600 seconds that means that the computer worked on the wu for an hour. If it wasn't doing other things that would take about 1 hour and 30 seconds in real time. If it is a one processor machine doing other stuff then it could take many hours to actually do the job, but the cpu time would be the same.
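
The difference is easy to see on any machine. The Python sketch below times the same bit of work with a CPU-time clock and a wall clock, using a sleep to stand in for the periods when the processor is busy with something else; the function names and the 0.2-second idle period are arbitrary choices for the demonstration.

```python
import time


def measure(work, idle_seconds):
    """Run `work()`, then sleep, and return (cpu_seconds, wall_seconds)."""
    cpu_start = time.process_time()   # CPU time charged to this process
    wall_start = time.perf_counter()  # elapsed real time
    work()
    # Sleeping uses no CPU, much like waiting while other programs run.
    time.sleep(idle_seconds)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


def busy_loop():
    """A small amount of real computation."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


cpu, wall = measure(busy_loop, idle_seconds=0.2)
print(f"cpu={cpu:.3f}s wall={wall:.3f}s")
```

The wall-clock figure includes the idle period while the CPU figure does not, which is why the CPU seconds on the results pages are comparable between a dedicated cruncher and a machine that is also being used for other things.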
 
alexkan Aug 16, 2005 11:24 AM
OK, 10.3.9-compatible client is available (but not really tested) . It's optimized for G4 in general (no specific mtune for either of the two types). Let me know what you guys get.

Also, if you're using the G5, I realize that I totally screwed up the prefetching. I figure prefetching shouldn't affect the results, but just to be on the safe side, use the 7400 version for now until I fix it. (Man, I wish I had my own G5 to test my code on.)
 
Snake_doctor Aug 16, 2005 12:02 PM
Quote, Originally Posted by alexkan
OK, 10.3.9-compatible client is available (but not really tested) . It's optimized for G4 in general (no specific mtune for either of the two types). Let me know what you guys get.
Thanks Alex, I'll load it up when my queue empties in about an hour and take it for a spin 'round the block on the PowerBook G4 first. If it makes it through the first WU ok then I will try it on the G4 Dual.

The Validators are only progressing about 2 WUs per day through my list so it looks like it may be a few weeks till we have validation results.

Thank you for your efforts

Regards
Phil
 
Snake_doctor Aug 16, 2005 01:45 PM
Quote, Originally Posted by Welnic
When the time listed is cpu time, then it doesn't matter if you were doing anything else with the computer.
You are of course correct. That was pretty stupid of me. :lol:
 
Snake_doctor Aug 16, 2005 03:04 PM
Well, the 10.3.9 app is running on the PowerBook. It seems to be working OK. We should have a result in about 3 hours.

Regards
Phil
 
Shaktai Aug 16, 2005 07:51 PM
Quote, Originally Posted by alexkan
Also, if you're using the G5, I realize that I totally screwed up the prefetching. I figure prefetching shouldn't affect the results, but just to be on the safe side, use the 7400 version for now until I fix it. (Man, I wish I had my own G5 to test my code on.)
Well, I might just have to do that. It will be interesting to see what the average time is across a larger sample.
 
Snake_doctor Aug 16, 2005 10:01 PM
Ok, the first result is finished on the PowerBook running 10.3.9. Either you guys have performed some real magic or this was a small WU. The time went from over 5 hours to 3 hours. This is OUTSTANDING if it holds true over the average. The G4 Dual has been converted to the new APP (also running 10.3.9) but as yet it has not received any WUs. Right now it is working off some debt on other projects. I would actually expect it to be about 1/3 faster than the powerbook, but we will see. :thumbsup:

The Powerbook WU is -16my04ab.4737.112.822138.250_2
The result ID is - 101829586
Computer number - 1200967
time - 11,054.97

http://setiathome.berkeley.edu/resul...ltid=101829586

When it validates (in about 9 months) I'll let you know the outcome.

Regards
Phil
 
Snake_doctor Aug 17, 2005 07:24 AM
Quote, Originally Posted by alexkan
OK, 10.3.9-compatible client is available (but not really tested) . It's optimized for G4 in general (no specific mtune for either of the two types). Let me know what you guys get.

Also, if you're using the G5, I realize that I totally screwed up the prefetching. I figure prefetching shouldn't affect the results, but just to be on the safe side, use the 7400 version for now until I fix it. (Man, I wish I had my own G5 to test my code on.)
Just something else to think about. Bruce Allen one of the guys working on BOINC and Einstein@Home says that there are some issues with projects that have not been compiled using the very latest version of the BOINC API library (August 12, 2005 or later). I do not know where you can get this but I do know that there have been some recent changes to the API and that after the changes newly compiled APPs became a lot more friendly, and were able to work and play well with others. If you are going to be recompiling anyway, you might want to try to get the new API. A number of the clashes between S@H and other projects under BOINC could go away if you can do this.

Just a thought.

Regards
Phil
 
alexkan Aug 17, 2005 06:50 PM
Well, I'm not sure how much recompiling there's going to be for now. Rick is away until early next week, and since the SETI validation results aren't back yet, I'm a bit apprehensive about releasing any other optimizations for public testing until I'm sure that we haven't gone too far already, not to mention that it would be nice if at the end of the day, there was still some way to get this code folded back into the official SETI client so that other Mac users can benefit from these optimizations.

I might bring back the G5 version in the next couple of days, though, perhaps with all the prefetching instructions removed entirely. I imagine that the G5's prefetch engine shouldn't have any problem picking up our data access pattern, especially if it doesn't have any misleading (or just plain wrong) prefetch hints getting in its way.
 
Snake_doctor Aug 17, 2005 11:04 PM
Well what you have done so far is very fast by comparison to the release version. I wish the validators were moving faster so we could get an answer. Still good work on your part.

Regards
Phil
 
xplatform Aug 17, 2005 11:49 PM
success
http://setiathome.berkeley.edu/resul...hostid=1306287

Note that result 102055889 (the earliest one) I did with the standard-issue S@H application, for comparison purposes

Results after that (above it in the Web listing) are using the alpha-test from this page. The optimized version seems to be averaging ~11,000 seconds, about 60% of the CPU time of the non-optimized version.

Machine info:
http://setiathome.berkeley.edu/show_...hostid=1306287
(Mac mini)

Using standard-issue BOINC 4.43 GUI. This machine is set to "Run based on preferences" and is attached to Einstein@H (beta-test 0.12 application) and S@H.
 
Todd Madson Aug 18, 2005 04:19 PM
Wow...
Okay folks, I'm running a G5 dual 2.5 ghz machine and have been running Seti Classic
forever now.

I want to test this out.

Is there any chance I can create an account without killing my old seti classic
account since I'm still working on that particular challenge (I want to run it until
the end on at least a few of my machines).

How do I go about setting up boinc for a CLI interface window to see the performance?

I've not run the boinc client or even know how to set it up versus the previous cli version.

Thanks.
 
Shaktai Aug 18, 2005 09:10 PM
Quote, Originally Posted by Todd Madson
Is there any chance I can create an account without killing my old seti classic
account since I'm still working on that particular challenge (I want to run it until
the end on at least a few of my machines).
.
Absolutely you can have a BOINC account without killing your SETI Classic account. Some time back you probably got a key by e-mail. If so you can use that and it will link to your SETI stats as of May, and your final stats when they close down classic. They will show on your account page, but it won't stop you from running classic.

If you don't have that key, then you will probably have to set up a new account, but I can't be sure if it will link. Make sure you use the same e-mail for both, and it might. Since you have a dualie, you could actually run an instance of each on the same computer, but make sure you have your BOINC SETI account set up to only run one instance (1 CPU) on your dualie. You can run CLI or GUI, whichever suits your fancy. Right now we're all waiting for SETI BOINC to fix the validation problem so that folks get credit, but as long as the work is in on time, you'll eventually get credit. First thing is to get an active account and download the version of BOINC you want (CLI or GUI) from the SETI BOINC download page. Let it run benchmarks, then download mikkyo's optimized clients. Before attaching to SETI, make sure your cache is set very small, maybe .1 or .2 days. Once you are attached, quit and then install the optimized SETI app. You'll have to re-run your benchmarks (which it might do automatically). From then on, you will be good to go. Set up your account first, and then if you have problems we can help you through it.

Once you're set up, you can always add more computers later. Just don't lose the key.
 
Snake_doctor Aug 20, 2005 10:29 AM
Well, it looks like you may have some success. I have my first validated result from the latest version of your software for the G5. All of the results from this machine have been processed using your app.

http://setiathome.berkeley.edu/resul...ltid=103164679

Still waiting for the first one from the OS 10.3.9 version.

Good work guys.

Regards
Phil
 
Shaktai Aug 20, 2005 02:59 PM
Ditto. Here as well, my first validated unit is from the G5 app. 38 others are still pending, but it looks like we'll start to see more results soon.

FYI: I went back to the G4 7400 client at your suggestion, and it is averaging much faster times than the G5 client was on my G5 iMac. With a sample of 19 completed work units (much larger than the previous 4) I am averaging 5810.3 seconds per work unit, or about 1.6 hours. That is incredible for a 1.6 GHz G5 iMac. I hope that those units validate. If so, that would mean more than a 2x improvement over JavaLizard's optimized app and 3-4x faster than the stock app. You have succeeded in pulling every bit of the Altivec potential out of these clients.

I don't know how you could pull more than that out of it, but if you think you can with the G5 go for it if more units validate.
 
Snake_doctor Aug 20, 2005 03:33 PM
OK, so I know this might be counterproductive to processing speed, but I just have to ask. Is there a possibility that you might put the display code back in the app so folks who like that sort of thing could use it? I would think that as long as the "Show Graphics" option is not turned on, it would not slow things down much.

Just hoping

Phil
:D
 
Bill Michael Aug 22, 2005 03:37 PM
Results on Mac Mini
Small number of results done as I don't want to risk validation problems, but on my 1.25GHz Mac Mini, my times have dropped from an average of 21,000 seconds to an average of 11,000 seconds. My first WU under the optimized client was 7,500 and I almost had a heart attack... but that was just a "shorter" WU, darn it. Still, almost a 50% cut in CPU time is VERY impressive!

Bill
 
Knightrider Aug 23, 2005 03:49 AM
First validated result:

http://setiathome.berkeley.edu/resul...ltid=102802605


I am running Tiger (OS X 10.4.2) on a dual 2.0 GHz G5 with 2.5 GB of DDR SDRAM, using the BOINC Manager GUI v4.43.

K.
 
alexkan Aug 23, 2005 04:11 AM
It's great to hear that you guys are starting to get validated results back! :cool:

Anyway, I've got some new optimizations that seem to be working, in the sense that they don't crash the worker at this point. I'll be doing a test run overnight to verify that they haven't changed the results, so if I'm lucky, you'll see new compiles up in about a day. (I might post the 7400 compile sooner, since that's what I'm testing with.) If the profiling data I have is correct, the vDSP libraries are starting to become the bottleneck of the worker (we spend a majority of the time performing the FFT), which suggests that there won't be any more huge jumps in speed.

I'll update this post when the test run finishes and as other developments occur, so keep watching this thread...

Update!
The new client finished its overnight run successfully and came up with the same results as the clients that I've been posting, but this time it took only 13700 seconds (~3:50) to finish the reference work unit. That makes it about another 50% faster than the previous version, at least on my machine. (YMMV depending on CPU architecture, of course.) I'm going to update all the links in my first post later, and I'll also put the new links here.

G4 7400/7410 | G4 7450/7455 | G5

These links might need about an hour to all go live (the 7400 link is live at the time of this writing, but I'm still compiling the others). 10.3.9 compiles will have to wait until these changes find their way into the Xcode project that I've been building your compiles from.

Happy crunching!
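
For anyone comparing numbers, here is a rough sketch of the arithmetic behind the "about 50% faster" figure. It assumes the 20500-second reference time from the first post is the baseline for the previous version, which may not be exactly right. The function name `speedup` is purely illustrative and is not taken from the worker's code.

```python
def speedup(old_seconds, new_seconds):
    """Compare two crunch times for the same work unit.

    Returns (throughput_gain, time_reduction) as fractions:
    - throughput_gain: how much more work gets done per unit of time
    - time_reduction: how much less time one work unit takes
    """
    throughput_gain = old_seconds / new_seconds - 1.0
    time_reduction = 1.0 - new_seconds / old_seconds
    return throughput_gain, time_reduction


# Assumed baseline: 20500 s (first post); new time: 13700 s (this update).
gain, cut = speedup(20500, 13700)
print(f"{gain:.1%} more work per unit time, {cut:.1%} less time per work unit")
```

Under that assumption, "50% faster" refers to throughput (roughly 1.5x the work in the same time), while the time per work unit drops by about a third. It helps to say which of the two you mean when reporting your own before and after times.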
 
alexkan Aug 23, 2005 11:29 AM
Also, another thing for those of you who are reporting back crunching times--could you check and see what your processor performance is set to in Energy Saver?
 
Shaktai Aug 23, 2005 07:18 PM
Quote, Originally Posted by alexkan
Also, another thing for those of you who are reporting back crunching times--could you check and see what your processor performance is set to in Energy Saver?
Mine is set to "Highest". I just added the new G5 app to give it a try as well. It looks like I'll run out of work before the night is out, though, since upload and download are suspended for the evening, so it will be a couple of days before I have a large enough sample to report accurate times.

With the older clients (all versions) and a 68-unit sample, I have averaged 1 hour 42 minutes 33 seconds per unit. Most of those units were crunched with the 7400 version on an iMac G5 1.6 GHz.

Here is a great online utility for users to check their pending credit and average work unit rate: http://ztb.dyndns.org/pc.php
 
Snake_doctor Aug 23, 2005 09:02 PM
Quote, Originally Posted by alexkan
Also, another thing for those of you who are reporting back crunching times--could you check and see what your processor performance is set to in Energy Saver?
Only one of my systems is a powerbook. It is set for performance.

Regards
Phil
 
Knightrider Aug 25, 2005 01:47 AM
Quote, Originally Posted by alexkan
Also, another thing for those of you who are reporting back crunching times--could you check and see what your processor performance is set to in Energy Saver?
My G5 running Tiger is set to Automatic. I will set it to Highest. I downloaded the new version and I have 4 WUs remaining.

last wu received 23 Aug 2005 16:34:33 UTC
pending C 3,172.78
granted C 67,022.49
total C 70,195.27
estimated GC 4,054.44
estimated TC 71,076.93


# wus 226
Ø seconds/wu 6,707.01
Ø time/wu 1h 51m 47.011s
Ø CC/wu 14.04
Ø GC/wu 17.94


K.
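
Since times in this thread are reported both as raw seconds and as hours/minutes/seconds, a small conversion sketch may help when comparing them. The helper name `format_duration` is hypothetical and the output format only mimics the stats listing above.

```python
def format_duration(seconds):
    """Convert a duration in seconds to an 'Xh Ym Z.ZZs' string."""
    hours, remainder = divmod(seconds, 3600)   # whole hours, leftover seconds
    minutes, secs = divmod(remainder, 60)      # whole minutes, leftover seconds
    return f"{int(hours)}h {int(minutes)}m {secs:.2f}s"


# Average seconds per work unit from the stats listing above.
print(format_duration(6707.01))
# 1 hour 42 minutes 33 seconds is 6153 seconds; round-trip it as a check.
print(format_duration(1 * 3600 + 42 * 60 + 33))
```

The first call should agree with the "time/wu" line in the stats, which is a quick way to confirm that two people quoting different units are talking about comparable averages.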
 
All times are GMT -4.