New Altivec-enhanced Seti worker in need of testing (Page 13)
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 26, 2006, 02:06 AM
 
Originally Posted by Gecko-r7
Hi Alex, I'll give it a run on my G4 dual 1.33 tomorrow and post the results. Glad to help. BTW, you see this? http://fftw.org/release-notes.html Thought this might interest you.
Greetings from San Diego.
Yes, that's my reason for bringing this subject up. We've examined FFTW in the past, but from what I can tell, the 3.1 betas are a huge improvement on 3.0.1 on PowerPC. Still not quite as fast on smaller FFT sizes as vDSP, but much closer than before.
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 03:17 AM
 
Top-6 all Quads! That's three Hail Steves for everyone...

Actually, the Hails should go to Alex and Rick! Excellent job, guys!

Alex, I'll be running your test on as many machines as I can over the next few days. As far as CPU info is concerned, would the first few lines of the system_profiler output suit your needs? Here's a sample from my DP 2.5 GHz G5:

Hardware Overview:

Machine Name: Power Mac G5
Machine Model: PowerMac7,3
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 2
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 512 KB
Memory: 5 GB
Bus Speed: 1.25 GHz
Boot ROM Version: 5.1.8f7
( Last edited by halimedia; Jan 26, 2006 at 08:08 AM. )
     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 26, 2006, 03:27 AM
 
Originally Posted by halimedia
Alex, I'll be running your test on as many machines as I can over the next few days. As far as CPU info is concerned, would the first few lines of the system_profiler output suit your needs? Here's a sample from my DP 2.5 GHz G5:
The information I need most is CPU type, clock speed, L2 cache size, FSB speed, and RAM type. The output from system_profiler ought to be sufficient, since we can probably deduce the RAM type from all the other data.

Still hoping that we can get our work unit times slightly more competitive, especially on older G5s. In any case, it looks like throughput from quad G5s is fantastic!
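
For anyone who would rather not trim the system_profiler output by hand, here is a minimal sketch that keeps only the fields listed above. It assumes the "Label: value" layout shown in the Hardware Overview samples in this thread; the function name `extract_fields` and the `WANTED` tuple are made up for illustration.

```python
# Labels for the fields requested above, as they appear in the
# Hardware Overview samples posted in this thread.
WANTED = ("CPU Type", "CPU Speed", "L2 Cache (per CPU)", "Bus Speed", "Memory")

def extract_fields(profile_text, wanted=WANTED):
    """Return {label: value} for the wanted 'Label: value' lines."""
    found = {}
    for line in profile_text.splitlines():
        label, sep, value = line.strip().partition(":")
        if sep and label in wanted:
            found[label] = value.strip()
    return found

sample = """\
Machine Name: Power Mac G5
Machine Model: PowerMac7,3
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 2
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 512 KB
Memory: 5 GB
Bus Speed: 1.25 GHz
Boot ROM Version: 5.1.8f7
"""

for label, value in extract_fields(sample).items():
    print(f"{label}: {value}")
```

The RAM type is not part of the Hardware Overview, so it still has to come from the Memory section or be deduced from the machine model.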
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 03:36 AM
 
How about the 'Hardware Overview' (w/o S/N) and the 'Memory' sections from system_profiler?

Edit: changed info to just what Alex needs (see his post below)

Machine Name: Power Mac G5
Machine Model: PowerMac7,3
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 2
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 512 KB
Memory: 5 GB (PC3200U)
Bus Speed: 1.25 GHz
Boot ROM Version: 5.1.8f7
( Last edited by halimedia; Jan 26, 2006 at 08:08 AM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 03:57 AM
 
OK, here's the first one. The aforementioned PM G5 DP 2.5 GHz.

fft_test:
Apple Altivec DSP
16384 7340.032000
32768 3145.728000
65536 1864.135111
131072 1371.214769
FFTW3 (this may take a minute)
16384 4893.354667
32768 3932.160000
65536 2097.152000
131072 2228.224000

fft_test2:
Apple vDSP op / ip
1024 11184.810667 10652.200635
2048 11184.810667 11017.873194
4096 10458.524260 9820.809366
8192 5975.446795 5739.573895
16384 5908.956579 5872.025600
32768 2267.191351 4075.437085
65536 1114.996702 1917.396114
131072 1127.322814 1159.401106
weighted time 12.830000 11.420000 10.120000
FFTW3 interleaved op / ip (this may take a minute)
1024 11001.453115 8494.792911
2048 10397.147944 5678.442338
4096 9702.486361 6391.320381
8192 8003.809468 5592.405333
16384 6222.013881 4290.064365
32768 4211.853389 3660.483491
65536 2412.902975 2386.092942
131072 2044.535283 1956.862244
weighted time 6.870000 7.160000

Machine Name: Power Mac G5
Machine Model: PowerMac7,3
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 2
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 512 KB
Memory: 5 GB (PC3200U-30330)
Bus Speed: 1.25 GHz
Boot ROM Version: 5.1.8f7
( Last edited by halimedia; Jan 30, 2006 at 08:39 AM. )
     
Drash
Fresh-Faced Recruit
Join Date: Aug 2001
Location: UK
Status: Offline
Jan 26, 2006, 04:12 AM
 
Apple Altivec DSP
8 629.145600
16 1048.576000
32 2621.440000
64 3145.728000
128 3670.016000
256 8388.608000
512 4718.592000
1024 5242.880000
2048 5767.168000
4096 4194.304000
8192 1947.355429
16384 1631.118222
32768 827.823158
65536 559.240533
131072 509.308343
262144 438.938791
FFTW3 (this may take a minute)
8 629.145600
16 838.860800
32 873.813333
64 1258.291200
128 2446.677333
256 2796.202667
512 3145.728000
1024 2621.440000
2048 2306.867200
4096 2097.152000
8192 1947.355429
16384 1468.006400
32768 748.982857
65536 621.378370
131072 540.175515
262144 555.128471

then for fft_test2

Apple vDSP op / ip
1024 5412.005161 4660.337778
2048 4954.345664 4420.344335
4096 4151.063753 3745.611014
8192 1876.161789 1744.830464
16384 803.698970 1047.407019
32768 550.373406 591.788924
65536 476.794771 528.936859
131072 361.715500 499.059794
weighted time 38.510000 29.160000 31.540000
FFTW3 interleaved op / ip (this may take a minute)
1024 4092.003902 3064.331689
2048 3417.581037 1947.750670
4096 3300.435934 1945.184464
8192 2416.662693 1694.010159
16384 1126.527693 918.400876
32768 751.218627 645.277538
65536 655.920479 545.323425
131072 582.066678 537.123676
weighted time 24.670000 27.320000

Machine Name: eMac
Machine Model: PowerMac4,4
CPU Type: PowerPC G4 (3.3)
Number Of CPUs: 1
CPU Speed: 1 GHz
L2 Cache (per CPU): 256 KB
Memory: 1 GB
Bus Speed: 133 MHz
Boot ROM Version: 4.6.4f1

DIMM0/J1600:

Size: 512 MB
Type: SDRAM
Speed: PC133U-322
Status: OK

DIMM1/J1601:

Size: 512 MB
Type: SDRAM
Speed: PC133U-322
Status: OK
( Last edited by Drash; Jan 31, 2006 at 01:17 AM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 04:13 AM
 
Here are the results from my Quad:

fft_test:
Apple Altivec DSP
16384 4893.354667
32768 5242.880000
65536 5592.405333
131072 2970.965333
FFTW3 (this may take a minute)
16384 4893.354667
32768 3932.160000
65536 4194.304000
131072 2970.965333

fft_test2:
Apple vDSP op / ip
1024 11184.810667 10324.440615
2048 11184.810667 10698.514551
4096 10066.329600 9702.486361
8192 5975.446795 5777.584318
16384 6263.493973 5984.229911
32768 5687.191864 5592.405333
65536 2339.306806 4948.119005
131072 1457.025144 2859.274907
weighted time 9.110000 4.670000 7.830000
FFTW3 interleaved op / ip (this may take a minute)
1024 11001.453115 8388.608000
2048 10397.147944 5550.357173
4096 9702.486361 6291.456000
8192 8077.918815 5521.615392
16384 6391.320381 5024.193027
32768 5655.241348 4725.976338
65536 4511.520269 3918.765781
131072 3108.584981 2970.965333
weighted time 4.400000 4.690000

Machine Name: Power Mac G5 Quad
Machine Model: PowerMac11,2
CPU Type: PowerPC G5 (1.1)
Number Of CPUs: 4
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 1 MB
Memory: 4 GB (PC2-4200E-444)
Bus Speed: 1.25 GHz
Boot ROM Version: 5.2.7f1
( Last edited by halimedia; Jan 30, 2006 at 11:13 AM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 04:28 AM
 
And here's another one - from an Xserve G5 DP 2.0 GHz (running OSXS 10.3.9):

fft_test:
Apple Altivec DSP
16384 4893.354667
32768 5242.880000
65536 2097.152000
131072 1273.270857
FFTW3 (this may take a minute)
16384 3670.016000
32768 3145.728000
65536 2396.745143
131072 1782.579200

fft_test2:
Apple vDSP op / ip
1024 8947.848533 7374.600440
2048 9002.408585 8388.608000
4096 8053.063680 6547.206244
8192 4428.503716 4473.924267
16384 4697.620480 4721.226613
32768 1546.287189 2319.430783
65536 1054.756212 1497.547872
131072 1011.392454 1113.025061
weighted time 14.270000 12.350000 11.280000
FFTW3 interleaved op / ip (this may take a minute)
1024 9068.765405 6778.673131
2048 8294.353978 4528.819043
4096 7255.012324 5129.339924
8192 6058.439111 4134.669346
16384 4538.763749 3641.566264
32768 2720.629622 2467.237647
65536 2068.866713 2037.460767
131072 1933.645234 1980.643556
weighted time 7.460000 7.380000

Machine Model: Xserve G5
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 2
CPU Speed: 2 GHz
L2 Cache (per CPU): 512 KB
Memory: 3 GB (PC3200U-30330 and -30440 mixed)
Bus Speed: 1 GHz
Boot ROM Version: 5.1.7f1
( Last edited by halimedia; Jan 30, 2006 at 10:58 AM. )
     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 26, 2006, 04:52 AM
 
Wow, I seem to have unleashed massively long posts upon this thread. Perhaps asking you to paste the output wasn't such a good idea after all. New idea: the only lines from the output we really care about (since they're the only lines that have much bearing on computation time) are the lines that start with 16384, 32768, 65536, and 131072 under either Apple Altivec DSP or FFTW3. For that matter, I think that much memory information might be overkill--all I need to know is whether we're dealing with PC3200 or PC4200.
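
A rough sketch of that trimming, for anyone who wants to automate it: it keeps only the four FFT lengths under the "Apple Altivec DSP" and "FFTW3" headers and prints the FFTW3-to-vDSP ratio for each. The parsing rules are an assumption based on the output pasted in this thread (two-column rows under a header line), and the names `relevant_lines`, `SIZES` and `HEADERS` are invented here. The sample is the Xserve G5 DP 2.0 GHz run posted above.

```python
SIZES = (16384, 32768, 65536, 131072)
# Header prefix in the fft_test output -> short label used in the result.
HEADERS = {"Apple Altivec DSP": "vDSP", "FFTW3": "FFTW3"}

def relevant_lines(output):
    """Map each library to {fft_length: megaflops} for the four sizes."""
    results, current = {}, None
    for line in output.splitlines():
        text = line.strip()
        header = next((short for prefix, short in HEADERS.items()
                       if text.startswith(prefix)), None)
        parts = text.split()
        if header:
            current = header
            results.setdefault(current, {})
        elif len(parts) == 2 and parts[0].isdigit():
            if current and int(parts[0]) in SIZES:
                results[current][int(parts[0])] = float(parts[1])
        elif any(ch.isalpha() for ch in text):
            current = None  # some other section, e.g. "Standard Ooura"
    return results

sample = """\
Apple Altivec DSP
16384 4893.354667
32768 5242.880000
65536 2097.152000
131072 1273.270857
FFTW3 (this may take a minute)
16384 3670.016000
32768 3145.728000
65536 2396.745143
131072 1782.579200
"""

table = relevant_lines(sample)
for size in SIZES:
    ratio = table["FFTW3"][size] / table["vDSP"][size]
    print(f"{size:>6}  FFTW3/vDSP = {ratio:.2f}")
```

A ratio above 1 means FFTW3 posted the higher megaflops figure at that length.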

And just so you know what you're posting, the first number on each line is an FFT length, and the second number is a calculated number of megaflops. Before you get all excited, keep in mind that these numbers are based on how many calculations are required (in theory) to compute an FFT and are obtained from timing data. We're just looking to see how much bigger the FFTW3 numbers are than the Altivec DSP numbers, but you should be able to use these numbers to directly compare different Macs.
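
To make the megaflops figure concrete, here is a small sketch of how such a number can be derived from a timing. It assumes the common benchmarking convention of counting 5 N log2(N) floating-point operations for a length-N complex FFT; whether fft_test uses exactly this count is an assumption, although the posted figures look consistent with it (for example, 4893.354667 at length 16384 corresponds to 234.375 microseconds per FFT under this convention).

```python
import math

def nominal_flops(n):
    """Theoretical operation count for one length-n FFT (5 N log2 N convention)."""
    return 5.0 * n * math.log2(n)

def mflops(n, seconds_per_fft):
    """Megaflops implied by the measured time for one length-n FFT."""
    return nominal_flops(n) / (seconds_per_fft * 1e6)

# 16384-point FFT measured at 234.375 microseconds per transform.
print(f"{mflops(16384, 234.375e-6):.6f}")
```

Because the operation count is nominal, a library that does fewer real operations still gets credited with the full count, which is why these figures are best read as relative speeds rather than true arithmetic throughput.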

I am excited, however, to see some of those numbers. While it looks like dual-core G5 owners won't gain from a switch to FFTW after all, older G5s will certainly receive a sizable speed increase. This could wind up knocking at least 10 minutes off your work unit times. G4s might get an observable speed increase, but nothing as crazy as older G5s.
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 06:05 AM
 
Results from an iMac G4 FP 17" 1.0 GHz:

Apple Altivec DSP
16384 1468.006400
32768 683.853913
65536 479.349029
131072 457.071590
FFTW3 (this may take a minute)
16384 1334.551273
32768 582.542222
65536 578.524690
131072 495.160889

Machine Name: iMac
Machine Model: PowerMac6,1
CPU Type: PowerPC G4 (3.3)
Number Of CPUs: 1
CPU Speed: 1 GHz
L2 Cache (per CPU): 256 KB
Memory: 1 GB (PC2100)
Bus Speed: 133 MHz
Boot ROM Version: 4.5.9f1
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 07:16 AM
 
Power Mac G4 Digital Audio w/ GigaDesigns DP 1.33 GHz upgrade:

fft_test:
Apple Altivec DSP
16384 2446.677333
32768 1429.876364
65536 1198.372571
131072 990.321778
FFTW3 (this may take a minute)
16384 2097.152000
32768 1572.864000
65536 1198.372571
131072 891.289600

fft_test2:
Apple vDSP op / ip
1024 7216.006882 6213.783704
2048 6650.427964 5905.580032
4096 5515.797041 5033.164800
8192 2436.914056 2295.829558
16384 1723.897424 1965.531582
32768 1218.683971 1292.211759
65536 1079.137512 1306.255260
131072 512.741882 1107.622027
weighted time 25.350000 12.930000 22.250000
FFTW3 interleaved op / ip (this may take a minute)
1024 5412.005161 4018.494850
2048 4556.774716 2599.286986
4096 4424.760264 2597.762477
8192 3912.175928 2289.803759
16384 2247.665301 1657.008988
32768 1597.830095 1242.756741
65536 1335.499781 1049.601001
131072 717.516156 813.730876
weighted time 18.380000 17.190000

Machine Name: Power Mac G4
Machine Model: PowerMac3,4
CPU Type: PowerPC G4 (3.3)
Number Of CPUs: 2
CPU Speed: 1.33 GHz
L2 Cache (per CPU): 256 KB
L3 Cache (per CPU): 2 MB
Memory: 1.5 GB (PC133-333)
Bus Speed: 133 MHz
Boot ROM Version: 4.2.8f1


Power Mac G4 Digital Audio DP 533 MHz:

fft_test:
Apple Altivec DSP
16384 917.504000
32768 873.813333
65536 762.600727
131072 614.682483
FFTW3 (this may take a minute)
16384 734.003200
32768 582.542222
65536 479.349029
131072 363.791673

fft_test2:
Apple vDSP op / ip
1024 2739.137306 2431.480580
2048 2145.922977 2028.015121
4096 1494.074894 1331.084906
8192 971.509167 849.479291
16384 1011.328413 892.235609
32768 752.341525 729.444174
65536 530.767090 691.398470
131072 361.142984 636.635429
weighted time 37.670000 22.870000 31.590000
FFTW3 interleaved op / ip (this may take a minute)
1024 1853.836022 1392.300083
2048 1379.808419 912.481464
4096 1244.677539 801.299869
8192 1060.042809 615.243464
16384 1072.516091 643.509655
32768 848.047987 558.000532
65536 614.268778 471.352864
131072 428.730059 413.501518
weighted time 31.890000 34.700000

Machine Name: Power Mac G4
Machine Model: PowerMac3,4
CPU Type: PowerPC G4 (11.3)
Number Of CPUs: 2
CPU Speed: 533 MHz
L2 Cache (per CPU): 1 MB
Memory: 768 MB (PC133-333)
Bus Speed: 133 MHz
Boot ROM Version: 4.2.8f1


Mac mini 1.42 GHz:

fft_test:
Apple Altivec DSP
16384 3670.016000
32768 1966.080000
65536 838.860800
131072 614.682483
FFTW3 (this may take a minute)
16384 2936.012800
32768 1572.864000
65536 986.895059
131072 685.607385

fft_test2:
Apple vDSP op / ip
1024 7713.662529 6710.886400
2048 7030.452419 6255.911051
4096 5921.370353 5298.068211
8192 2787.269112 2558.402440
16384 2821.393682 2646.546749
32768 932.067556 1202.667814
65536 615.677651 800.702330
131072 468.521843 614.020822
weighted time 29.450000 22.520000 24.350000
FFTW3 interleaved op / ip (this may take a minute)
1024 5687.191864 4301.850256
2048 4824.820288 2764.784659
4096 4654.950104 2757.898521
8192 4362.076160 2436.914056
16384 4157.186265 2366.559436
32768 1360.314811 1158.380852
65536 746.172220 711.558531
131072 663.285284 665.607169
weighted time 21.300000 21.530000

Machine Name: Mac mini
Machine Model: PowerMac10,1
CPU Type: PowerPC G4 (1.1)
Number Of CPUs: 1
CPU Speed: 1.42 GHz
L2 Cache (per CPU): 512 KB
Memory: 1 GB (PC2700U-25330)
Bus Speed: 167 MHz
Boot ROM Version: 4.8.9f1
( Last edited by halimedia; Jan 30, 2006 at 05:26 PM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 07:43 AM
 
Xserve G5 2.0 GHz:

fft_test:
Apple Altivec DSP
16384 7340.032000
32768 5242.880000
65536 1525.201455
131072 1188.386133
FFTW3 (this may take a minute)
16384 3670.016000
32768 3145.728000
65536 1864.135111
131072 1980.643556

fft_test2:
Apple vDSP op / ip
1024 9192.995068 8494.792911
2048 9002.408585 8684.676518
4096 7743.330462 6824.630237
8192 4615.953608 4406.137535
16384 4538.763749 4583.044371
32768 1486.902452 3236.761929
65536 1056.832504 1629.350264
131072 1026.868306 1129.555137
weighted time 14.110000 11.980000 11.110000
FFTW3 interleaved op / ip (this may take a minute)
1024 8830.113684 6847.843265
2048 8294.353978 4556.774716
4096 7669.584457 5033.164800
8192 6100.805818 4115.166189
16384 4560.796583 3627.506162
32768 2851.651445 2535.599395
65536 2143.197253 1962.964943
131072 2037.233371 1930.373415
weighted time 7.110000 7.570000

Machine Name: Xserve G5
Machine Model: RackMac3,1
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 1
CPU Speed: 2 GHz
L2 Cache (per CPU): 512 KB
Memory: 1 GB (PC3200U-30330)
Bus Speed: 1 GHz
Boot ROM Version: 5.1.7f2


Mac mini 1.25 GHz:

fft_test:
Apple Altivec DSP
16384 2936.012800
32768 1747.626667
65536 883.011368
131072 594.193067
FFTW3 (this may take a minute)
16384 2446.677333
32768 1572.864000
65536 883.011368
131072 685.607385

fft_test2:
Apple vDSP op / ip
1024 6778.673131 5886.742456
2048 6203.340370 5508.936597
4096 5229.262130 4765.126438
8192 2122.664798 2052.741722
16384 2427.710842 2308.413012
32768 909.334201 1151.753959
65536 602.548723 789.516047
131072 446.691734 632.751352
weighted time 30.760000 22.050000 25.540000
FFTW3 interleaved op / ip (this may take a minute)
1024 5045.779248 3834.792229
2048 4267.037595 2444.362596
4096 4151.063753 2425.621590
8192 3843.238907 2148.805990
16384 3001.674428 1945.184464
32768 1329.766129 1100.145311
65536 852.852918 798.321059
131072 700.768236 691.843959
weighted time 19.960000 20.510000

Machine Name: Mac mini
Machine Model: PowerMac10,1
CPU Type: PowerPC G4 (1.2)
Number Of CPUs: 1
CPU Speed: 1.25 GHz
L2 Cache (per CPU): 512 KB
Memory: 1 GB (PC2700U-25330)
Bus Speed: 167 MHz
Boot ROM Version: 4.8.9f1


PowerBook G4 17" 1.5 GHz:

fft_test:
Apple Altivec DSP
16384 3670.016000
32768 1966.080000
65536 883.011368
131072 636.635429
FFTW3 (this may take a minute)
16384 2936.012800
32768 1966.080000
65536 1048.576000
131072 713.031680

fft_test2:
Apple vDSP op / ip
1024 8285.044938 7139.240851
2048 7381.975040 6650.427964
4096 6242.685023 5631.513063
8192 2957.339769 2717.804461
16384 3308.183437 3070.340183
32768 907.694283 1233.618824
65536 586.744166 789.516047
131072 479.349029 620.027548
weighted time 29.120000 22.370000 23.800000
FFTW3 interleaved op / ip (this may take a minute)
1024 6045.843604 4628.197517
2048 5126.371556 2929.355175
4096 4971.026963 2928.386793
8192 4640.506553 2581.110154
16384 3834.792229 2492.106355
32768 1489.102012 1182.882444
65536 751.393859 737.460044
131072 656.793718 643.095089
weighted time 21.410000 21.980000

Machine Name: PowerBook G4 17"
Machine Model: PowerBook5,5
CPU Type: PowerPC G4 (1.1)
Number Of CPUs: 1
CPU Speed: 1.5 GHz
L2 Cache (per CPU): 512 KB
Memory: 1.5 GB (PC2700U-25330)
Bus Speed: 167 MHz
Boot ROM Version: 4.8.4f1
( Last edited by halimedia; Jan 30, 2006 at 04:25 PM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 07:59 AM
 
PowerBook G3 Pismo w/ WM 550 MHz G4 upgrade:

fft_test:
Apple Altivec DSP
16384 815.559111
32768 786.432000
65536 621.378370
131072 481.778162
FFTW3 (this may take a minute)
16384 667.275636
32768 542.366897
65536 430.185026
131072 349.525333

fft_test2:
Apple vDSP op / ip
1024 2728.002602 2440.322327
2048 2103.126792 1932.454199
4096 1344.417977 1246.604285
8192 910.663081 790.947626
16384 905.129187 794.187740
32768 682.926024 675.139477
65536 490.517051 569.624310
131072 302.212103 510.447735
weighted time 44.350000 28.220000 37.750000
FFTW3 interleaved op / ip (this may take a minute)
1024 1890.390535 1380.840823
2048 1352.010081 894.784853
4096 1145.528262 756.156214
8192 997.045979 600.836937
16384 967.584033 598.804395
32768 779.127678 491.040468
65536 572.357049 388.052701
131072 352.223121 361.371773
weighted time 38.080000 40.080000

Machine Name: PowerBook
Machine Model: PowerBook3,1
CPU Type: PowerPC G4 (11.4)
Number Of CPUs: 1
CPU Speed: 550 MHz
L2 Cache (per CPU): 1 MB
Memory: 768 MB (PC100-222S)
Bus Speed: 100 MHz
Boot ROM Version: 4.1.8f5


Just for kicks, I also tried to run fft_test on a Pismo with the original 400 MHz G3 processor. Here's the output:

dyld: incompatible cpu-subtype
Trace/BPT trap

OK, these are all the different altivec machine types I have to offer. All are running Tiger/Tiger Server unless otherwise noted. Processor Performance was set to Highest and Nap was enabled where CPUs support these features. No resource-hogging processes were active at the time of running fft_test.

HTH,

Ron
( Last edited by halimedia; Jan 30, 2006 at 05:00 PM. )
     
virex
Junior Member
Join Date: Feb 2001
Location: Reading
Status: Offline
Jan 26, 2006, 09:18 AM
 
Here's my quad. What's the command to find out the type of RAM? I only have ssh access, or I'd use the GUI...

Apple Altivec DSP
16384 4893.354667
32768 7864.320000
65536 4194.304000
131072 1980.643556
262144 1110.256941
FFTW3 (this may take a minute)
16384 4893.354667
32768 3932.160000
65536 2396.745143
131072 2228.224000
262144 2097.152000

     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
Jan 26, 2006, 09:28 AM
 
@alexkan:

-bash: /Applications/fftw_tests/fft_test: Permission denied

root only or what?
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 26, 2006, 09:29 AM
 
Originally Posted by virex
Here's my quad. What's the command to find out the type of RAM? I only have ssh access, or I'd use the GUI...
system_profiler

It gives you the complete system profile - you'll have to extract what you need from that.
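
As a sketch of the extraction step, the snippet below pulls the distinct "Speed:" values (the RAM type) out of the Memory section. It assumes the DIMM layout seen in the profiles posted in this thread; `ram_speeds` and `read_memory_profile` are hypothetical helper names, and `SPMemoryDataType` is assumed to be among the data types your system_profiler lists (`system_profiler -listDataTypes` shows what is available). The sample is taken from the eMac profile above.

```python
import subprocess

def ram_speeds(profile_text):
    """Return the distinct 'Speed:' values found in the profile, in order."""
    speeds = []
    for line in profile_text.splitlines():
        label, sep, value = line.strip().partition(":")
        value = value.strip()
        if sep and label == "Speed" and value and value not in speeds:
            speeds.append(value)
    return speeds

def read_memory_profile():
    """Run system_profiler for the memory section only (Mac only, not called here)."""
    return subprocess.run(
        ["system_profiler", "SPMemoryDataType"],
        capture_output=True, text=True, check=True,
    ).stdout

sample = """\
DIMM0/J1600:

Size: 512 MB
Type: SDRAM
Speed: PC133U-322
Status: OK

DIMM1/J1601:

Size: 512 MB
Type: SDRAM
Speed: PC133U-322
Status: OK
"""

print(ram_speeds(sample))
```

On a Mac reached over ssh, `ram_speeds(read_memory_profile())` would do the same on live output.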

HTH,

Ron
     
manu
Guest
Status:
Jan 26, 2006, 10:31 AM
 
Apple Altivec DSP
16384 1129.235692
32768 542.366897
65536 409.200390
131072 363.791673
262144 319.904542
FFTW3 (this may take a minute)
16384 815.559111
32768 542.366897
65536 466.033778
131072 379.272170
262144 385.191184

iMac G4 800Mhz
CPU Type: PowerPC G4 (2.1)
Number Of CPUs: 1
CPU Speed: 800 MHz
L2 Cache (per CPU): 256 KB
Memory: 512 MB SDRAM (PC133-333)
Bus Speed: 100 MHz

@ TiloProbst:

run "chmod 755 <path_to_fft_test>"

I had to do that too
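
For the curious, this is roughly what that chmod does, sketched in Python: mode 755 is rwxr-xr-x, so the owner gets read, write and execute, and everyone else gets read and execute. The helper name `make_executable` is made up, and the demo works on a throwaway file rather than the real binary.

```python
import os
import stat
import tempfile

def make_executable(path):
    """Give the file rwxr-xr-x permissions, like `chmod 755 path`."""
    os.chmod(path, 0o755)
    return stat.S_IMODE(os.stat(path).st_mode)

# Demo on a throwaway file standing in for the downloaded fft_test binary.
with tempfile.TemporaryDirectory() as tmp:
    binary = os.path.join(tmp, "fft_test")
    with open(binary, "w") as handle:
        handle.write("placeholder")
    os.chmod(binary, 0o644)              # no execute bit, as after unpacking
    print(os.access(binary, os.X_OK))    # can the current user execute it?
    print(oct(make_executable(binary)))
```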
     
manu
Guest
Status:
Jan 26, 2006, 10:34 AM
 
Originally Posted by manu
run "chmod 755 "
should be "chmod 755 path_to_fft_test"

manu
     
virex
Junior Member
Join Date: Feb 2001
Location: Reading
Status: Offline
Jan 26, 2006, 10:36 AM
 
OK, my quad has eight 512 MB sticks of DDR2 PC2-4200 RAM.

Apple Altivec DSP
16384 4893.354667
32768 7864.320000
65536 4194.304000
131072 1980.643556
262144 1110.256941
FFTW3 (this may take a minute)
16384 4893.354667
32768 3932.160000
65536 2396.745143
131072 2228.224000
262144 2097.152000

     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
Jan 26, 2006, 12:59 PM
 
Originally Posted by manu
should be "chmod 755 path_to_fft_test"

manu
thank you ..

PowerMac G5 Single 1.8 GHz (Summer 2003)
Machine Model: PowerMac7,2
CPU Type: PowerPC 970 (2.2)
Number Of CPUs: 1
CPU Speed: 1.8 GHz
L2 Cache (per CPU): 512 KB
Memory: 512 MB
Bus Speed: 900 MHz

Memory:
DIMM0/J11:
Size: 256 MB
Type: DDR SDRAM
Speed: PC3200U-30330

DIMM1/J12:
Size: 256 MB
Type: DDR SDRAM
Speed: PC3200U-30330

Apple Altivec DSP
16384 3670.016000
32768 3932.160000
65536 1525.201455
131072 1048.576000

FFTW3 (this may take a minute)
16384 3670.016000
32768 2621.440000
65536 1864.135111
131072 1782.579200
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
Jan 26, 2006, 03:37 PM
 
Alexkan: I'll do it tonight and get back with you when I'm back at my G5 DP 2.5.
I'll also do it on my G4/400.
As long as I'm at it I should run it on the wife's iMac 1.25 G4.

Perhaps there will be a "DP G5 client" and a "DC G5 client" then.

If you need any testers, don't hesitate to ask!
( Last edited by Todd Madson; Jan 26, 2006 at 04:13 PM. )
     
Knightrider
Dedicated MacNNer
Join Date: Sep 2004
Location: London
Status: Offline
Jan 26, 2006, 06:17 PM
 
Machine Name: Power Mac G5 Quad
Machine Model: PowerMac11,2
CPU Type: PowerPC G5 (1.1)
Number Of CPUs: 4
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 1 MB
Memory: 2.5 GB
Bus Speed: 1.25 GHz

Size: 256 MB x 2
Type: DDR2 SDRAM
Speed: PC2-4200U-444
Status: OK

Size: 1 GB x 2
Type: DDR2 SDRAM
Speed: PC2-4200U-444
Status: OK

fft test result:-

Standard Ooura

16384 2446.677333
32768 2246.948571
65536 2097.152000
131072 1273.270857
262144 993.387789

Optimized Ooura

16384 2936.012800
32768 2621.440000
65536 1864.135111
131072 1485.482667
262144 993.387789

Apple vBigDSP

16384 3670.016000
32768 3932.160000
65536 4194.304000
131072 2970.965333
262144 993.387789

Apple Altivec DSP

16384 7340.032000
32768 5242.880000
65536 4194.304000
131072 2546.541714
262144 1572.864000

FFTW3 (this may take a minute)

16384 4893.354667
32768 5242.880000
65536 4194.304000
131072 2970.965333
262144 2696.338286


Results for fft_test2 :-

Apple vDSP op / ip
1024 11001.453115 9725.922319
2048 11184.810667 10397.147944
4096 10066.329600 9820.809366
8192 6016.656772 5739.573895
16384 6222.013881 5946.355038
32768 5687.191864 5561.508066
65536 2279.706633 4994.148019
131072 1377.838995 2873.679315
weighted time 9.590000 4.650000 8.280000

FFTW3 interleaved op / ip (this may take a minute)
1024 11184.810667 8285.044938
2048 10397.147944 5592.405333
4096 9820.809366 6242.685023
8192 8077.918815 5521.615392
16384 6348.135784 4997.468596
32768 5719.505455 4770.772322
65536 4329.604129 3741.260711
131072 3186.733765 2910.333388
weighted time 4.330000 4.800000

K.
( Last edited by Knightrider; Jan 31, 2006 at 05:07 AM. )
     
Gecko-R7
Guest
Status:
Jan 26, 2006, 08:15 PM
 
Hi Alex, Here you go.

Machine Name: Power Mac G4
Machine Model: PowerMac3,6
CPU Type: PowerPC G4 (3.3)
Number Of CPUs: 2
CPU Speed: 1.33 GHz
L2 Cache (per CPU): 256 KB
L3 Cache (per CPU): 2 MB
Memory: 1.5 GB
Bus Speed: 167 MHz
Boot ROM Version: 4.4.8f2
Serial Number: XB3241P6PC1

DIMM0/J21:
Size: 512 MB
Type: DDR SDRAM
Speed: PC2600U-25330
Status: OK
DIMM1/J22:
Size: 512 MB
Type: DDR SDRAM
Speed: PC2600U-25330
Status: OK
DIMM2/J23:
Size: 256 MB
Type: DDR SDRAM
Speed: PC2600U-25330
Status: OK
DIMM3/J20:
Size: 256 MB
Type: DDR SDRAM
Speed: PC2600U-25330
Status: OK

Standard Ooura
8 393.216000
16 524.288000
32 748.982857
64 786.432000
128 734.003200
256 838.860800
512 786.432000
1024 806.596923
2048 823.881143
4096 838.860800
8192 717.446737
16384 699.050667
32768 604.947692
65536 430.185026
131072 371.370667
262144 343.170327
Optimized Ooura
8 449.389714
16 524.288000
32 655.360000
64 786.432000
128 734.003200
256 838.860800
512 857.925818
1024 806.596923
2048 823.881143
4096 786.432000
8192 757.304889
16384 734.003200
32768 582.542222
65536 441.505684
131072 356.515840
262144 309.415869
Apple vBigDSP
8 196.608000
16 1048.576000
32 1747.626667
64 2097.152000
128 2446.677333
256 4194.304000
512 3145.728000
1024 2621.440000
2048 2883.584000
4096 2516.582400
8192 1703.936000
16384 978.670933
32768 491.520000
65536 883.011368
131072 660.214519
262144 524.288000
Apple Altivec DSP
8 1048.576000
16 1398.101333
32 5242.880000
64 3145.728000
128 7340.032000
256 8388.608000
512 9437.184000
1024 5242.880000
2048 5767.168000
4096 6291.456000
8192 2726.297600
16384 2446.677333
32768 1429.876364
65536 1118.481067
131072 891.289600
262144 571.950545
FFTW3 (this may take a minute)
8 786.432000
16 1048.576000
32 1310.720000
64 1572.864000
128 3670.016000
256 4194.304000
512 4718.592000
1024 5242.880000
2048 2883.584000
4096 2516.582400
8192 1947.355429
16384 1631.118222
32768 1209.895385
65536 1048.576000
131072 775.034435
262144 725.937231
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
Jan 26, 2006, 08:55 PM
 
This is a G5 2.5 dual processor model with 2.5 gigabytes of ram installed.
It is running 10.4.4 and has been up for 5 days, 19 hours, and 58 minutes.
Memory is PC3200U-30330.

Machine Name: Power Mac G5
Machine Model: PowerMac7,3
CPU Type: PowerPC G5 (3.0)
Number Of CPUs: 2
CPU Speed: 2.5 GHz
L2 Cache (per CPU): 512 KB
Memory: 2.5 GB
Bus Speed: 1.25 GHz
Boot ROM Version: 5.1.8f7

Apple Altivec DSP

8 3145.728000
16 4194.304000
32 5242.880000
64 6291.456000
128 inf
256 8388.608000
512 9437.184000
1024 10485.760000
2048 11534.336000
4096 12582.912000
8192 6815.744000
16384 4893.354667
32768 3145.728000
65536 2097.152000
131072 1273.270857
262144 1110.256941

FFTW3 (this may take a minute)

8 3145.728000
16 4194.304000
32 2621.440000
64 6291.456000
128 3670.016000
256 8388.608000
512 9437.184000
1024 10485.760000
2048 5767.168000
4096 6291.456000
8192 6815.744000
16384 4893.354667
32768 3145.728000
65536 2796.202667
131072 2228.224000
262144 2359.296000
( Last edited by Todd Madson; Jan 27, 2006 at 08:45 AM. )
     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
Jan 27, 2006, 08:25 AM
 
Could you please cut your postings down to the parts that Alex said are really relevant?
     
virex
Junior Member
Join Date: Feb 2001
Location: Reading
Status: Offline
Jan 27, 2006, 09:11 AM
 
alex,
What do those numbers mean? Is higher better, or lower?

     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 27, 2006, 02:02 PM
 
Originally Posted by virex
alex,
What do those numbers mean? Is higher better, or lower?
Higher is better.

I'm beginning to think that I might need a bit more information to figure out whether or not switching to FFTW is a good idea. It's actually not quite as simple as including more lines from the output--I need something both narrower and deeper. So there'll be a new test binary in a bit, once I've figured out exactly what I need.

I expect the new benchmark binary to take much longer than the older binary to run. It's also important that you have nothing open while the program is running, since the effectiveness of FFTW's optimization is dependent on its ability to make judgments about relative performance of different techniques. (In my test run, I even shut down MenuMeters--not sure if it affects FFTW's plan generation, but it certainly changes the benchmark results.) On the bright side, some of the data generated by the benchmark run is reusable in the future (other benchmark runs, FFTW-based SETI workers), so it's important to do this right once.

Of course, I realize that you actually use your Macs for work, so it would be unreasonable of me to expect results immediately, not to mention that if they come in more slowly, I'll have more time to mull over the results.
     
beadman
Dedicated MacNNer
Join Date: Nov 2004
Location: Virginia
Status: Offline
Jan 28, 2006, 01:02 AM
 
Apple Altivec DSP
...
16384 3670.016000
32768 2246.948571
65536 932.067556
131072 685.607385
262144 539.267657
FFTW3 (this may take a minute)
...
16384 3670.016000
32768 1572.864000
65536 986.895059
131072 775.034435
262144 754.974720

Machine Name: PowerBook G4 15"
Machine Model: PowerBook5,6
CPU Type: PowerPC G4 (1.2)
Number Of CPUs: 1
CPU Speed: 1.67 GHz
L2 Cache (per CPU): 512 KB
Memory: 1 GB
Bus Speed: 167 MHz
Boot ROM Version: 4.9.1f1

SODIMM0/J25LOWER:
Size: 512 MB
Type: DDR SDRAM
Speed: PC2700U-25330
Status: OK
SODIMM1/J25UPPER:
Size: 512 MB
Type: DDR SDRAM
Speed: PC2700U-25330
Status: OK
     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
Jan 28, 2006, 11:05 AM
 
I just got this in my Terminal output and wonder what it means:

2006-01-28 14:58:44 [---] request_reschedule_cpus: process exited
2006-01-28 14:58:44 [SETI@home] Computation for result 20se04aa.18132.8752.509654.1.161_1 finished
2006-01-28 14:58:44 [SETI@home] Starting result 17se04aa.6145.5456.378418.1.212_7 using setiathome version 418
2006-01-28 14:58:45 [SETI@home] Unrecoverable error for result 17se04aa.6145.5456.378418.1.212_7 (process exited with code 250 (0xfa))
2006-01-28 14:58:45 [SETI@home] Unrecoverable error for result 17se04aa.6145.5456.378418.1.212_7 (process exited with code 250 (0xfa))

2006-01-28 14:58:45 [---] request_reschedule_cpus: process exited
2006-01-28 14:58:45 [SETI@home] Computation for result 17se04aa.6145.5456.378418.1.212_7 finished
     
Karl Schimanek
Junior Member
Join Date: Oct 2004
Location: Germany
Status: Offline
Jan 28, 2006, 11:09 AM
 
User of the day, SETI HP

Regards
Karl
     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 30, 2006, 06:26 AM
 
Same song, second verse...

http://inst.eecs.berkeley.edu/~alexk...i/fft_test2.gz

As mentioned before, this one will take longer, and the results will matter, so make sure you run it with absolutely as little in the background as possible. This benchmark program generates a file called bigfft_wisdom in the directory where it was run. Delete that file after a run you think was sub-optimal, since the wisdom (FFT performance hints) is saved for subsequent runs.
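
If you end up re-running the benchmark a few times, a tiny helper like the following can clear the saved wisdom between runs. It is only a sketch: the file name `bigfft_wisdom` comes from the paragraph above, while `discard_wisdom` and the temporary directory used in the demo are stand-ins.

```python
import tempfile
from pathlib import Path

WISDOM_NAME = "bigfft_wisdom"  # file name given in the post above

def discard_wisdom(run_dir="."):
    """Delete the wisdom file left in run_dir; return True if one was removed."""
    wisdom = Path(run_dir) / WISDOM_NAME
    if wisdom.is_file():
        wisdom.unlink()
        return True
    return False

# Demo in a scratch directory so no real wisdom file is touched.
with tempfile.TemporaryDirectory() as run_dir:
    (Path(run_dir) / WISDOM_NAME).write_text("stand-in for real wisdom data")
    print(discard_wisdom(run_dir))
    print(discard_wisdom(run_dir))
```

Call it with the directory you ran fft_test2 from, and only after a run you consider sub-optimal.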

Post the entire output here, along with system information (or edit your previous posts, that works too). If you know for sure that the wisdom you've generated is good and your machine is not a Hi-Res PowerBook 1.67 or a G5 1.8 Dual, PM me about sending me the wisdom file you've generated.

Most of the numbers are the same as before, but you'll notice that there are now two columns for each FFT library (corresponding to out-of-place and in-place FFTs) and a "weighted time" field. The weighted times ought to be roughly proportional to the amount of time required to do all the FFTs in a work unit, although the numbers themselves have no meaningful units. Lower is better in this regard, of course. The third vDSP number corresponds to how alpha-5 works, since it uses a mix of in-place and out-of-place FFTs.
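
Here is one way to read the weighted times, as a hedged sketch: take the third vDSP number (the alpha-5 mix) as the baseline and divide it by the better of the two FFTW3 numbers. The function names are invented for illustration, and the result only estimates the FFT portion of a work unit, since the weighted times are just roughly proportional to FFT time. The sample lines are from the PM G5 DP 2.5 GHz run posted earlier in the thread.

```python
def weighted_times(output):
    """Return each 'weighted time' row as a list of floats, in output order."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if parts[:2] == ["weighted", "time"]:
            rows.append([float(x) for x in parts[2:]])
    return rows

def fftw_speedup(output):
    """Baseline (third vDSP number) divided by the best FFTW3 weighted time."""
    vdsp, fftw = weighted_times(output)[:2]   # vDSP row comes first, then FFTW3
    baseline = vdsp[2]                        # the alpha-5 mix of ip and op FFTs
    return baseline / min(fftw)

sample = """\
Apple vDSP op / ip
weighted time 12.830000 11.420000 10.120000
FFTW3 interleaved op / ip (this may take a minute)
weighted time 6.870000 7.160000
"""

print(f"estimated FFT speedup: {fftw_speedup(sample):.2f}x")
```

A value above 1 suggests FFTW3 would spend less time on FFTs than the current mix; a value below 1 suggests staying with vDSP.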

A vDSP/FFTW mixed client now exists, compiles, and (I think) works. I'm not releasing it yet because I haven't tested it enough and I want Rick to take a look at it as well. DP G5 owners should probably keep an eye out for its release--you're going to like the performance increases you'll get if we release this.
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 30, 2006, 08:51 AM
 
Alex, I just added the first fft_test2-results to my previous fft_test-posts. Will add more as time progresses, and will list here what's done. All tests will be carried out with the machine in question sitting at the login-window, running fft_test2 via ssh from a remote machine. All are running Tiger with 'Processor Performance' set to 'Highest' and 'Nap' enabled, unless otherwise noted (as before).

Machines done:

- PM G5 DP 2.5 GHz
- PM G5 Quad 2.5 GHz
- XS G5 DP 2.0 GHz (OSXS 10.3.9)
- XS G5 2.0 GHz
- PM G4 DA w/ GD DP 1.33 GHz upgrade
- MM G4 1.42 GHz
- PB G4 17" 1.5 GHz
- MM G4 1.25 GHz
- PB G3 FW w/ WM G4 550 MHz upgrade
- PM G4 DA DP 533 MHz
( Last edited by halimedia; Jan 30, 2006 at 05:27 PM. )
     
TiloProbst
Junior Member
Join Date: Jul 2005
Status: Offline
Jan 30, 2006, 02:20 PM
 
PowerMac G5 Single 1.8 GHz (Summer 2003)
Machine Model: PowerMac7,2
CPU Type: PowerPC 970 (2.2)
Number Of CPUs: 1
CPU Speed: 1.8 GHz
L2 Cache (per CPU): 512 KB
Memory: 512 MB
Bus Speed: 900 MHz

Pretty much out-of-the-box Mac OS X 10.4.0; Processor Performance: Automatic; logged in with >console, then started remotely via ssh

Memory:
Size: 256 MB
Type: DDR SDRAM
Speed: PC3200U-30330

Size: 256 MB
Type: DDR SDRAM
Speed: PC3200U-30330

Last login: Mon Jan 30 17:11:59 2006 <- now 19:17 .. o_O two hours to complete

Apple vDSP op / ip
1024 7374.600440 7713.662529
2048 7853.164936 7456.540444
4096 7064.090947 6710.886400
8192 4174.235560 4020.346691
16384 3914.683733 3964.236692
32768 2001.258370 3041.187190
65536 951.055646 1497.547872
131072 918.559330 1017.708018
weighted time 15.600000 13.250000 12.420000

FFTW3 interleaved op / ip (this may take a minute)
1024 7803.356279 6100.805818
2048 7030.452419 4101.097244
4096 6547.206244 4449.206453
8192 5192.947810 3575.472262
16384 4032.292258 3285.049287
32768 2574.508849 2419.790769
65536 1890.390535 1680.347142
131072 1777.025994 1692.656807
weighted time 8.120000 8.650000
     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 30, 2006, 04:00 PM
 
PM or IM me if you're interested in trying an experimental beta of the vDSP/FFTW mixed client. Extra preference given to those who have already run fft_test2 and have wisdom information already available. Some preference given to those who have a machine for which fft_test2 benchmark data has been posted.

Hint: it's fast. Only two data points right now, but on the reference unit, DP G5s are doing ~40 minutes where they once did ~60, and DC G5s are doing ~25 minutes where they once did 35-40. The only thing I don't know is if I broke anything by accident.

Edit: Actually, blah. Quads are memory bandwidth-limited. Might still be a little bit faster, though. I still need more DC numbers!
( Last edited by alexkan; Jan 30, 2006 at 04:07 PM. )
     
amigoivo
Fresh-Faced Recruit
Join Date: Sep 2005
Location: Germany
Status: Offline
Jan 30, 2006, 04:59 PM
 
Hello alexkan,

here are my results:
Standard Ooura
16384 2446.677333
32768 2246.948571
65536 2396.745143
131072 1485.482667

Optimized Ooura
16384 2446.677333
32768 2246.948571
65536 2396.745143
131072 1485.482667

Apple vBigDSP
16384 2936.012800
32768 3932.160000
65536 4194.304000
131072 2970.965333

Apple Altivec DSP
16384 7340.032000
32768 5242.880000
65536 5592.405333
131072 2970.965333

FFTW3 (this may take a minute)
16384 4893.354667
32768 3932.160000
65536 4194.304000
131072 2970.965333
And here are the results of test 2:
Apple vDSP op / ip
1024 11001.453115 10016.248358
2048 11017.873194 10545.678629
4096 9942.053926 9256.395034
8192 5894.697514 5777.584318
16384 6305.530846 5984.229911
32768 5298.068211 5500.726557
65536 2365.070097 4858.560290
131072 1658.213209 2925.258174
weighted time 8.160000 4.590000 6.880000
FFTW3 interleaved op / ip (this may take a minute)
1024 11184.810667 8388.608000
2048 10397.147944 5592.405333
4096 9702.486361 6291.456000
8192 8003.809468 5486.888252
16384 6140.680366 4971.026963
32768 5561.508066 4534.382703
65536 4161.790016 3508.960209
131072 3151.521238 2789.365985
weighted time 4.400000 5.030000
PowerMac QUAD G5
3GB PC4200 444

I hope you can use it.

Greetz, Ivo

@Karl
I read the whole thread and ran the first test I found.
The second test was already running shortly after my post, but then it took too long for my taste.
( Last edited by amigoivo; Jan 30, 2006 at 07:03 PM. )
Seti@home + Einstein@home + climateprediction.net
PowerMac QUAD G5 2,5GHz / 3GB RAM / 250GB HD
     
Karl Schimanek
Junior Member
Join Date: Oct 2004
Location: Germany
Status: Offline
Jan 30, 2006, 05:13 PM
 
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
Jan 30, 2006, 11:51 PM
 
Gods, I can't believe it.

I ran the thing after freshly rebooting and came home later to find the fans going
on and off - turns out I ran the damn thing with two copies of Alpha5 running.

Logging into another account because I can't seem to stop the damn thing and
will run it again. Stay tuned.

-Update - holy cats:

Apple vDSP op/ip
1024 11374.383729 10485.760000
2048 11356.884677 10855.845647
4096 9474.192565 8053.063680
8192 5777.584318 5728.805463
16384 5494.292959 3947.580235
65536 1241.320028 1890.390535
131072 1187.149519 1278.980592

Weighted Time 12.180000 10.530000 9.610000

FFTW3 Interleaved op/ip
1024 11484.810667 8603.700513
2048 10397.147944 5722.461271
4096 9702.486361 6340.995024
8192 7652.765193 5521.615392
16384 5763.951509 4349.648593
32768 3756.093134 3145.728000
65536 2479.773266 2270.067281
131072
( Last edited by Todd Madson; Jan 31, 2006 at 07:22 AM. )
     
Drash
Fresh-Faced Recruit
Join Date: Aug 2001
Location: UK
Status: Offline
Jan 31, 2006, 01:22 AM
 
fft_test2 edited into my previous post.

Any chance of some new threads for each test say? - after all this forum seems wasted if most traffic actually goes in to one of a couple of threads. I know it is connected but just seems pedantic to have a 500+ post thread in a forum with so little other traffic. Cheers
     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 31, 2006, 03:19 AM
 
Alright, I'll put future tests in their own threads, since the benchmark results tend to get pretty bloated. However, I still have more business with this thread...with alpha-6!

To begin with, alpha-6 is targeted at the particular group of users who will actually benefit from its changes. Specifically, alpha-6 is only for G4 users whose processors have less than 1 MB of L2 or L3 cache and for all G5 users, dual-core or dual-processor (or even single-processor). It's now a fat binary for both PPC and PPC970, so there's only one binary to download. As before, I don't have a Panther-based Mac, so I can only guarantee that this runs on Tiger.

Replacing an existing client is the same procedure as before, except for one thing. Since this version now uses FFTW, you'll need a wisdom file: either one that you've generated yourself by running fft_test2, or one from someone with the same model as you who's posted benchmarks and is willing to send it to you. I will eventually post the ones I've collected. (They're named 'bigfft_wisdom', if you hadn't noticed them before.) You'll need to place bigfft_wisdom in /Library/Application Support/BOINC Data. If you've got your BOINC support files installed somewhere else...I'll have to figure out how to deal with you, since the wisdom path is currently hard-coded.
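
If you want to double-check that the file is where the client expects it, something like the following Python sketch would do. The directory and file name come from the paragraph above. The helper name `wisdom_status` and its three return values are made up for illustration and are not part of the client.

```python
from pathlib import Path

# Location and file name as described in the post.
BOINC_DATA_DIR = Path("/Library/Application Support/BOINC Data")
WISDOM_NAME = "bigfft_wisdom"

def wisdom_status(data_dir=BOINC_DATA_DIR, name=WISDOM_NAME):
    """Return 'missing', 'empty', or 'ok' for the wisdom file in data_dir.

    Hypothetical helper: it only checks that the file exists and is not
    empty. It does not check that the wisdom itself is any good.
    """
    path = Path(data_dir) / name
    if not path.is_file():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    return "ok"

if __name__ == "__main__":
    print(wisdom_status())
```

Pass a different directory as the first argument to check a copy you keep elsewhere before moving it into place.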

Having this wisdom file in place is very important! If you don't have an appropriate wisdom file, alpha-6 may well be slower than alpha-5! You'll know if your client is set up properly by looking in the stderr output of completed work units for the phrase "successfully loaded FFTW wisdom from bigfft_wisdom". We finally print something!
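
A minimal Python sketch for checking a work unit's stderr text for that phrase, assuming you have already copied the stderr output into a string or file yourself. The function name is hypothetical, and the sample strings are invented placeholders, not real client output.

```python
# Phrase quoted in the post above.
WISDOM_OK_PHRASE = "successfully loaded FFTW wisdom from bigfft_wisdom"

def wisdom_loaded(stderr_text):
    """True if the stderr text contains the wisdom-loaded message."""
    return WISDOM_OK_PHRASE in stderr_text

# Invented placeholder text, just to show the call.
sample_good = "...\nsuccessfully loaded FFTW wisdom from bigfft_wisdom\n..."
sample_bad = "...\n(no wisdom message here)\n..."

print(wisdom_loaded(sample_good), wisdom_loaded(sample_bad))
```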

Happy crunching, and let me know if alpha-6 is an improvement for you!

Edit: I've pulled the release of alpha-6 because of validation issues. Those of you who already have the client, please send me your work units. I'll repost a binary once I've fixed the problems.
( Last edited by alexkan; Jan 31, 2006 at 05:07 PM. )
     
virex
Junior Member
Join Date: Feb 2001
Location: Reading
Status: Offline
Jan 31, 2006, 09:35 AM
 
all my CLI computers have the data files at ~/boinc including my quad. Is there a chance of getting that added to the directories that a6 checks? also is there any way to have this client know that there's no wisdom file and auto run the ffttest? or would that be too much of a mod to the seti code?

     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 31, 2006, 09:46 AM
 
Originally Posted by virex
all my CLI computers have the data files at ~/boinc including my quad. Is there a chance of getting that added to the directories that a6 checks?
On mine, it's in /Library/boinc/ - I don't think it makes much sense for the worker to check a bunch of different directories for wisdoms. A config file would be much handier, IMO - maybe it could even be rolled into app_info.xml?

also is there any way to have this client know that there's no wisdom file and auto run the ffttest? or would that be too much of a mod to the seti code?
I don't think that would be too clever, since the machine should otherwise be idling when fft_test2 runs, and automatic execution could therefore result in a poor wisdom if the user is unaware of this requirement. Either a controlled execution of fft_test2 or a download of wisdom files (once they become available) are probably the best solutions, the way I see it. But I'm sure Alex can chime in with more wisdom-wisdom

Happy crunching!

Ron
     
Mark Asiala
Guest
Status:
Jan 31, 2006, 01:11 PM
 
My first 4 results using the a6 client are all in the 2300-2400 s range as compared to 3500-3600s using a5. 50% speed increase, great!!

They have not been validated yet but this is the speed I was hoping for with my G5. My G4 with the 1MB L2 had otherwise improved over the stock client disproportionately as compared to my DP 2.5 G5.

-Mark
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 31, 2006, 01:25 PM
 
Originally Posted by Mark Asiala
My first 4 results using the a6 client are all in the 2300-2400 s range as compared to 3500-3600s using a5. 50% speed increase, great!!
What model G5 is it? Edit: Are you running SETI on both CPUs if it's the DP 2.5 you mention below?

They have not been validated yet but this is the speed I was hoping for with my G5. My G4 with the 1MB L2 had otherwise improved over the stock client disproportionately as compared to my DP 2.5 G5.
Am I assuming correctly that you are running alpha-6 on the G4? If so, I'd strongly advise you to use alpha-5 with this machine. My experience with alpha-6 so far shows that G4s with ≥1MB cache suffer significantly decreased performance compared to alpha-5...

HTH,

Ron
     
alexkan  (op)
Forum Regular
Join Date: Aug 2005
Location: Cupertino, CA
Status: Offline
Jan 31, 2006, 01:46 PM
 
Originally Posted by halimedia
On mine, it's in /Library/boinc/ - I don't think it makes much sense for the worker to check a bunch of different directories for wisdoms. A config file would be much handier, IMO - maybe it could even be rolled into app_info.xml?
This has the potential to get messy really fast. The catch is that reading a config file would be just as tricky as reading bigfft_wisdom: either way, the issue is figuring out where the file is so that I can open it. I'm going to have to spend more time with the BOINC API before I know what's going on. If any of you know of any documentation for app_info.xml, please let me know, since it would be nice to just bundle the file in the same directory.
I don't think that would be too clever, since the machine should otherwise be idling when fft_test2 runs, and automatic execution could therefore result in a poor wisdom if the user is unaware of this requirement. Either a controlled execution of fft_test2 or a download of wisdom files (once they become available) are probably the best solutions, the way I see it. But I'm sure Alex can chime in with more wisdom-wisdom
When there's no bigfft_wisdom file, the client uses plan generation that takes less time but generates less optimal (read: slower) FFTs. Until I figure out a better way to take care of file paths and can actually write out accumulated wisdom, spending more time generating FFT plans for each work unit won't be a good tradeoff for work unit completion time.
     
Todd Madson
Mac Elite
Join Date: Apr 2000
Location: Minneapolis, MN USA
Status: Offline
Jan 31, 2006, 02:24 PM
 
Alex & all:

Ran the fft_test2 four times in the last 24 hours and ran into a weird issue every time
I tried to run it in >console mode rather than the gui for best results: At the end it
hangs the machine with a blue screen and creates the spinny round graphic in the
center - I left the house for work this morning and stopped by at noon to check it
out and it was still spinning.

Edit: it runs through the lines I entered above, and I have no idea what happens after that,
whether it reaches the last line or crashes at some point along the way.
I can run it again if you like.

Update: The bigfft_wisdom file I generated was created while I had two copies of A5 going, as
I discussed previously.

I'll login to the gui on an account that loads nothing at launch and run it there and
hope it generates the file.
( Last edited by Todd Madson; Jan 31, 2006 at 02:44 PM. )
     
halimedia
Dedicated MacNNer
Join Date: Oct 2005
Location: Switzerland
Status: Offline
Jan 31, 2006, 02:58 PM
 
Todd: >console sometimes behaves oddly - e.g. when you have Timbuktu Pro installed and set to load at startup. Same might apply to VNC or ARD, and possibly with other GUI-dependent startup services.

I thought about the best way to create a wisdom-file. A safe boot (press shift on boot) might be best. And then either use >console or ssh from another machine to run fft_test2.
     
Mark Asiala
Fresh-Faced Recruit
Join Date: Jan 2006
Location: Maryland
Status: Offline
Jan 31, 2006, 02:59 PM
 
Originally Posted by halimedia
What model G5 is it? Edit: Are you running SETI on both CPUs if it's the DP 2.5 you mention below?

Am I assuming correctly that you are running alpha-6 on the G4? If so, I'd strongly advise you to use alpha-5 with this machine. My experience with alpha-6 so far shows that G4s with ≥1MB cache suffer significantly decreased performance compared to alpha-5...
I am running alpha-6 on a DP 2.5 G5 / 0.5 MB L2 / 1.5 GB RAM, using both processors. I am running alpha-5 on a G4-500 / 1MB L2.

With the stock client, the ratio of the G4 processing time to the G5 processing time was closer to the 4-5x you would expect. With alpha-5, that ratio was about 2.5x since the G5 was not getting sped up as much as the G4. Now the G5 / alpha-6 is closer to 4x faster than the G4 / alpha-5.
     
Knightrider
Dedicated MacNNer
Join Date: Sep 2004
Location: London
Status: Offline
Jan 31, 2006, 03:44 PM
 
Originally Posted by alexkan

If any of you know of any documentation for app_info.xml, please let me know, since it would be nice to just bundle the file in the same directory.
If it helps, there are some here:-

http://www.google.com/search?domains...q=app_info.xml

K.
     
Knightrider
Dedicated MacNNer
Join Date: Sep 2004
Location: London
Status: Offline
Jan 31, 2006, 03:45 PM
 
Sorry about the double post.

K.
     
E.T from tellus
Fresh-Faced Recruit
Join Date: Nov 2005
Location: Finland, Tuusula
Status: Offline
Jan 31, 2006, 04:54 PM
 
Wow, over 20% improvement on the Quad
1,898.82 sec (validated) compared to 2,475 sec (long-time average) = about 23.3% less time
http://setiathome.berkeley.edu/show_...hostid=1781916
PS: New Estimated RAC is 5111
Thanks Alex & Rick
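
For anyone comparing their own before/after times, here is a small Python sketch of the arithmetic, using the two figures above. Note that "percent less time" and "percent more work per hour" are different numbers for the same pair of times. The function names are just illustrative.

```python
def time_reduction_pct(old_sec, new_sec):
    """Percent less time per work unit: (old - new) / old."""
    return (old_sec - new_sec) / old_sec * 100.0

def throughput_gain_pct(old_sec, new_sec):
    """Percent more work units per unit of time: old / new - 1."""
    return (old_sec / new_sec - 1.0) * 100.0

old_avg = 2475.0    # long-time average from the post, in seconds
new_time = 1898.82  # validated alpha-6 result from the post, in seconds

print(round(time_reduction_pct(old_avg, new_time), 2))
print(round(throughput_gain_pct(old_avg, new_time), 2))
```

The second figure is the one that tracks credit per day, since it measures how many more units finish in the same amount of time.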
     
 
 