OK - Pirates is a closed ship at the moment I guess, so here is a summary. This is happening on my Quad.
I discovered that when running several projects at once (one or two projects on their own seem to run OK) with the project preferences set to 'Leave applications in memory while preempted = Yes', these error messages started to occur for all the projects. I ran up to five projects. One project fails, then the other projects fail in turn.
On this occasion it started when I attached to Pirates as my third project. Not all WUs failed at the same time; it was progressive, and some succeeded and some did not. Pirates were sending very short WUs, less than a minute, so I detached from them thinking this might be the problem, but the problem continued.
2006-05-22 04:46:08 [Pirates@Home] Can't create shared memory: system shmget
2006-05-22 04:46:08 [Pirates@Home] Unrecoverable error for result wu_1148262635_451_0 (Couldn't start or resume: -144)
2006-05-22 04:46:08 [Pirates@Home] Can't create shared memory: system shmget
2006-05-22 04:46:08 [Pirates@Home] Unrecoverable error for result wu_1148262635_452_0 (Couldn't start or resume: -144)
2006-05-22 04:46:08 [Pirates@Home] Can't create shared memory: system shmget
2006-05-22 04:46:08 [Pirates@Home] Unrecoverable error for result wu_1148262635_453_0 (Couldn't start or resume: -144)
2006-05-22 04:46:08 [Pirates@Home] Can't create shared memory: system shmget
2006-05-22 04:46:08 [Pirates@Home] Unrecoverable error for result wu_1148262635_454_0 (Couldn't start or resume: -144)
2006-05-22 04:46:08 [Pirates@Home] Unexpected state 7 for task wu_1148262635_451_0
2006-05-22 04:46:08 [Pirates@Home] Unexpected state 7 for task wu_1148262635_452_0
2006-05-22 04:46:08 [Pirates@Home] Unexpected state 7 for task wu_1148262635_453_0
2006-05-22 04:46:08 [Pirates@Home] Unexpected state 7 for task wu_1148262635_454_0
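For anyone trying to reproduce this, a rough Python sketch for tallying these failures from a saved copy of the messages might look like the following. The line layout (date, time, project in square brackets, message) is assumed only from the lines shown above, and the function name `summarize` is my own invention, not anything from BOINC.

```python
import re
from collections import Counter

# Three sample lines copied from the messages above.
LOG = """\
2006-05-22 04:46:08 [Pirates@Home] Can't create shared memory: system shmget
2006-05-22 04:46:08 [Pirates@Home] Unrecoverable error for result wu_1148262635_451_0 (Couldn't start or resume: -144)
2006-05-22 04:46:08 [Pirates@Home] Unexpected state 7 for task wu_1148262635_451_0
"""

# Assumed layout: "<date> <time> [<project>] <message>"
LINE_RE = re.compile(r"^(\S+ \S+) \[([^\]]+)\] (.*)$")
RESULT_RE = re.compile(r"Unrecoverable error for result (\S+)")


def summarize(log_text):
    """Count shared memory failures per project and list the failed results."""
    shm_failures = Counter()
    failed_results = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that does not look like a message line
        _, project, message = match.groups()
        if "Can't create shared memory" in message:
            shm_failures[project] += 1
        result = RESULT_RE.search(message)
        if result:
            failed_results.append((project, result.group(1)))
    return shm_failures, failed_results


shm_failures, failed_results = summarize(LOG)
print(dict(shm_failures))
print(failed_results)
```

Running it over the messages from several projects would show whether the shmget failures really do spread from one project to the next.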
The Captain of Pirates has taken an interest and, working with him, we determined that I had plenty of RAM and virtual memory. It occurred to me to check my project preference setting for memory, so I switched the project preferences (for all projects) to 'Leave applications in memory while preempted = No', closed BOINC and rebooted the machine. No more error messages. When I changed the preference back to 'Yes', the error messages slowly resumed.
There was a little side trip investigating 'slots', and John McLeod has kindly explained to me how slots work. I copy it here for the sake of completeness and general edification.
Slots are created and used as needed.
If you are attached to say 30 projects, but only have at most work from 5
of them on hand at any time, and you never have more than 1 result running
from each project, then you will have 5 slots.
On the other hand, if you have a project with mixed deadlines, it may
start a result with a distant deadline, and then pre-empt that one to
start one with a near deadline. So even on a single CPU system, it is
possible to have 2 slots for a single project.
It is also possible to have an extra slot if it cannot be cleaned up for
some reason (BURP was infamous for this problem).
In your case, what it means:
You have never had more than 16 results running or pre-empted on your
system at the same time. You currently have 15. Rosetta has used 2 more
slots than there are CPUs. Either they could not be cleaned up, or it has
had to do some pre-emption to meet deadlines. None of the others really
needs much explaining.
jm7
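To make the slot explanation concrete, here is a toy Python model of slots being created and reused as needed. It is only a sketch of the idea described above: the class `SlotPool` and the rule of reusing the lowest-numbered free slot are my own assumptions, not taken from the BOINC source.

```python
import heapq


class SlotPool:
    """Toy model: a result takes a free slot if one exists, else a new slot."""

    def __init__(self):
        self.free = []     # min-heap of slot numbers that have been cleaned up
        self.in_use = {}   # result name -> slot number (running or pre-empted)
        self.created = 0   # total number of slots ever created

    def start(self, result):
        """Give a result a slot; it keeps it while running or pre-empted."""
        if self.free:
            slot = heapq.heappop(self.free)  # assumed policy: lowest free slot
        else:
            slot = self.created
            self.created += 1
        self.in_use[result] = slot
        return slot

    def finish(self, result):
        """Release the slot of a finished result so it can be reused."""
        heapq.heappush(self.free, self.in_use.pop(result))


# Single CPU, one project with mixed deadlines, as in the explanation above.
pool = SlotPool()
pool.start("distant_deadline")   # starts first, later pre-empted, keeps its slot
pool.start("near_deadline")      # needs a second slot
pool.finish("near_deadline")
pool.start("next_result")        # reuses the freed slot
print(pool.created)
```

In this model a pre-empted result still holds its slot, which is why one project on one CPU can end up with two.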
The Captain, a knowledgeable fellow it has to be said, suggested that:
Each task uses a small bit of shared memory, and in Unix there is a kernel parameter which limits the total amount of shared memory which may be used. The Activity Monitor on Mac is very useful, but it does not show this limit (as best I can tell). To see what the limit is on your machine you need to open a Unix command shell (the Terminal app on a Mac) and give the command
sysctl -A | grep shmmax
On both Tiger and Panther machines I found a limit of only about 4MB. On two Linux boxes (Fedora Core 3 and 4) I found it was more like 32MB.
So even though you have lots of main memory available, I can see how you might hit the limit on shared memory, if it's only 4MB on a Mac and you have 16 tasks running or suspended but still in memory.
At the end of the output from running this command in Terminal, I had:
kern.sysv.shmmax: 4194304
This is the shared memory limit in bytes; 4194304 bytes is exactly 4 MB, matching the figure above.
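As a rough check on the numbers, here is a small Python sketch that pulls the shmmax value out of saved sysctl output and works out how much each task would get if the limit were split evenly. The even split is purely my assumption for illustration; I do not know how much shared memory each BOINC task actually asks for. The function name `parse_shmmax` is also made up, and the `=` separator is accepted as well as `:` because Linux sysctl prints its values that way.

```python
import re

# Matches e.g. "kern.sysv.shmmax: 4194304" or "kernel.shmmax = 33554432"
SHMMAX_RE = re.compile(r"^\s*(\S*shmmax)\s*[:=]\s*(\d+)\s*$")


def parse_shmmax(sysctl_output):
    """Return the shmmax value in bytes from sysctl-style text, or None."""
    for line in sysctl_output.splitlines():
        match = SHMMAX_RE.match(line)
        if match:
            return int(match.group(2))
    return None


limit_bytes = parse_shmmax("kern.sysv.shmmax: 4194304")
tasks_in_memory = 16  # running or pre-empted results, from the slot count above

print(limit_bytes / 2**20)                       # limit in MB -> 4.0
print(limit_bytes // tasks_in_memory // 1024)    # KB per task, even split -> 256
```

So with 16 tasks held in memory, each would have to fit in about 256 KB of shared memory to stay under a 4 MB limit.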
I looked around the web for references to shared memory on Mac OS and found two of interest.
I noted this:
"The disadvantage of shared memory is that it is very fragile. When a data structure in a shared memory region becomes corrupt, all processes that refer to the data structure are affected."
This page also makes the same point.
"Shared memory is fragile. If one program corrupts a section of shared memory, any programs that also use that memory share the corrupted data."
The Captain thinks there is a way to increase the size of the shared memory and is investigating; due to a bug, a workaround is required.
That's it. I will let you know of any further developments.
If anyone has the time and wants to play, it would be interesting to know whether they can reproduce this problem on their machine, or whether it is just me.
K.