|
|
Tracking down the PPP hang
|
|
|
|
Addicted to MacNN
Join Date: Mar 2000
Location: London, UK
Status:
Offline
|
|
ONLY FOLLOW THESE INSTRUCTIONS IF YOU ARE COMFORTABLE WITH THE TERMINAL. SINCE YOU ARE MESSING WITH LOW-LEVEL BITS OF THE SYSTEM, IF YOU DON'T FULLY UNDERSTAND WHAT THESE INSTRUCTIONS TELL YOU TO DO YOU CAN SCREW UP YOUR INSTALLATION OF MAC OS X. You have been warned.
Continuing the search for the (real) cause.
Latest package at http://homepage.mac.com/fdobbie. Make sure you read the warnings etc etc.
There are many more lines this time, starting with the text "fnarked", like this:
Feb 17 23:30:48 localhost pppd[372]: pppd 2.4.0 started by root, uid 0
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at acquirePort start
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at for(; - after readPortState
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at for(; - after changePortState
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at do {
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at do :: 1
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at do :: 2
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at do :: 3
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at do :: 4
Feb 17 23:30:48 localhost mach_kernel: fnarked: 0s at do :: 5
Feb 17 23:30:48 localhost mach_kernel: AppleSCCModem(174d680): setPowerState is called -- powerStateOrdinal = 1
Hope this will narrow it down to at least one or two function calls within acquirePort.
As before, please only reply if you are going to be genuinely useful, not give mama's home-baked fix.
|
|
|
|
|
|
|
|
|
Addicted to MacNN
Join Date: Mar 2000
Location: London, UK
Status:
Offline
|
|
I got a brief (~1s) hang, and appeared between do :: 2 and do :: 3. Looks like it's somewhere in OpenScc(). Be nice if someone else would verify that, though
|
|
|
|
|
|
|
|
|
Moderator Emeritus
Join Date: Mar 2001
Location: Austin, MN, USA
Status:
Offline
|
|
Is there any way to avoid rebooting? Isn't there a process I can restart rather than the whole computer?
|
|
|
|
|
|
|
|
|
Addicted to MacNN
Join Date: Mar 2000
Location: London, UK
Status:
Offline
|
|
This is what I use:
sudo -s
kextunload -m com.apple.driver.AppleSCCSerial
kextload /System/Library/Extensions/AppleSCCSerial.kext
but rebooting is the safest way to avoid **** -ups on your part
Anyway, it actually looks like it's happening in SccSetupReceptionChannel. If you can just confirm it happens between 2 and 3 for you, I'll rejig the logging to be called from there. We're getting closer. I can smell it.
|
|
|
|
|
|
|
|
|
Moderator Emeritus
Join Date: Mar 2001
Location: Austin, MN, USA
Status:
Offline
|
|
That's pretty simple. I'll just write a shell script with the unload/load commands in it.
Anyway, I just restart before this was posted so it's loaded as of now. When I get a hang, I'll post where it came from.
|
|
|
|
|
|
|
|
|
Moderator Emeritus
Join Date: Mar 2001
Location: Austin, MN, USA
Status:
Offline
|
|
Between 2 and 3:
Feb 18 00:20:29 s003 pppd[741]: pppd 2.4.0 started by root, uid 0
Feb 18 00:26:17 s003 mach_kernel: fnarked: 0s at acquirePort start
Feb 18 00:26:17 s003 mach_kernel: fnarked: 0s at for(; - after readPortState
Feb 18 00:26:17 s003 mach_kernel: fnarked: 0s at for(; - after changePortState
Feb 18 00:26:17 s003 mach_kernel: fnarked: 0s at do {
Feb 18 00:35:35 s003 mach_kernel: fnarked: 0s at do :: 1
Feb 18 00:35:35 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 00:35:35 s003 mach_kernel: fnarked: 905s at do :: 3
Feb 18 00:35:35 s003 mach_kernel: fnarked: 905s at do :: 4
Feb 18 00:35:35 s003 mach_kernel: fnarked: 905s at do :: 5
Feb 18 00:35:35 s003 mach_kernel: AppleSCCModem(3319e00): setPowerState is called -- powerStateOrdinal = 1
Here's another:
Feb 18 01:47:27 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 01:47:27 s003 mach_kernel: fnarked: 455s at do :: 3
And another (even though you don't need it):
Feb 18 01:54:24 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 01:54:24 s003 mach_kernel: fnarked: 301s at do :: 3
Last one (got these last two while finding out that reloading the PPP extension will not stop the hang, hehe.):
Feb 18 01:58:23 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 01:58:23 s003 mach_kernel: fnarked: 173s at do :: 3
OK, here's another, but this one was because I had to test something else. Is it just me, or is it weird how the times of the hang from the above ones seemed to almost halve themselves? Then I got this:
Feb 18 02:00:21 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 02:00:21 s003 mach_kernel: fnarked: 0s at do :: 3
And right after that I got this:
Feb 18 02:08:49 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 02:08:49 s003 mach_kernel: fnarked: 495s at do :: 3
And then this:
Feb 18 02:11:09 s003 mach_kernel: fnarked: 0s at do :: 2
Feb 18 02:11:09 s003 mach_kernel: fnarked: 0s at do :: 3
Man, it just can't decide how long it's going to hang for. This post is officially too long. Maybe I'll shorten it--I just did.
[ 02-18-2002: Message edited by: Xeo ]
|
|
|
|
|
|
|
|
|
Addicted to MacNN
Join Date: Mar 2000
Location: London, UK
Status:
Offline
|
|
I've started getting it again (although thankfully not that long, only about 7 seconds), meaning I've nailed down exactly where it happens. Here's a copy of the e-mail I just sent to various people including the darwin-development mailing list and all the relevant engineers:
Hi all,
I've been working on tracking down this issue for a while now, and I've narrowed it down quite considerably from the previous "it's happening in acquirePort". Ron, could you add this in comments to the right radar, I forget the number offhand. Also posting to darwin-development because it's happening in a generic IOMallocContiguous call. WHY DOES THIS TAKE 7 SECONDS (and longer for some people, up to 10 or 20 minutes)? :-)
Perhaps dmaInfo->dmaNumberOfDescriptors randomly becomes exorbitantly large, or something.
I'd also appreciate it if you people could KEEP ME INFORMED instead of just swallowing up my input like usual. Oh, and I'd like to ask why you couldn't do something like this yourselves.
----
It's actually happening on the line in SccSetupReceptionChannel (SccChipPrimatives.cpp in AppleSCCSerial):
dmaInfo->dmaChannelCommandArea = (IODBDMADescriptor *)IOMallocContiguous(sizeof(IODBDMADescriptor) * dmaInfo->dmaNumberOfDescriptors, PAGE_SIZE, NULL);
Found this out like so:
AbsoluteTime timeStart;
clock_get_uptime(&timeStart);
...
fnarkTheFnoodle(timeStart, "SccSetupReceptionChannel :: 3");
dmaInfo->dmaChannelCommandArea = (IODBDMADescriptor *)IOMallocContiguous(sizeof(IODBDMADescriptor) * dmaInfo->dmaNumberOfDescriptors, PAGE_SIZE, NULL);
fnarkTheFnoodle(timeStart, "SccSetupReceptionChannel :: 4");
and fnarkTheFnoodle is like this:
void fnarkTheFnoodle(AbsoluteTime timeStart, const char *description) {
UInt64 timeElapsed; // in nanoseconds
AbsoluteTime timeStop;
clock_get_uptime(&timeStop);
SUB_ABSOLUTETIME(&timeStop, &timeStart);
absolutetime_to_nanoseconds(timeStop, &timeElapsed);
timeElapsed /= 1000000000;
IOLog("fnarked: %us at %s\n", (unsigned)timeElapsed, description);
}
giving the following output in system.log:
Feb 18 20:03:45 localhost mach_kernel: fnarked: 0s at SccSetupReceptionChannel :: 3
Feb 18 20:03:45 localhost mach_kernel: fnarked: 7s at SccSetupReceptionChannel :: 4
----
Well, I hope that helps somewhat.
-- Finlay
|
|
|
|
|
|
|
|
|
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Status:
Offline
|
|
Finlay,
this is fantastic news: my sincere thanks for putting in the hard yards on this one. Although I'm not being bitten too badly these days it's been both (a) something of a public embarrassment that the platform will be well rid of, and (b) a monumental pain for those getting the 4 -> 30 minute stalls on a regular basis.
Best Wishes,
Paul
|
|
|
|
|
|
|
|
|
Grizzled Veteran
Join Date: Mar 2001
Status:
Offline
|
|
Somebody rocks and his name is Angus_D!
When I can finally leave my computer on 24-7 (w/sleep, of course) and not have to take a 2 hr nap whenever I connect to the internet, OS X is finally going to rock my world.
Apple should hire Angus_D and put him in charge of open transport. Better yet, Apple should hire Angus_D so he can go bitch-slap all the buttmunches at Apple who left this bug in OS X for so long!
|
[FONT="book antiqua"]"If a nation expects to be ignorant and free, in a state of civilization, it expects what never was and never will be."
- Thomas Jefferson, 1816.[/FONT]
|
|
|
|
|
|
|
|
Junior Member
Join Date: Oct 2001
Status:
Offline
|
|
Originally posted by OverclockedHomoSapien:
<STRONG>Apple should hire Angus_D and put him in charge of open transport. Better yet, Apple should hire Angus_D so he can go bitch-slap all the buttmunches at Apple who left this bug in OS X for so long!</STRONG>
Yeah, big time props . I was pretty excited to see his post on the darwin-development list, but feedback (from said buttmunches ) was that the problem he identified may not be the complete solution. It does highlight a flaw (IMO) in the way memory is allocated by the operating system though, so when/if they fix that it should benefit any program that needs a contiguous block of memory.
To be fair, the Apple guys seem sincerely nice, but this has obviously not been a priority for them since they never use PPP. Up until now most OS X users have been early adopters who are also more likely to be broadband users, but now that it's the default OS they need to make sure that this sort of thing doesn't exist. Hopefully this has been what is holding up 10.1.3 and Angus_D's hard work was enough to get it out the door (or at least shame them into allocating more resources to it).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Forum Rules
|
|
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
|
HTML code is Off
|
|
|
|
|
|
|
|
|
|
|
|