moki · Jun 22, 2001, 01:42 AM

In the course of working on my product, Snapz Pro X, I've been having a hard look at how the OS X window server works. Under MacOS 8/9, each window has a "region" that specifies the window's shape. The windows aren't buffered in any way, so when they need to be redrawn, the application that created the window has to manually redraw its contents.

Under OS X, each window has an offscreen buffer that contains not only the window's contents, but also an alpha mask that determines what parts of the window are visible, and their transparency. Because each window has a buffer, the application doesn't need to redraw a window's contents unless it has changed something in a window. You can think of it this way: each window has a mini "screen" all to itself.

This has a number of implications. The first is that each window takes up a good bit of memory when it is opened, regardless of whether it contains anything or not. An 800x600 window in Millions of colors would take up about 1.9 MB of RAM. Obviously this causes OS X to use up memory at a faster clip than OS 8/9 (the same window under OS 8/9 would use up less than 100 bytes of memory -- no matter how large the window is, it doesn't use up any more memory for all practical purposes).
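
As a rough check of that figure, here is a minimal Python sketch. It assumes "Millions of colors" is stored as 32 bits per pixel (24 bits of color plus an 8-bit alpha mask) and ignores any per-window bookkeeping; the function name is made up for illustration.

```python
def window_buffer_bytes(width, height, bits_per_pixel=32):
    """Bytes needed for one uncompressed window backing buffer.

    Assumption: 32 bits per pixel = 24 bits of color + 8 bits of alpha.
    """
    return width * height * bits_per_pixel // 8


size = window_buffer_bytes(800, 600)
print(size)                  # 1920000
print(round(size / 1e6, 2))  # 1.92 (megabytes)
```

The cost depends only on the window's dimensions, not on what has been drawn into it, which is why even an empty window is expensive.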

This also adds an indirection: the window's buffer needs to be drawn into before the window itself can be drawn to the screen. Under OS 8/9, each window just draws directly to the screen -- obviously this extra indirection means that things must be drawn twice before you see them: once to the window's buffer, another time to composite the window on the screen.

The alpha mask of the window's buffer is also used to define the window's shape. Where there is no alpha mask, the window is transparent, and clicks pass through it. Where there is an alpha mask, parts of the window show through (how much shows through depends on the alpha mask). This certainly should allow for some interesting window shapes, to say the least. This is why you can have a perfectly round, semi-transparent floating clock window, for instance. The clock's frame is just a TIFF file with an alpha mask that defines the shape of the window.

When the user clicks on something, OS X first checks the bounding rectangle of each window (from the frontmost window to the backmost window) to see if that click happened inside of the window. If the click was indeed inside of the window's rectangle, it then checks the alpha mask of the window's buffer to see if the click happened on part of the window that it would consider "solid". (sidenote: there is a way to specify windows that are entirely see-through, yet clicks don't pass through -- I'm assuming that either a special "close to but not quite see-through" alpha value is used in this case, or there is an additional alpha mask or other mechanism that is used to control opacity)
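
The two-stage click test described above can be sketched in Python. This is only an illustration of the idea, not how the window server actually does it: the `Window` class, the `threshold` parameter, and the rule that any alpha above the threshold counts as "solid" are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Window:
    name: str
    x: int          # left edge in screen coordinates
    y: int          # top edge in screen coordinates
    width: int
    height: int
    alpha: list     # alpha[row][col], values 0..255


def hit_test(windows, px, py, threshold=0):
    """Return the frontmost window that is 'solid' at (px, py), or None.

    `windows` must be ordered front to back.
    """
    for w in windows:
        lx, ly = px - w.x, py - w.y
        # Stage 1: cheap bounding-rectangle rejection.
        if not (0 <= lx < w.width and 0 <= ly < w.height):
            continue
        # Stage 2: consult the alpha mask at the clicked pixel.
        if w.alpha[ly][lx] > threshold:
            return w
    return None


# A tiny 3x3 "clock" whose only solid pixel is its center, over a document.
clock = Window("clock", 10, 10, 3, 3, [[0, 0, 0], [0, 128, 0], [0, 0, 0]])
doc = Window("doc", 0, 0, 100, 100, [[255] * 100 for _ in range(100)])

print(hit_test([clock, doc], 11, 11).name)  # clock
print(hit_test([clock, doc], 10, 10).name)  # doc (click passes through)
print(hit_test([clock, doc], 200, 200))     # None
```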

In addition, the compositing mechanism that the OS X windowing system uses has some unique properties. To quote Apple's system overview page:

Traditional windowing systems use a "switch" model in which every pixel on a screen belongs entirely to one window (or the desktop). Because of this model, transitions are necessarily abrupt; when you close a window, for example, it disappears immediately. A layered compositing window system, on the other hand, is based on a "video mixer" model in which every pixel on the screen -- particularly in the attributes of translucency and anti-aliasing -- can be shared among windows in real time. This model allows for smooth transitions between the states of a graphical user interface, one of the distinctive characteristics of the Aqua experience.

I don't know what specific algorithm Apple is using to do this mixing, but what this means to me is that a window's contents aren't drawn to the screen, they are composited together with all of the other windows on the screen using each window's alpha mask. And indeed, if you scroll a document (by grabbing the scrollbar and dragging) that has no transparent windows over it, and then scroll the same document with a number of transparent windows over it, it doesn't seem to affect the scrolling speed as you might expect.
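
To make the "video mixer" idea concrete, here is a minimal Python sketch of one common way to mix layers, the standard "source over" blend. Whether Quartz uses exactly this formula is an assumption; the function names and the 0..255 integer alpha are chosen only for illustration.

```python
def over(src, alpha, dst):
    """Blend one color channel (0..255) of src over dst.

    alpha is the source coverage, 0 (invisible) to 255 (opaque).
    """
    return (src * alpha + dst * (255 - alpha)) // 255


def composite_pixel(layers, background):
    """Mix one screen pixel from a back-to-front list of (value, alpha)."""
    result = background
    for value, alpha in layers:
        result = over(value, alpha, result)
    return result


# An opaque black window with a half-transparent white window on top of it.
print(composite_pixel([(0, 255), (255, 128)], background=200))  # 128
# A fully transparent layer leaves the background untouched.
print(composite_pixel([(50, 0)], background=200))               # 200
```

Note that in this model an opaque layer makes everything beneath it irrelevant, so a compositor is free to skip the layers it hides.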

Sure, this lets you do some cool stuff with transparency -- but at what cost? How often are parts of the computer screen transparent? I don't understand the nuts and bolts of the windowing system well enough to say for sure, but it sure seems to me that this choice of a "video mixer" model could certainly be impacting performance in terms of screen drawing, especially if it isn't easily accelerated by traditional video cards. OpenGL bypasses all of this, and is one of the few parts of OS X that is faster than OS 9. Interesting indeed.

I still don't think that the imaging model is the sole reason for OS X's slowness -- the event dispatching mechanism, and numerous other little parts of the OS that haven't been optimized yet are likely equal culprits. It does seem to me, though, that this design decision must have an impact. Hopefully I'll be able to uncover more information on the imaging model to fill in some missing pieces -- and anyone who is in the know, if I have any of this wrong, please do correct me. I'm interested in learning.

johnnylundy · Jun 22, 2001, 01:59 AM

Originally posted by moki:

Man, if a major developer can't find out from Apple how the system works, we are hosed. I betcha Microsoft knows - they always know because they can threaten to kill MS office and explorer.

They actually think it's a GOOD thing that a window doesn't close immediately? Jezus Crist on a pogo stick! What in the hell are they smoking now?

I wish you luck. Steve keeps saying the train is leaving the station, but it seems that if you try to get on it, the doors are locked.

moki · Jun 22, 2001, 02:08 AM

Originally posted by johnnylundy:

Man, if a major developer can't find out from Apple how the system works, we are hosed. I betcha Microsoft knows - they always know because they can threaten to kill MS office and explorer.

Well, in fairness, I'm not that major of a developer, and also, it's common practice that some APIs and information are kept private. I think also OS X is new enough that the information just isn't out there yet -- either that, or I haven't found it.

intastella · Jun 22, 2001, 02:19 AM

A lot of people have talked about Quartz, its pros and cons, and how it enables you to do some amazing things; there was a great thread over at the Appleinsider boards a while back on this. The conclusion I think people have come to about Quartz is that Apple made a decision, and a very ballsy one at that, to throw RAM and processors to the wind and do an all-compositing display layer.

I don't know if anyone has clarified whether Quartz uses a video card to accelerate an opaque area on screen, but in my mind, this seems like common sense. To have some kind of opaque/alpha mode that can mix and match the two types on screen. Whether keeping track of all this negates the acceleration produced is a good question though. But it seems like only about 10 to 15 percent on screen really needs to be composited and the rest is completely opaque.

I am totally talking out my arse here so engineers go forth and shoot me down.

inta

frawgz · Jun 22, 2001, 02:59 AM

Originally posted by <intastella>:
I don't know if anyone has clarified whether Quartz uses a video card to accelerate an opaque area on screen, but in my mind, this seems like common sense. To have some kind of opaque/alpha mode that can mix and match the two types on screen. Whether keeping track of all this negates the acceleration produced is a good question though. But it seems like only about 10 to 15 percent on screen really needs to be composited and the rest is completely opaque.

I wouldn't know anything about this either, but I read that every window/layer in Quartz is arbitrarily transparent, which probably means that even though a layer is completely opaque, it is still treated by the algorithm as if it's not. As such, it still must go through the paces, so being "opaque" wouldn't help speed at all. I'm sure somebody can clarify..

moki · Jun 22, 2001, 05:42 AM

Originally posted by frawgz:


I wouldn't know anything about this either, but I read that every window/layer in Quartz is arbitrarily transparent, which probably means that even though a layer is completely opaque, it is still treated by the algorithm as if it's not. As such, it still must go through the paces, so being "opaque" wouldn't help speed at all. I'm sure somebody can clarify..

hrm, see, if that's the case, then the OS should slow down rather quickly as more windows are opened, which doesn't seem to happen. The algorithm must be a bit more sophisticated than that.

Also, I do know that it is possible programmatically to make a window that is entirely opaque. In such windows, the alpha channel is ignored.

I'm looking at all of the transparent window title bars on my screen, and I'm wondering how much CPU it must churn merely to composite them.

GnOm · Jun 22, 2001, 05:55 AM

Originally posted by johnnylundy:
They actually think it's a GOOD thing that a window doesn't close immediately?

I'm no developer, but wasn't there always some "eye candy" to how windows open and close? Those spinning rectangles and such...
To me it's not annoying to fade things out if done at a reasonable speed (smooth and just about noticeable), but it would be if they'd fade things in (which is fortunately not the case; neither menus nor windows fade in).

cu

theolein · Jun 22, 2001, 09:19 AM

This means one thing mainly, uuhh, actually two: Those with video cards with lots of memory are a LOT better off, and the event mechanism is done in a dumb way. Pardon me, Apple. This is what I've been yacking about, with events being passed from one object (each window's model is an object) to another until it finds the right one. Am I wrong here? Whereas having a listener for the screen that always knows which window is where and dispatches the event off to the handler for that window would be better. Also this explains the uneven effect of window sizing and scrolling and window dragging to a certain extent: You drag, it draws to the buffer, gets composited, it gets drawn. This IS very similar to double buffering in Swing, which slows things down a lot. I know there are algorithms to only update the parts of the screen that have been changed, but Apple doesn't seem to use them very well, if at all. I also think I now know why Apple is using a separate dev tree for 10.1. I would imagine that they are having to change some parts of the event/drawing model.

krove · Jun 22, 2001, 11:14 AM

People have often been comparing the speed of the OS X GUI to that of Win2000, but having used Win2000 yesterday, I found that its windows are not double-buffered. Move windows really fast over one another and you're left with white space for just a moment until it is redrawn. I don't know about XP, but this speed comparison to Windows (resize, etc) doesn't seem valid anymore because the windowing is done so differently. As moki said, double buffering requires two writes for the same window.

mudmonkey · Jun 22, 2001, 11:44 AM

hrm, see, if that's the case, then the OS should slow down rather quickly as more windows are opened, which doesn't seem to happen. The algorithm must be a bit more sophisticated than that.

Also, I do know that it is possible programmatically to make a window that is entirely opaque. In such windows, the alpha channel is ignored.

I'm looking at all of the transparent window title bars on my screen, and I'm wondering how much CPU it must churn merely to composite them.

Interesting Topic!

I don't think that the CPU is probably "churning" on the title bars for the mere fact of the double-buffered nature. The visible screen has been rendered, and it does not need to recalculate the transparencies (even when you move the top window).

Notice (like you said) how more windows don't slow things down. I can drag a window with all applications (other than topmost) hidden and it is no faster than when all applications (with 20 other windows) are visible.

Once the screen is rendered, moving a window (since it is all buffered) means not having to recalculate "the background" (everything behind it). Hence, we get the nice smooth move (without trying to redraw the contents of the background windows--it has all been rendered). We have to remember that it is a layered system.

I'm actually amazed (you have stated similarly) at how fast it does what it does. I remember people complaining about the rects when you would open/close windows on a Fat Mac and how they seemed slow...

I tend to think that most of the fault for slow window resizing is due to the particular applications not aqua/quartz. Omniweb is terribly slow at rendering pages and when you resize a window, it is trying to re-render in real time, draw it to the window buffer (and quartz then slams it out to the screen). So, there is some slowdown there, but, I have a feeling that if one were able to see bottlenecks the majority would be in Omniweb's slow rendering. Open a JPEG in Preview.app and resize the window and it is instantaneous since nothing the app does needs to be recalculated. That is pure, unadulterated quartz without the application having to figure out what to display.

Developer · Jun 22, 2001, 11:55 AM

Ok, there is obviously some kind of misunderstanding here. To make it clear:

Buffered windows are faster than unbuffered windows!

Blitting the offscreen buffer to the screen is actually rather fast, while directly drawing to the screen is slow, because QuickDraw has to constrain drawing to the update region, shield the cursor etc. When drawing into an offscreen buffer, you just draw into a rectangle which is much faster. That's why some apps in OS 9 already did their drawing into an offscreen buffer. This has been documented by Apple many years ago.
Many times the window buffer saves window draws. For example while window dragging or when activating a background window.

Of course applications that already did draw into a buffer under OS 9 now have to be updated to use the window buffer instead. Otherwise you'd do the blitting unnecessarily twice. Applications that must prepare their window content in an offscreen buffer anyway (QuickTime Player or games for example) can use a window in 'retained mode' which is unbuffered if unobscured in the front, buffered otherwise.

Of course the speed advantage of buffered windows comes at the price of memory usage. But comparing today's memory prices to those of fast G4s, Apple made exactly the right decision.

Moki:
What problem do you have with the event dispatching mechanism? Carbon Events are much better than the classic event polling mechanism.

Developer

--
WITFM?

foobars · Jun 22, 2001, 12:25 PM

If you have the developer tools, take a look at the Quartz Debug app. Check the "Flash screen updates" button and watch the magic. I suggest you also click "no delay after flash"...

This app proves that OS X's windows are by default opaque because when you drag a normal (say, Finder) window, only the border and shadow are redrawn -- the contents aren't redrawn at all. If you move a transparent terminal window, you'll notice that the entire window is redrawn.

The really cool thing about how Quartz handles screen updates (especially for transparency) is that things are only redrawn when they absolutely HAVE to be redrawn. If you run a "top" command in a transparent terminal you'll notice it only redraws the parts of the windows that need to be redrawn -- if the bottom few lines of the window are the same, they won't be redrawn EVEN if the window is transparent. Thus Quartz has a VERY INTELLIGENT system of redrawing: it knows where every window is, if the window is opaque or not, what is inside every window, and how it will affect the appearance of windows on top of it when it is redrawn. This is quite possibly the most advanced and intelligent window manager ever.

Another example: take a terminal window, run top, then place a window partially over the terminal, so that the title bar of the window you are placing over the terminal rests half over the terminal and half over the desktop. Now click on the desktop so that the title bar of the window you just placed over the terminal goes transparent. With Quartz Debug, you'll see that only the part of the title bar that is over the refreshing terminal is redrawn! Drop shadows and transparencies are therefore only redrawn when the window moves, becomes inactive, or something underneath it updates. Furthermore, only the part of the window whose appearance will change is redrawn.
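
The title bar experiment boils down to a rectangle intersection: only the overlap between the translucent title bar and the area that changed underneath needs recompositing. A minimal Python sketch follows, with made-up coordinates and a made-up `intersect` helper; the real window server presumably works with arbitrary regions rather than single rectangles.

```python
def intersect(a, b):
    """Overlap of two rects given as (left, top, right, bottom), or None."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    if left >= right or top >= bottom:
        return None
    return (left, top, right, bottom)


terminal = (0, 0, 400, 300)        # area that "top" keeps refreshing
title_bar = (300, 100, 700, 122)   # translucent bar, half over the terminal
desktop_icon = (500, 200, 532, 232)

print(intersect(title_bar, terminal))     # (300, 100, 400, 122)
print(intersect(desktop_icon, terminal))  # None, so nothing to redraw
```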

Also, click on the "show window list" to see all the windows Quartz is handling at any given time -- the third column is how much memory every window takes up in KB.

Quartz is a hog when it comes to memory but is BY FAR the most advanced window manager out there!

moki · Jun 22, 2001, 12:32 PM

Originally posted by Developer:

Blitting the offscreen buffer to the screen is actually rather fast, while directly drawing to the screen is slow, because QuickDraw has to constrain drawing to the update region, shield the cursor etc. When drawing into an offscreen buffer, you just draw into a rectangle which is much faster. That's why some apps in OS 9 already did their drawing into an offscreen buffer. This has been documented by Apple many years ago.

Actually, the main reason that drawing into an offscreen buffer is faster is that accessing VRAM over PCI (or now AGP) is significantly slower than accessing main memory, and VRAM is uncachable. However, once you've drawn to your offscreen buffer, you still have to draw again to get it to the screen. You don't usually gain any speed by doing this (except some situations where caching then blitting is more efficient); you do gain a much nicer imaging model, and flicker-free updates, though.

Even then it's only half the battle -- how the OS chooses to composite your window onto the screen is extremely important. If OS X was just doing a straight blit from the window's buffer to the screen, it might be a bit speedier. As it stands now, it looks like OS X's 'video mixer' imaging model is a bit of an impediment.

Here's an example of when it can suck. I have a marquee selection in my application. I draw the frame of the rectangle to my window, which actually just draws to the window's offscreen buffer. OS X then draws these changes to the screen -- we're already slower than it would be to draw directly to the on-screen port, but it isn't so bad (yet).

It turns out that OS X will blit not just the frame of the rectangle (which is all I actually painted), but it will blit the entire rectangle. So if I draw an 800x800 frame of a rectangle, it ends up blitting all 640,000 pixels that the rectangle encloses, not just the pixels around the edges that I actually painted.
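
To put numbers on that, here is a small Python sketch comparing the pixels actually painted with the pixels blitted. It assumes a one-pixel-thick frame; `frame_pixels` is a helper name invented for this example.

```python
def frame_pixels(width, height, thickness=1):
    """Pixels covered by the outline of a rectangle (not its interior)."""
    inner_w = max(width - 2 * thickness, 0)
    inner_h = max(height - 2 * thickness, 0)
    return width * height - inner_w * inner_h


painted = frame_pixels(800, 800)
blitted = 800 * 800

print(painted)             # 3196
print(blitted)             # 640000
print(blitted // painted)  # 200
```

Under that assumption, roughly 200 pixels get copied for every pixel that changed.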

I understand why they'd do this, but it isn't very cool. I draw something small like a rectangle, and I end up not only double-buffering the drawing, but also copying over far more memory than I need to. In addition, if the cursor is in the center of this rectangle on the screen, it will flicker when my rectangle's frame is drawn -- and mind you, the area that the cursor is over hasn't changed one bit. This is not pleasant.

Originally posted by Developer:
Of course applications that already did draw into a buffer under OS 9 now have to be updated to use the window buffer instead. Otherwise you'd do the blitting unnecessarily twice. Applications that must prepare their window content in an offscreen buffer anyway (QuickTime Player or games for example) can use a window in 'retained mode' which is unbuffered if unobscured in the front, buffered otherwise.

Retained mode is broken. The constant isn't in the header file anymore, and if you do use that magic constant, there are some severe issues. Most games are more likely to go full screen using the routines in CGDirectDisplays(), which is fine for full screen games, but running in a window is important, too.

Originally posted by Developer:
Of course the speed advantage of buffered windows comes at the price of memory usage. But comparing today's memory prices to those of fast G4s, Apple made exactly the right decision.

See my above example; they made a clean design choice, but it seems to me to be a poor real-world imaging model. The issue for me isn't memory -- I'd be fine with OS X gobbling RAM if it were a speed demon. The issue is it uses a lot of memory and still is much slower than it should be. Usually the tradeoff is efficient RAM usage vs. efficient speed, not both.

Originally posted by Developer:
Moki:
What problem do you have with the event dispatching mechanism? Carbon Events are much better than the classic event polling mechanism.

I like the Carbon event model quite a bit -- it's definitely much better than the standard Classic WNE spin, and CarbonEvents are a joy to work with. However, the way the event model works is different from (though linked to) how it performs. All of the possible listeners/handlers and layers that simple user events like typing or moving the mouse over menus have to pass through seem to me (empirically) to be slowing things down.

[ 06-22-2001: Message edited by: moki ]

moki · Jun 22, 2001, 12:51 PM

Originally posted by foobars:
Quartz is a hog when it comes to memory but is BY FAR the most advanced window manager out there!

The things you describe such as updating only areas of the screen that have actually changed are pretty standard-fare as far as window managers go.

Where Quartz's core rendering is more advanced than other window managers is possibly the way it actually does its compositing -- but it also seems like it may be a bit of a bottleneck.

Developer · Jun 22, 2001, 01:38 PM

Actually, the main reason that drawing into an offscreen buffer is faster is that accessing VRAM over PCI (or now AGP) is significantly slower than accessing main memory, and VRAM is uncachable.

I'm not sure I understand this. I was under the impression, that creating a GWorld in VRAM is the fastest way to draw. Actual drawing is done by the GPU and blitting the GWorld from VRAM on screen is extremely fast.

However, once you've drawn to your offscreen buffer, you still have to draw again to get it to the screen. You don't gain any speed by doing this; you do gain a much nicer imaging model, and flicker-free updates, though.

Drawing offscreen first is faster. See this old develop article on why:

Drawing in GWorlds for Speed

I can understand why it is not optimal for your marquee selection though. If it is of rectangular shape, couldn't you make sure the update region contains the frame only? You could remove the inside of the rectangle from the update region, should FrameRect() (or whatever you call to draw the marquee) add a (filled) rect to the update region.
I don't think it would speed up things, but it could solve your cursor flicker problem (it shouldn't flicker anyway though - a bug in the graphics card driver?)

Retained mode is broken.

It'll come back.

However, the way the event model works is different from (though linked to) how it performs. All of the possible listeners/handlers and layers that simple user events like typing or moving the mouse over menus have to pass through seem to me (empirically) to be slowing things down.

At least in theory it should be faster, since you only install handlers for events you're interested in instead of receiving each and every event and ignoring those you don't need. I don't think the event model is the reason for the slow down you've seen.
On the other hand, I really don't know why OS X is so slow. I've yet to hear a really convincing explanation.

Developer

moki · Jun 22, 2001, 01:57 PM

I'm not sure I understand this. I was under the impression, that creating a GWorld in VRAM is the fastest way to draw. Actual drawing is done by the GPU and blitting the GWorld from VRAM on screen is extremely fast.

Memory latency is typically the bottleneck when you're doing anything that involves moving a lot of data around. Putting a GWorld in VRAM is only fast if the video card alone is doing the drawing. If the main CPU has to get involved at all (as is the case with Quartz and parts of QuickDraw, too), keeping the GWorld in VRAM is actually significantly slower.

Reading from VRAM is about 2x slower than writing to it. Accessing VRAM in any manner (reading/writing) is significantly slower than accessing main memory, and in addition, VRAM isn't cached, so you can't keep data you're working with in the computer's ample (and fast) L2 cache.

Drawing offscreen first is faster. See this old develop article on why:

Drawing in GWorlds for Speed

Yep, I've read that old develop article, and certainly some of the points are valid, but just saying "drawing offscreen first is faster" really isn't true. It's only faster if you're doing a significant amount of drawing all at one time, which isn't the case for a lot of UI drawing. The concept of "instance drawing" that NeXT had was a good one, and would help here.

I can understand why it is not optimal for your marquee selection though. If it is of rectangular shape, couldn't you make sure the update region contains the frame only? You could remove the inside of the rectangle from the update region, should FrameRect() (or whatever you call to draw the marquee) add a (filled) rect to the update region.

Sure, I considered doing that, but my shape will actually end up being an arbitrary region, so clearing out the dirty Rect in the middle won't help. Besides, the way Quartz seems to work is it accumulates drawing dirty rects, and combines them -- so even if I did draw in a smarter manner, it'll end up drawing the whole rectangle over anyway.
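
Here is a minimal Python sketch of why combining dirty rects defeats the "only dirty the frame" trick. It assumes, as guessed above, that the accumulated dirty area is kept as a single bounding rectangle; that is not confirmed behavior, and the function names are invented.

```python
from functools import reduce


def union(a, b):
    """Smallest rect (left, top, right, bottom) containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def accumulate(dirty_rects):
    """Fold a list of dirty rects into one bounding rect."""
    return reduce(union, dirty_rects)


# The four one-pixel edges of an 800x800 marquee, dirtied one by one.
edges = [
    (0, 0, 800, 1),      # top
    (0, 799, 800, 800),  # bottom
    (0, 0, 1, 800),      # left
    (799, 0, 800, 800),  # right
]

print(accumulate(edges))  # (0, 0, 800, 800)
```

Even though each edge is tiny, the combined rect covers the whole interior, so the whole rectangle ends up being copied.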

I don't think it would speed up things, but it could solve your cursor flicker problem (it shouldn't flicker anyway though - a bug in the graphics card driver?)

I don't know -- might be a bug in the driver. It only seems to happen when I have a color cursor set (and for this reason, I may do without them... sigh).

It'll come back. (re: retained mode)

Maybe... maybe not? Who knows? Apple hasn't said anything either way that I'm aware of.

At least in theory it should be faster, since you only install handlers for events you're interested in instead of receiving each and every event and ignoring those you don't need. I don't think the event model is the reason for the slow down you've seen.

In theory, no, it should be ok in terms of speed. But the implementation doesn't seem to bear this out -- the time it takes for the window server to get the event to your application's various handlers is slothish. I don't get it either.

On the other hand, I really don't know why OS X is so slow. I've yet to hear a really convincing explanation.

I thought my explanation was fairly convincing (ok, maybe semi?); the cool but CPU-intensive imaging model combined with a number of inefficiencies throughout all layers of the OS. They got it working, but they didn't optimize it much.

Developer · Jun 22, 2001, 03:02 PM

Just saying "drawing offscreen first is faster" really isn't true. It's only faster if you're doing a significant amount of drawing all at one time, which isn't the case for a lot of UI drawing.

Well, it was significantly faster for my app. I could watch it drawing every item versus 'plop - it's there'. I dare to say it is faster for the typical drawing an app has to do.
It might be very inefficient to blit almost the whole window for drawing just those few pixels of a marquee selection. But maybe you should just accept the facts and switch to something that works better with Quartz. For example something like the Finder does it. A transparent region with a (non animating) frame. Or wait for retained mode.

Maybe... maybe not? Who knows? Apple hasn't said anything either way that I'm aware of. (if retained mode will come back)

Back in April Eric Schlegel said it will come back in a future release.

The time it takes for the window server to get the event to your application's various handlers is slothish.

That would explain why it takes forever to activate a window that comes to front, even though the window content doesn't have to be redrawn due to buffering (in most cases).

Let's hope they get everything implemented by 10.1, so that we can finally see it optimized by 11 (or 10.5?).

Developer

--
"A man can dream though, a man can dream." [Professor Hubert Farnsworth]

moki · Jun 22, 2001, 03:14 PM

Well, it was significantly faster for my app. I could watch it drawing every item versus 'plop - it's there'. I dare to say it is faster for the typical drawing an app has to do.

Well, sure, it looks faster because it all gets drawn to the screen in one fell swoop. However all of those items that you used to watch drawing item by item on screen are still being drawn item by item offscreen.

In actual profiling tests, you may find that this kind of buffered drawing may or may not be temporally faster than just drawing directly to the screen. Whether it is faster or not will depend mostly on how much drawing you do at once.

It might be very inefficient to blit almost the whole window for drawing just those few pixels of a marquee selection. But maybe you should just accept the facts and switch to something that works better with Quartz. For example something like the Finder does it. A transparent region with a (non animating) frame. Or wait for retained mode.

Sure, I could do a non-animating region the way the Finder does it, but I don't want to dim the contents of what I'm selecting. Also I'd still be drawing the pixels for the entire rectangle either way, so there's no real difference in terms of how many bits need to be pushed around.

Ah well. I'm continuing to investigate the OS X window compositing -- it looks like 16 bit window buffers also have a separate 8 bit planar alpha mask -- so in 16 bit mode, you're really storing 24 bits of information for each pixel in a window's buffer.

Let's hope they get everything implemented by 10.1, so that we can finally see it optimized by 11 (or 10.5?).

Preach on -- I agree! OS X needs a serious boost in speed on existing hardware, or it's not going to be pretty.

Developer · Jun 22, 2001, 04:09 PM

Well, sure, it looks faster because it all gets drawn to the screen in one fell swoop.

It doesn't only look faster, it is faster. Of course - as you know - the speed advantage while drawing offscreen must outweigh the blitting to the screen. But for the average app, that doesn't animate any window content (i. e. draw everything at once), this is the case.

Sure, I could do a non-animating region the way the Finder does it, but I don't want to dim the contents of what I'm selecting. Also I'd still be drawing the pixels for the entire rectangle either way, so there's no real difference in terms of how many bits need to be pushed around.

You'd do it only once, not x times a second. Not wanting to dim the selection is a valid point. What about dimming everything but the selected content?
I just wanted to give an example, that sometimes it might be better to completely rethink what you're doing than trying to optimize what worked best on OS 9.

Developer

--
"A man can dream though, a man can dream." [Professor Hubert Farnsworth]

'kberg · Jun 22, 2001, 04:09 PM

I haven't spent any appreciable amount of time digging around in the developers resources for X yet, so I offer this only as one possible work-around for your example.

Assuming that the drawing to your windows buffer and aqua's blitting is done asynchronously, is there a way to be notified when a blit has occured?

Otherwise it's as simple as partitioning your frame into linear segments and then cycling the drawing of each segment per blit. So instead of drawing a whole frame per screen update, draw four individual lines over the course of four screen updates?

Not sure how it would look on slower machines...
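
A rough sketch of the segment-cycling idea, in Python purely for illustration. All of the names here (`split_frame`, `draw_schedule`) are hypothetical, and it assumes you have some way of knowing when a screen update happens, which is exactly the open question above. It only works out which border segment would be redrawn on each update; the actual drawing call is left out.

```python
def split_frame(x, y, w, h):
    """Split a selection rectangle's border into four line segments.

    Returned in the order top, right, bottom, left, each as a pair of
    (x, y) end points.
    """
    return [
        ((x, y), (x + w, y)),          # top edge
        ((x + w, y), (x + w, y + h)),  # right edge
        ((x + w, y + h), (x, y + h)),  # bottom edge
        ((x, y + h), (x, y)),          # left edge
    ]


def draw_schedule(frame, updates):
    """List the single segment to redraw on each of `updates` screen updates.

    Segments are cycled round-robin, so the whole frame is refreshed
    once every four updates instead of once per update.
    """
    segments = split_frame(*frame)
    return [segments[i % len(segments)] for i in range(updates)]


# One segment per update; the fifth update wraps around to the top edge.
for segment in draw_schedule((0, 0, 100, 50), 5):
    print(segment)
```

The trade-off is that each edge animates at a quarter of the update rate, which is probably where slower machines would show it.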

moki · Jun 22, 2001, 04:38 PM

Originally posted by Developer:


You'd do it only once, not x times a second. Not wanting to dim the selection is a valid point. What about dimming everything but the selected content?
I just wanted to give an example that sometimes it might be better to completely rethink what you're doing than to try to optimize what worked best on OS 9.

Oh trust me, I've had to rethink my entire approach to my app under OS X. Porting it to OS X isn't an option, it needed to be entirely rewritten and re-thought.

As for the "dim everything but selected content" -- yeah, that's something that I think would be really cool too. However, in this instance, that would result in drawing more than the reverse (i.e., I'd be dimming most of the screen).

Anyway, I do appreciate your input and comments.

crayz · Jun 22, 2001, 04:59 PM

OK, OK, this is all very interesting, but I think what everyone here wants to know is:
When do we get EV Carbon?

PS: Go Rochester, NY!
PPS: might also wanna check out Macintoshian Achaia over on Ars. There are some informed folks over there too.

[ 06-22-2001: Message edited by: crayz ]

moki · Jun 22, 2001, 05:13 PM

Originally posted by crayz:
OK, OK, this is all very interesting, but I think what everyone here wants to know is:
When do we get EV Carbon?

EV Nova is almost in beta, and is carbonized. The engine is quite a bit more advanced than Escape Velocity's -- I think you'll like it:

http://www.AmbrosiaSW.com/news/upcom...betashots.html

Big Mac · Jun 23, 2001, 07:38 AM

Would anyone care to guess when (or if) we will actually see the kinds of optimizations discussed in this thread? Are these optimizations actually doable within the confines of Apple's development resources? It's really great to discuss these issues, but talk means little if nothing is actually achieved. Have the findings posted in this thread been forwarded to Apple?

OS X is the greatest OS around, and it will only get better. However, Apple really needs help from all of you software geniuses, so I'm asking all of you not to be shy and to impart your wisdom! I myself find OS X mostly acceptable in terms of speed on my iBook 2000, and I'm more concerned with the quantity and quality of native applications. Yet, if we're hoping to maximize OS X's potential, we cannot leave any stone unturned.

theolein · Jun 23, 2001, 10:59 AM

Originally posted by moki:


EV Nova is almost in beta, and is carbonized. The engine is quite a bit more advanced than Escape Velocity's -- I think you'll like it:

http://www.AmbrosiaSW.com/news/upcom...betashots.html

Andrew, I just looked at the movies of EV Nova and it looks sh*t hot! Can I play, can I play, daddy I wanna play now!!!! Damn, EV Override was the only game I actually ever liked on the Mac. It had a universe and *shock* a plot and a story and it was cool, and these new graphics are so damn cool. Do you know, btw, if the guy who did Frozen Heart ever got elected to office? He seemed like an amazing guy. Are there gonna be possibilities to make mods again? Know what would also be cool? Being able to place movies in the planet scenes. Now that would be cool, just think about it! You could fill a whole CD with little videos of planets and start up a whole new cult. Can I come and work for you, Andrew? I'll make coffee, I'll polish your CodeWarrior, I'll learn C properly this time.

And... this time I'll even buy EV so that the little captain won't trash my scenes all the time.

King Kong · Jun 23, 2001, 03:40 PM

All this would explain why OS X gets noticeably slower as screen size gets bigger. Most of my machines have large displays.

Some have dual displays and one even has a triple display. The more screen real estate you have the slower X gets. The difference is marked. (OS 9 doesn't show any noticeable slowdown with larger monitors or multiple displays....) Thanks to all the explanations above at least I understand why this is happening....

moki · Jun 23, 2001, 04:08 PM

Originally posted by Big Mac:
Would anyone care to guess when (or if) we will actually see the kinds of optimizations discussed in this thread? Are these optimizations actually doable within the confines of Apple's development resources? It's really great to discuss these issues, but talk means little if nothing is actually achieved. Have the findings posted in this thread been forwarded to Apple?

The engineers at Apple are in a far better position than I am to be able to profile exactly where the inefficiencies are in OS X, and fix 'em. I'm just talking about the issues I've been able to find (and some speculation) based on plumbing through the OS casually while developing my app.

Apple doesn't really need our help identifying why OS X is slow -- they have the source code, the tools, and the engineering talent/know-how to make it happen. It's really a matter of them focusing their resources on the issue.

Given that OS X is still missing some features (DVD, etc.), it's possible that they won't be devoting a whole lot of their resources to optimization yet. I hope this isn't true.

I'm pleased as punch about OS X, except for the speed issue, and a few very minor interface issues. Give me some more speed (well, and better developer documentation), and I'll have to work a bit harder to find something to bitch about

moki · Jun 23, 2001, 04:13 PM

Originally posted by Developer:


You'd do it only once, not x times a second. Not wanting to dim the selection is a valid point. What about dimming everything but the selected content?
I just wanted to give an example that sometimes it might be better to completely rethink what you're doing than to try to optimize what worked best on OS 9.

Just wanted to let you know -- I found a way to implement something very similar to what you're talking about, without the high overhead I thought I'd have to. It's very cool.

I still might put the marching ants around the borders, though, because I'm not sure the static selection is enough feedback to the user.

BuonRotto · Jun 23, 2001, 04:16 PM

Given that OS X is still missing some features (DVD, etc.), it's possible that they won't be devoting a whole lot of their resources to optimization yet. I hope this isn't true.

I'm pleased as punch about OS X, except for the speed issue, and a few very minor interface issues. Give me some more speed (well, and better developer documentation)...

That pretty much summarizes how a lot of people feel. I just wanted to recapitulate this sentiment.

<argod> · Jun 23, 2001, 04:48 PM

Here is a reply from the man at comp.sys.next.advocacy:

<http://groups.google.com/groups?start=50&hl=en&safe=off&th=2febfa6e089e70a6,56&rnum=56&ic=1&selm=3B338D65.2AD1E73B%40earthlink.net>

moki · Jun 23, 2001, 05:05 PM

Originally posted by <argod>:
Here is a reply from the man at comp.sys.next.advocacy:

<http://groups.google.com/groups?start=50&hl=en&safe=off&th=2febfa6e089e70a6,56&rnum=56&ic=1&selm=3B338D65.2AD1E73B%40earthlink.net>

Very cool -- confirms a few of my suspicions -- that most of the drawing is limited by bus speeds (access to memory) because the CPU itself does the blending (thus the VRAM discussion from earlier), and that the slowdowns people are complaining about don't have too much to do with Quartz.
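To make the memory-traffic point concrete, here is a minimal sketch of a CPU-side "source over" blend. This is the textbook formula, not Quartz's actual code, and the names `blend_pixel` and `composite_row` are invented for illustration. Note what each output pixel costs: a read of the window's color, a read of its alpha mask, a read of what's already on screen, and a write of the result.

```python
def blend_pixel(src, dst, alpha):
    """Source-over blend of one RGB pixel.

    src and dst are (r, g, b) tuples of 0..255 values; alpha is 0..255,
    where 255 means the source is fully opaque. The +127 rounds to nearest.
    """
    return tuple(
        (s * alpha + d * (255 - alpha) + 127) // 255
        for s, d in zip(src, dst)
    )


def composite_row(src_row, alpha_row, dst_row):
    """Blend one row of a window buffer over one row of the screen."""
    return [
        blend_pixel(s, d, a)
        for s, a, d in zip(src_row, alpha_row, dst_row)
    ]


red, blue = (255, 0, 0), (0, 0, 255)
print(blend_pixel(red, blue, 255))  # (255, 0, 0)   opaque: source wins
print(blend_pixel(red, blue, 0))    # (0, 0, 255)   transparent: screen shows
print(blend_pixel(red, blue, 128))  # (128, 0, 127) roughly half and half
```

The math per pixel is trivial; it's the three reads and one write per pixel, across every pixel of every overlapping window, that lean on the memory bus.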

Of course, whether the drawing slowdown happens because it is CPU-bound or memory bus-bound is of little concern to users; they just want it faster. And also of course, the average user doesn't know or care what part of the OS is slowing down live window resizing, they just want it faster.

It all boils down to this: there's no one single part of OS X that's slowing it down -- don't look for a scapegoat. It's just a matter of Apple wanting to make OS X work first, and then go back and optimize it. The same holds true for OS X applications.

Until Apple and Apple developers take the time to profile and optimize their code, you aren't going to see any huge increases in speed no matter what imaging model is used.

JCS · Jun 24, 2001, 12:42 AM

It all boils down to this: there's no one single part of OS X that's slowing it down

Yeah, *everything's* slow!

Seriously though, I think it's pretty clear that there *is* "one single thing" that contributes to the average user's perception of speed more than any other part of OS X: the display system.

I know people have been trying to get the name "Quartz" out of the hot seat, citing the negligible CPU cost of the various Quartz-specific calculations (blending, bezier curves, whatever), but the name "Quartz" covers more than just the algorithms. It's the whole architecture: the buffering, trips to and from main RAM and VRAM, the whole nine yards.

The design of Quartz has a kind of ripple effect on the whole system--especially the RAM usage. Once the swapfile gets involved, it gets ugly very fast. And as Andrew has experienced, it doesn't take much to use gobs and gobs of RAM when every window wants its own true-color, alpha-masked off-screen buffer. Even something as seemingly lightweight as a text editor suddenly becomes a memory stress test in OS X. Having 30 files open in BBEdit shouldn't send my dual G4/450 with 256MB RAM into a 45 second frenzy of disk thrashing when I un-hide the app. But just add up the buffered pixels and it all starts to make sense. "Oh, it's paging in the 15MB of window buffers that got paged-out while the app was hidden and I was using my 6 Terminal windows and 5 OmniWeb windows." 256MB of RAM goes a *lot* farther in OS 9, let me tell you.
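Adding up the buffered pixels can be sketched like this. The window dimensions are hypothetical, picked only so that 30 windows land on the 15MB figure above, and the 4 bytes per pixel is an assumption (32 bit color with alpha included, no padding). The function name `total_buffer_bytes` is made up for the example.

```python
BYTES_PER_PIXEL = 4  # assumption: 32 bit color with alpha included


def total_buffer_bytes(windows, bytes_per_pixel=BYTES_PER_PIXEL):
    """Sum the buffer sizes for a list of (width, height) windows."""
    return sum(w * h * bytes_per_pixel for w, h in windows)


# Hypothetical: 30 editor windows of 512x256 pixels each.
windows = [(512, 256)] * 30
total = total_buffer_bytes(windows)

print(total)                  # 15728640
print(total / (1024 * 1024))  # 15.0
```

Put differently, 15MB across 30 windows is only about half a megabyte each, so even fairly small windows get there, and larger ones get there with far fewer open.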

Yes, there are plenty of other culprits: poorly written/ported apps that compound the display problems with needless drawing, redundant buffering, and dumb polling, scheduler inadequacies (no priority inversion?), and the expected lack of optimization in an x.0 release. And if the display system were as fast and resource-stingy as OS 9's much less ambitious architecture, then I'm sure the next worst offender would bubble to the top of the list. But as things stand now, the display system takes the cake. It's head and shoulders above the rest as far as contributing to the perceived slowness of OS X, IMO. (The exception being the Finder, which is so bad and so important that it raises the level of "poorly written app" to heights that rival--and piggy-back on--the display system problems.)

As for the future, I think it's kind of bleak for at least a year or so. Completely changing the architecture would be a dumb move. Quartz is built for the future, and rightly so. But I can't think of anything that Apple can do to make it as fast as a prev-gen display system (like OS 9's) on existing hardware. I'm sure it'll get faster, and the changes to the other parts of OS X will also help a lot. But fundamental stumbling blocks for existing hardware like the enormous RAM usage seem intractable. What can Apple do? Send everyone more RAM? Switch back to direct-draw when memory gets tight? I don't have any good (software-based) ideas to make my 30 BBEdit windows less taxing on my current G4 when I've got a full work-load of apps and windows open.

The distant future is much brighter. Imagine Quartz moved entirely onto the video card: all the buffers, all the calculations, all sitting in 128MB of fast RAM soldered next to a powerful GPU (falling back to main memory only after the (hardware-compressed, a la GameCube, natch ;-) on-card RAM is full.) Raycer rumors are very enticing, but cooperation from nVidia or ATI is probably more realistic.

Anyway, by MWSF 2003, I'm sure this will all be a bad memory...