Comments on a new drawing architecture
-----

Motivation
--

The drawing architecture in Blender is a bit haphazard but for the most part works. However, the following motivations exist for designing a new architecture:

 o The existing design relies, in a few but critical places, on drawing to the front buffer, which is not commonly done in OpenGL and in some cases causes large slowdowns (ATI cards). A new system should *never* draw in the front buffer.

 o Drawing is not centralized and not stateful. Centralized would mean that the aspects that contribute to drawing a particular region of the screen are clearly locatable in one part of the code. That they currently are not is a problem for refactoring the window system. Stateful is related: because drawing is not centralized, and in particular often implicitly relies on what has already been drawn, there is no clear formula for redrawing a particular region of the screen when needed. For example, in many circumstances Blender does not correctly handle, and is not well adapted to handle, cases where the window system requests a redraw during a subloop (like borderselect).

 o Blender currently redraws at a coarse granularity, often redrawing multiple entire subregions of the screen in response to an event without real need. Although this is not done in response to most high performance events (such as manipulating a slider), it poses a problem for rewriting interface controls to not use event subloops, and may have caused performance problems on some machines.

 o Altogether, the components that make up the drawing are widely distributed, which makes the code base harder to understand and to extend than is desirable. A primary goal of a new system is to have a conceptually simpler external interface.

Low-level architecture: The OpenGL interaction
--

Our goal for the low-level interaction with OpenGL is to choose some standard behavior to require of the OpenGL subsystem, and to find a way to implement this behavior on all (or as many as is reasonable) supported systems/cards. Additionally, this standard behavior must afford an efficient implementation of user interfaces.

The particular problem that Blender has is that there is no obvious, consistent method for redrawing a subregion of the OpenGL context. Most applications use multiple OpenGL contexts and can just redraw an individual context. Blender, however, is designed around a single context and could not be modified to support this method without some amount of work. Thus, we need to establish some framework that allows us to efficiently implement redisplay of a particular subregion of the screen. Ideally, this framework could be efficiently abstracted over in the Ghost library. There are two obvious candidates for a primitive that fits these constraints.

Copy-to-frontbuffer
-

The copy-to-frontbuffer method is based on supplying a function in Ghost to efficiently copy a region of the backbuffer to the frontbuffer. Redisplay of a subregion is implemented by redrawing the region in the backbuffer and copying it to the frontbuffer. An important factor is that this copy must respect synchronization with the vertical retrace to avoid tearing artifacts.

Generic Implementation: This functionality can be implemented using glCopyPixels. However, on some systems/cards glCopyPixels does not perform very well, and the copy will not be synced to the video display. If there is an operating system call to sync to the vertical retrace, the sync problem can be fixed, although this is not particularly clean or ideal. Note that multiple monitor support is an issue (although limited in applicability).

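As a rough illustration of the subregion copy that this method relies on, here is a minimal sketch in C. It does not call OpenGL: FakeFramebuffer and copy_back_to_front are hypothetical names, and two small in-memory arrays stand in for the back and front buffers. With a real context the row loop would presumably be replaced by a glCopyPixels call with GL_BACK as the read buffer and GL_FRONT as the draw buffer. Syncing to the vertical retrace, which is the hard part, is not modeled at all.

```c
#include <assert.h>
#include <string.h>

/* Tiny stand-in for a double-buffered context (assumption: 1 byte per pixel). */
enum { FB_W = 8, FB_H = 4 };

typedef struct {
    unsigned char front[FB_H][FB_W];
    unsigned char back[FB_H][FB_W];
} FakeFramebuffer;

/* Copy the rectangle (x, y, w, h) from the back buffer to the front buffer,
 * clipped to the buffer bounds. Returns the number of pixels copied. */
static int copy_back_to_front(FakeFramebuffer *fb, int x, int y, int w, int h)
{
    int x0 = x < 0 ? 0 : x;
    int y0 = y < 0 ? 0 : y;
    int x1 = (x + w > FB_W) ? FB_W : x + w;
    int y1 = (y + h > FB_H) ? FB_H : y + h;
    int row;

    assert(fb != NULL);
    if (x1 <= x0 || y1 <= y0)
        return 0; /* nothing visible to copy */

    /* Only the requested rows and columns are touched in the front buffer. */
    for (row = y0; row < y1; row++)
        memcpy(&fb->front[row][x0], &fb->back[row][x0], (size_t)(x1 - x0));

    return (x1 - x0) * (y1 - y0);
}
```

A caller would redraw a widget into the back buffer and then call copy_back_to_front with the widget's rectangle, leaving the rest of the front buffer alone.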
OSX Implementation: On OSX the OpenGL subsystem supports swapping a subrectangle of the entire context, which is exactly what we want and is probably synced (or can be requested to be synced) to the video display.

Win32 Implementation: No better implementation than the generic one is immediately apparent, and Win32 support for syncing to the vertical retrace is not known to be available.

X11 Implementation: No better implementation than the generic one is immediately apparent, and support for syncing to the vertical retrace is not known to be available on all supported platforms.

Analysis: The sync requirements seem to make the use of this method infeasible.

Persistent-backbuffer
-

The persistent-backbuffer method is based on preserving a full copy of the correct display in the backbuffer. Redisplay of a subregion is implemented by redrawing the subregion in the backbuffer and calling the context's swapbuffer method to update the entire frontbuffer. Sync requirements are handled by the platform's swapbuffer routine. The main complexity is determining how to guarantee that the backbuffer still retains a full copy of the correct display after a call to swapbuffers.

Preamble: There are two primary ways to implement swapbuffers. The first is to actually exchange the front and back buffers. To my knowledge this is always how swapbuffers works on, say, older SGIs. In an ideal exchange, the back-buffer contents following the swap are identical to the old front-buffer contents. This method is called swap-exchange.

The second is to copy the entire backbuffer to the frontbuffer, leaving the backbuffer untouched. This method has the delightful consequence of satisfying the persistent-backbuffer requirement with no additional work, and OpenGL on OSX appears to always behave this way. This method is called swap-copy.

Side note: swap-copy is actually the nicest general behavior for a swapbuffers system to have. It was unused on SGI because swap-exchange is theoretically faster. However, the side effect of the back-buffer becoming the old front-buffer is generally less useful than that of the back-buffer just staying the same.

The method that OpenGL requires is, sadly, neither, and OpenGL does not provide a mechanism for querying which method the implementation is using. In particular, OpenGL states that following a swapbuffers the contents of the backbuffer are undefined.

Generic Implementation: As mentioned, if the OpenGL system can be confirmed to follow swap-copy, then the requirements are satisfied with no additional support.

If the OpenGL system can be confirmed to follow swap-exchange, then one way to enforce the persistent-backbuffer requirement is, following a swapbuffers, to copy the entire front-buffer to the back-buffer or to redraw the entire back-buffer. This process can be optimized by noting that, in the absence of window system damage to the front-buffer, the new back-buffer is the same as the old front-buffer, which is the same as the back-buffer two swaps previously. Thus the state of the back and front buffers can be tracked, and typically only a small portion of the back-buffer needs to be fixed with a copy or a redraw following a swap.

If the OpenGL system cannot be confirmed to follow either method, then copying the entire front-buffer to the back-buffer, or redrawing the entire back-buffer, satisfies the given requirement, but this process cannot be further optimized and is thus likely to be prohibitively slow. An alternate method would be to preserve a copy of the back-buffer in memory, using glReadPixels to save and glDrawPixels to restore. This has the advantage that, following a partial redraw, only a small region of the back-buffer needs to be saved using glReadPixels. However, following a swapbuffers, a glDrawPixels of the entire screen needs to be performed.
This is not likely to be particularly speedy.

The short answer is that there is no good generic implementation if the system cannot be confirmed to follow either of the two swap methods. It is an interesting question whether any such system exists. In general, a compliant OpenGL implementation could switch swap methods every time, or perhaps in response to dynamically changing requirements from the OS or window system, but I do not know that any such system exists (it seems somewhat unlikely).

If we assumed that any system would follow one of the two methods above, we could draw and swap the screen a few times, ascertain which method it was using, and dynamically choose the correct generic implementation.

OSX Implementation: As mentioned, OSX seems to always have swap-copy behavior by default, and an explicit flag is available to request an OpenGL context with this behavior. This does not guarantee that all systems will have such a context. It seems likely, however.

Win32 Implementation: Windows provides a flag to *request* swap-copy behavior when obtaining an OpenGL context, but this is just a hint, and it is not clear whether it is possible to confirm that the context returned actually does follow swap-copy behavior. Win32 OpenGL implementations also prevalently support the WGL_ARB_buffer_region extension, which can be used to efficiently implement the persistent-backbuffer behavior.

X11 Implementation: No better implementation than the generic one is known, nor are any additional OpenGL queries to determine swap behavior, or any common extensions that would support an implementation.

Analysis: Despite the ambiguity of the swap method, it seems likely that in practice most systems could use a generic method, and on systems that support swap-copy this method would be very high performance. Additionally, it appears that swap-copy behavior is preferred on modern graphics cards.

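The idea of drawing and swapping a few times to find out the swap method, and then picking the matching repair strategy, can be sketched as follows. This is a simulation under stated assumptions, not code against a real context: FakeContext, fake_swapbuffers, detect_swap_method and repair_region_after_swap are all hypothetical names, and each buffer is reduced to a single value. On a real system the probe would presumably clear the back buffer to a known color, swap, and read a pixel back with glReadPixels. The repair rule for swap-exchange also assumes that the back buffer was repaired after every earlier swap and that the window system did not damage the front buffer.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SWAP_COPY, SWAP_EXCHANGE, SWAP_UNDEFINED } SwapMethod;
typedef struct { int x, y, w, h; } Rect;

/* Hypothetical stand-in for a context: one "pixel" per buffer. */
typedef struct {
    unsigned front, back;
    SwapMethod behavior; /* how the simulated driver swaps */
} FakeContext;

static void fake_swapbuffers(FakeContext *ctx)
{
    unsigned old_front = ctx->front;
    ctx->front = ctx->back;
    if (ctx->behavior == SWAP_EXCHANGE)
        ctx->back = old_front;    /* buffers trade places */
    else if (ctx->behavior == SWAP_UNDEFINED)
        ctx->back = 0xDEADBEEFu;  /* contents are garbage */
    /* SWAP_COPY: back buffer is left untouched */
}

/* Draw X, swap, draw Y, swap, then look at the back buffer:
 * Y means it survived (copy), X means it is the old front (exchange). */
static SwapMethod detect_swap_method(FakeContext *ctx, int rounds)
{
    const unsigned color_x = 0x11111111u, color_y = 0x22222222u;
    SwapMethod result = SWAP_UNDEFINED;
    int i;

    assert(ctx != NULL && rounds > 0);
    for (i = 0; i < rounds; i++) {
        SwapMethod seen;
        ctx->back = color_x;
        fake_swapbuffers(ctx);
        ctx->back = color_y;
        fake_swapbuffers(ctx);
        if (ctx->back == color_y)
            seen = SWAP_COPY;
        else if (ctx->back == color_x)
            seen = SWAP_EXCHANGE;
        else
            return SWAP_UNDEFINED;
        if (i > 0 && seen != result)
            return SWAP_UNDEFINED; /* behavior changed between rounds */
        result = seen;
    }
    return result;
}

/* Region of the back buffer to copy or redraw right after a swap. */
static Rect repair_region_after_swap(SwapMethod m, Rect last_damage, Rect screen)
{
    Rect none = {0, 0, 0, 0};
    switch (m) {
    case SWAP_COPY:     return none;        /* back buffer is still correct */
    case SWAP_EXCHANGE: return last_damage; /* old front lacks the last redraw */
    default:            return screen;      /* nothing can be assumed */
    }
}
```

Running the probe for several rounds is only a heuristic: it cannot prove that an implementation will keep behaving the same way later.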
High-level architecture
--

We leave the tender world of the swapbuffer for a moment to consider what high-level reforms to the interface/drawing architecture are needed.

The main task is getting rid of the frontbuffer drawing code. There are currently 44 references to glDrawBuffer(GL_FRONT), and somewhat (but not too many) more than that in indirect calls, for example to the xor line drawing or to functions that update text on headers. Each of these needs to be converted to back-buffer drawing using the new scheme of choice.

In order to accomplish some other goals at the same time, I propose to do this by making all drawing stateful and centralizing all of the drawing code. This has some implications for the way the event subloops are written, but does not require the elimination of all of them (a moderately sized and potentially error prone task). However, modifying the drawing code in this way will make it easier to eliminate these subloops in the future. In particular, the drawing code should not need to be changed, and part of the state that is in the event subloop will already have been moved into other structures.

The main idea is that callbacks, event handlers, etc. will request an update of regions of the screen. These requests would typically be handled in an event based fashion, but the event subloops can make a direct call to the function that handles drawing these regions of the screen when necessary. This function will serve as the centralized drawing function: no drawing should ever be performed outside of the call tree of this function. This will also eliminate a lot of the messiness associated with curarea, areawinset, and all those other friendly fiends.

I have a sketch of a framework for this system here. It also makes some accommodations for handling backing store for menus (saveunder in the current codebase). More on this to come pending experimentation...

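To make the request/flush split more concrete, here is a minimal sketch in C of how handlers might request redraws and how a single entry point might perform them. It is not the framework referred to above. RedrawQueue, redraw_request, redraw_flush, DrawLog and record_draw are hypothetical names, and merging requests into one bounding rectangle is just the simplest possible policy, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, w, h; } Rect;
typedef void (*DrawRegionFn)(Rect region, void *userdata);

typedef struct {
    Rect pending;      /* bounding box of all regions requested so far */
    int has_pending;
    DrawRegionFn draw; /* the one centralized drawing function */
    void *userdata;
} RedrawQueue;

static Rect rect_union(Rect a, Rect b)
{
    int x0 = a.x < b.x ? a.x : b.x;
    int y0 = a.y < b.y ? a.y : b.y;
    int x1 = (a.x + a.w > b.x + b.w) ? a.x + a.w : b.x + b.w;
    int y1 = (a.y + a.h > b.y + b.h) ? a.y + a.h : b.y + b.h;
    Rect r;
    r.x = x0; r.y = y0; r.w = x1 - x0; r.h = y1 - y0;
    return r;
}

/* Called by callbacks and event handlers: records the region, draws nothing. */
static void redraw_request(RedrawQueue *q, Rect r)
{
    assert(q != NULL);
    if (r.w <= 0 || r.h <= 0)
        return; /* ignore empty regions */
    q->pending = q->has_pending ? rect_union(q->pending, r) : r;
    q->has_pending = 1;
}

/* Called from the main event loop, or directly from an event subloop.
 * Returns 1 if the drawing function was invoked, 0 if nothing was pending. */
static int redraw_flush(RedrawQueue *q)
{
    Rect r;
    assert(q != NULL && q->draw != NULL);
    if (!q->has_pending)
        return 0;
    r = q->pending;
    q->has_pending = 0;
    q->draw(r, q->userdata);
    return 1;
}

/* Example drawing function that only records what it was asked to draw. */
typedef struct { int calls; Rect last; } DrawLog;

static void record_draw(Rect region, void *userdata)
{
    DrawLog *log = (DrawLog *)userdata;
    log->calls++;
    log->last = region;
}
```

In this sketch an event subloop such as borderselect would call redraw_request for the area it changed and then redraw_flush itself, so that all drawing still happens inside the call tree of the one drawing function.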
Other note
--

One final note: once the drawing code has been moved to this new system, it is actually not that hard anymore to switch to a multiple context system. The main work really just becomes extending GHOST to support multiple contexts, and modifying the windowing code to deal with resizing contexts, etc. This may turn out to be the nicest way to solve our problems with swapbuffers. It would still be nice to have support for a real subbuffer copy, in order to support partial redraw (mostly for interface components), but also for things like vertex selection for mesh editing.