Comments on a new drawing architecture
-----

Motivation
--
The drawing architecture in Blender is a bit haphazard but for the most
part works. However, the following motivations exist for designing a new
architecture:

 o The existing design relies, in a few but critical places, on
   drawing to the front buffer, which is not commonly done in
   OpenGL and in some cases causes large slowdowns (ATI cards).
   A new system should *never* draw to the front buffer.
   
 o Drawing is neither centralized nor stateful. Centralized means that
   the aspects that contribute to drawing some particular region
   of the screen are clearly locatable in one part of the code. This is
   currently not the case, which is a problem for refactoring the
   window system.

   Statefulness is related. Because drawing is not centralized, and in
   particular often implicitly relies on what has already been drawn,
   there is no clear formula for redrawing a particular region of the
   screen when needed. For example, in many circumstances Blender does
   not correctly handle, and is not well adapted to handle, cases where
   the window system requests a redraw during a subloop (like borderselect).

 o Blender currently redraws at a coarse level of granularity, often
   redrawing multiple entire subregions of the screen in response
   to an event without real need. Although this is not done in response
   to most high performance events (such as manipulating a slider), it
   poses a problem for rewriting interface controls to not use event
   subloops, and may have caused performance problems on some machines.

 o Altogether, the components that make up the drawing code are widely
   distributed, which makes the code base harder to understand and to
   extend than is desirable. A primary goal of a new system is to have a
   conceptually simpler external interface.


Low-level architecture: The OpenGL interaction
--
Our goal for the low-level interaction with OpenGL is to choose some
standard behavior to require of the OpenGL subsystem, and find a way
to implement this behavior on all (or as many as is reasonable) supported
systems/cards. Additionally, this standard behavior must afford an
efficient implementation of user interfaces.

The particular problem that Blender has is that there is no obvious,
consistent method for redrawing a subregion of the OpenGL context. Most
applications use multiple OpenGL contexts and can just redraw an individual
context. Blender, however, is not designed this way, and could not be
modified to support this method without some amount of work.

Thus, we need to establish some framework that allows us to efficiently
implement redisplay of a particular subregion of the screen. Ideally, this
framework would be efficiently abstracted by the GHOST library. There are
two obvious candidates for a primitive that fits these constraints.

Copy-to-frontbuffer
-
The copy-to-frontbuffer method is based on supplying a function in GHOST
to efficiently copy a region of the backbuffer to the frontbuffer. Redisplay
of a subregion is implemented by redrawing the region in the backbuffer and
copying it to the frontbuffer. An important factor is that this copy must
respect synchronization with the vertical retrace to avoid tearing artifacts.

Generic Implementation:
  This functionality can be implemented using glCopyPixels. However, on some
  systems/cards glCopyPixels does not perform very well, and the copy will not
  be synced to the video display. If there is an operating system call to sync
  to the vertical retrace, the latter can be fixed, although this is not
  particularly clean or ideal. Note that multiple monitor support is an issue
  (although limited in applicability).
  
OSX Implementation:
  On OSX the OpenGL subsystem supports swapping a subrectangle of the entire
  context, which is exactly what we want and is probably synced (or can be
  requested to be synced) to the video display.
  
Win32 Implementation:
  No better implementation than generic is immediately apparent, and Win32
  support for sync'ing to the vertical retrace is not known to be available.
  
X11 Implementation:
  No better implementation than generic is immediately apparent, and support
  for sync'ing to the vertical retrace is not known to be available on all
  supported platforms.

Analysis:
  The sync requirements seem to make the use of this method infeasible.


Persistent-backbuffer
-
The persistent-backbuffer method is based on preserving a full copy of the
correct display in the backbuffer. Redisplay of a subregion is implemented by
redrawing the subregion in the backbuffer and calling the context's swapbuffer
method to update the entire frontbuffer. Sync requirements are handled by the
platform's swapbuffer routine. The main complexity is determining how to
guarantee that the backbuffer still retains a full copy of the correct display
after a call to swapbuffers.

Preamble:
  There are two primary ways to implement swapbuffers. The first is to actually
  exchange the front- and back-buffers; to my knowledge this is always how
  swapbuffers works on, say, older SGIs. In an ideal exchange situation the
  back-buffer contents are identical to the old front-buffer contents following
  this exchange. This method is called swap-exchange.

  The second is to copy the entire backbuffer to the frontbuffer, leaving the
  backbuffer untouched. This method has the delightful consequence of satisfying
  the persistent-backbuffer requirement with no additional work. OpenGL on OSX
  appears to always satisfy this requirement. This method is called swap-copy.

  Side note: swap-copy is actually the nicest general behavior for a swapbuffers
  system to have. It was unused on SGI because swap-exchange is theoretically
  faster. However, the side effect of the back-buffer becoming the old
  front-buffer is generally less useful than that of the back-buffer just
  staying the same.

  Sadly, OpenGL requires neither method, nor does it provide a mechanism for
  querying which method the implementation is using. In particular, OpenGL
  states that following a swapbuffers, the contents of the backbuffer are
  undefined.
  
Generic Implementation:
  As mentioned, if the OpenGL system can be confirmed to follow swap-copy then the
  requirements are satisfied with no additional support.

  If the OpenGL system can be confirmed to follow swap-exchange, then one way to
  enforce the persistent-backbuffer requirement is, following a swapbuffers, to
  copy the entire front-buffer to the back-buffer, or to redraw the entire
  back-buffer. This process can be optimized by noting that in the absence of
  window system damage to the front-buffer, the new back-buffer is the same as
  the old front-buffer, which is the same as the back-buffer two swaps
  previously. Thus the state of the back- and front-buffers can be tracked, and
  typically only a small portion of the back-buffer needs to be fixed with a
  copy or a redraw following a swap.
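
  The tracking described above might look roughly like the following. This is
  only a sketch under stated assumptions: Rect, SwapTracker and the tracker_*
  functions are hypothetical names, not existing Blender or GHOST code, and the
  sketch assumes the returned area is repaired before the next draw begins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pixel rectangle; empty when it has no area. */
typedef struct Rect {
    int xmin, ymin, xmax, ymax;
} Rect;

bool rect_is_empty(Rect r)
{
    return r.xmin >= r.xmax || r.ymin >= r.ymax;
}

/* Bounding box of a and b. This over-approximates the true union, which
 * is safe here: repairing too much is slow, repairing too little is wrong. */
Rect rect_union(Rect a, Rect b)
{
    Rect r;
    if (rect_is_empty(a)) return b;
    if (rect_is_empty(b)) return a;
    r.xmin = a.xmin < b.xmin ? a.xmin : b.xmin;
    r.ymin = a.ymin < b.ymin ? a.ymin : b.ymin;
    r.xmax = a.xmax > b.xmax ? a.xmax : b.xmax;
    r.ymax = a.ymax > b.ymax ? a.ymax : b.ymax;
    return r;
}

/* Tracks where the two buffers can differ under swap-exchange. */
typedef struct SwapTracker {
    Rect drawn;        /* redrawn in the back-buffer since the last swap */
    Rect front_damage; /* front-buffer area clobbered by the window system */
} SwapTracker;

void tracker_init(SwapTracker *t)
{
    const Rect empty = {0, 0, 0, 0};
    assert(t != NULL);
    t->drawn = empty;
    t->front_damage = empty;
}

/* Record that a region of the back-buffer was redrawn this frame. */
void tracker_note_draw(SwapTracker *t, Rect r)
{
    t->drawn = rect_union(t->drawn, r);
}

/* Record that the window system damaged part of the front-buffer. */
void tracker_note_front_damage(SwapTracker *t, Rect r)
{
    t->front_damage = rect_union(t->front_damage, r);
}

/* Call right after swapbuffers. The new back-buffer is the old front-buffer,
 * so it is out of date wherever we drew this frame and invalid wherever the
 * front-buffer was damaged. Returns the area to repair by copy or redraw. */
Rect tracker_after_swap(SwapTracker *t)
{
    Rect repair = rect_union(t->drawn, t->front_damage);
    tracker_init(t);
    return repair;
}
```

  A single bounding box keeps the sketch short. A list of rectangles would
  avoid repairing untouched pixels when two small, distant regions are redrawn
  in the same frame.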

  If the OpenGL system cannot be confirmed to follow swap-exchange, then copying
  the entire front-buffer to the back-buffer, or redrawing the entire back-buffer,
  satisfies the given requirement, but this process cannot be further optimized
  and is thus likely to be prohibitively slow.

  An alternate method would be to preserve a copy of the back-buffer in memory,
  using glReadPixels to save it and glDrawPixels to restore it. This has the
  advantage that following a partial redraw, only a small region of the
  back-buffer needs to be saved using glReadPixels. However, following a
  swapbuffers a glDrawPixels of the entire screen needs to be performed. This
  is not likely to be particularly speedy.

  The short answer is that there is no good generic implementation if the system
  cannot be confirmed to follow either of the two swap methods. It is an
  interesting question whether any such system exists. In general a compliant
  OpenGL implementation could switch swap methods on every swap, or perhaps in
  response to dynamically changing requirements from the OS or window system,
  but I do not know of any such system (it seems somewhat unlikely). If we
  assumed that every system follows one of the two methods above, we could draw
  and swap the screen a few times to ascertain which method it was using, and
  dynamically choose the correct generic implementation.
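
  One possible shape for such a probe is sketched below. The ProbeContext
  hooks are hypothetical. In a real probe they would presumably wrap glClear,
  glReadPixels and the platform swapbuffers call. FakeContext is a simulated
  two-buffer context that exists only so the logic can be exercised without
  OpenGL.

```c
#include <assert.h>
#include <stddef.h>

typedef enum SwapMethod { SWAP_COPY, SWAP_EXCHANGE, SWAP_UNKNOWN } SwapMethod;

/* Hypothetical hooks onto the real context (assumed, not an existing API). */
typedef struct ProbeContext {
    void (*fill_back)(void *user, unsigned color); /* fill the back-buffer */
    unsigned (*read_back)(void *user);             /* read a back-buffer pixel */
    void (*swap)(void *user);                      /* swapbuffers */
    void *user;
} ProbeContext;

/* Draw A, swap, draw B, swap, then look at what the back-buffer holds. */
SwapMethod probe_swap_method(const ProbeContext *ctx, int rounds)
{
    const unsigned color_a = 0x112233u, color_b = 0x445566u;
    int copies = 0, exchanges = 0, i;

    assert(ctx != NULL && rounds > 0);
    for (i = 0; i < rounds; i++) {
        unsigned seen;
        ctx->fill_back(ctx->user, color_a);
        ctx->swap(ctx->user);               /* front-buffer now shows A */
        ctx->fill_back(ctx->user, color_b);
        ctx->swap(ctx->user);               /* front-buffer now shows B */
        seen = ctx->read_back(ctx->user);
        if (seen == color_b) copies++;          /* back-buffer was kept */
        else if (seen == color_a) exchanges++;  /* back-buffer is old front */
    }
    if (copies == rounds) return SWAP_COPY;
    if (exchanges == rounds) return SWAP_EXCHANGE;
    return SWAP_UNKNOWN;                    /* inconsistent or garbage */
}

/* Simulated two-buffer context, only to exercise the probe without OpenGL. */
typedef struct FakeContext {
    unsigned front, back;
    int mode;   /* 0: swap-copy, 1: swap-exchange, 2: back-buffer trashed */
} FakeContext;

void fake_fill_back(void *user, unsigned color)
{
    ((FakeContext *)user)->back = color;
}

unsigned fake_read_back(void *user)
{
    return ((FakeContext *)user)->back;
}

void fake_swap(void *user)
{
    FakeContext *f = user;
    unsigned old_front = f->front;
    f->front = f->back;
    if (f->mode == 1) f->back = old_front;
    else if (f->mode == 2) f->back = 0;
}

ProbeContext fake_probe_context(FakeContext *f)
{
    ProbeContext ctx = {fake_fill_back, fake_read_back, fake_swap, NULL};
    ctx.user = f;
    return ctx;
}
```

  Note that this is a heuristic. A consistent answer over a few rounds does
  not prove that the implementation will keep behaving that way, which is
  exactly the assumption discussed above.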

OSX Implementation:
  As mentioned, OSX seems to always have swap-copy behavior by default, and an
  explicit flag is available to request an OpenGL context with this behavior.
  This does not guarantee that all systems will have such a context, although
  it seems likely.

Win32 Implementation:
  Windows provides a flag to *request* swap-copy behavior when obtaining an OpenGL
  context, but this is just a hint, and it is not clear if it is possible to confirm
  that the context returned actually does follow swap-copy behavior. Win32 OpenGL
  implementations also prevalently support the WGL_ARB_buffer_region extension,
  which can be used to efficiently implement the persistent-backbuffer behavior.

X11 Implementation:
  No better implementation than generic is known, nor are additional OpenGL
  queries to determine swap behavior or common extensions that would support
  an implementation.

Analysis:
  Despite the ambiguity of the swap method, it seems likely that in practice most
  systems could use a generic method, and on systems that support swap-copy this
  method would be very high performance. Additionally, it appears to be the case
  that swap-copy behavior is preferred on modern graphics cards.


High-level architecture
--
We leave the tender world of the swapbuffer for a moment to consider what
high-level reforms to the interface/drawing architecture are needed. 

The main task is getting rid of the frontbuffer drawing code. There are
currently 44 references to glDrawBuffer(GL_FRONT), and somewhat more than
that in indirect calls, for example to the xor line drawing or to functions
that update text on headers. Each of these needs to be converted to
back-buffer drawing using the new scheme of choice.

In order to accomplish some other goals at the same time, I propose to do
this by making all drawing stateful and centralizing all of the drawing
code. This has some implications for the way the event subloops are
written, but does not require the elimination of all of them (a moderately
sized and potentially error-prone task). However, modifying the drawing
code in this way will make it easier to eliminate these subloops in the
future. In particular, the drawing code should not need to be changed, and
part of the state that is in the event subloops will already have been
moved into other structures.

The main idea is that callbacks, event handlers, etc. will request update
of regions of the screen, which would typically be handled in an event based
fashion, but the event subloops can make a direct call to the function that
would handle drawing these regions of the screen when necessary.

This drawing function will serve as the centralized drawing function. No
drawing should ever be performed outside of the call tree of this function.
This will also eliminate a lot of the messiness associated with curarea,
areawinset, and all those other friendly fiends.
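
As a rough illustration of the request-then-draw idea, a minimal sketch could
look like the following. All names here (Region, Screen, screen_request_redraw,
screen_draw_pending) are hypothetical and are not Blender's actual structures.
The swap counter merely stands in for a call to swapbuffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REGIONS 16

/* Hypothetical screen region; not Blender's actual area structures. */
typedef struct Region {
    bool needs_redraw;
    int draw_count;                 /* instrumentation for the example */
    void (*draw)(struct Region *);  /* draws the region from stored state */
} Region;

typedef struct Screen {
    Region regions[MAX_REGIONS];
    int num_regions;
    int swap_count;                 /* stands in for calls to swapbuffers */
} Screen;

/* Handlers and callbacks call this instead of drawing anything themselves. */
void screen_request_redraw(Screen *screen, int index)
{
    assert(screen != NULL && index >= 0 && index < screen->num_regions);
    screen->regions[index].needs_redraw = true;
}

/* The one centralized drawing function. The main loop calls it after
 * handling events, and an event subloop may call it directly when it needs
 * the screen updated. Returns the number of regions redrawn. */
int screen_draw_pending(Screen *screen)
{
    int i, redrawn = 0;

    assert(screen != NULL);
    for (i = 0; i < screen->num_regions; i++) {
        Region *region = &screen->regions[i];
        if (region->needs_redraw) {
            assert(region->draw != NULL);
            region->draw(region);
            region->needs_redraw = false;
            redrawn++;
        }
    }
    if (redrawn > 0) {
        screen->swap_count++;       /* one swap per batch of redraws */
    }
    return redrawn;
}

/* Example draw callback that only counts how often it runs. */
void region_draw_counting(Region *region)
{
    region->draw_count++;
}
```

Repeated requests for the same region collapse into a single redraw, and a
pass with nothing pending performs no swap at all.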

I have a sketch of a framework for this system here.
It also makes some accommodations for handling backing store for menus (saveunder
in the current codebase).

More on this to come pending experimentation...


Other note
--
One final note: once the drawing code has been moved to this new system,
it is actually no longer that hard to switch to a multiple context system.
The main work really just becomes extending GHOST to support multiple contexts,
and modifying the windowing code to deal with resizing contexts, etc.

This may turn out to be the nicest way to solve our problems with swapbuffers,
although it still would be nice to have support for a real subbuffer copy, in
order to support partial redraw (mostly for interface components), but also
for things like vertex selection for mesh editing.