Friday, June 8, 2012

Deprecating the Observer Pattern

This is the title of the 2010 paper by Martin Odersky (Scala creator, Java compiler author), Ingo Maier and Tiark Rompf. It shows how the ideas of reactive programming we discussed in the previous article can be implemented in Scala. Here is an excerpt from it:


Programming interactive systems by means of the observer pattern is hard and error-prone yet is still the implementation standard in many production environments. We present an approach to gradually deprecate observers in favor of reactive programming abstractions. Several library layers help programmers to smoothly migrate existing code from callbacks to a more declarative programming model. Our central high-level API layer embeds an extensible higher-order data-flow DSL into our host language. This embedding is enabled by a continuation passing style transformation.

General Terms: Design, Languages
Keywords: data-flow language, reactive programming, user interface programming, Scala

1. Introduction

Over the past decades, we have seen a continuously increasing demand in interactive applications, fueled by an ever growing number of non-expert computer users and increasingly multimedia capable hardware. In contrast to traditional batch mode programs, interactive applications require a considerable amount of engineering to deal with continuous user input and output. Yet, our programming models for user interfaces and other kinds of continuous state interactions have not changed much. The predominant approach to deal with state changes in production software is still the observer pattern [25]. We hence have to ask: is it actually worth bothering?

For an answer on the status quo in production systems, we quote an Adobe presentation from 2008 [43]:
  • 1/3 of the code in Adobe’s desktop applications is devoted to event handling logic
  • 1/2 of the bugs reported during a product cycle exist in this code
Our thesis is that these numbers are bad for two reasons. First, we claim that we can reduce event handling code by at least a factor of 3 once we replace publishers and observers with more appropriate abstractions. Second, the same abstractions should help us to reduce the bug ratio in user interface code to bring it at least on par with the rest of the application code. In fact, we believe that event handling code on average should be one of the least error-prone parts of an application.

But we need to be careful when we are talking about event handling code or logic. With these terms, we actually mean code that deals with a variety of related concepts such as continuous data synchronization, reacting to user actions, programming with futures and promises [33] and any blend of these. Event handling is merely a common means to implement those matters, and the usual abstractions that are employed in event handling code are callbacks such as in the observer pattern.

To illustrate the precise problems of the observer pattern, we start with a simple and ubiquitous example: mouse dragging. The following example traces the movements of the mouse during a drag operation in a path object and displays it on the screen. To keep things simple, we use Scala closures as observers.
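
var path: Path = null
val moveObserver = { (event: MouseEvent) =>
  path.lineTo(event.position)
  draw(path)
}
control.addMouseDownObserver { event =>
  path = new Path(event.position)
  control.addMouseMoveObserver(moveObserver)
}
control.addMouseUpObserver { event =>
  control.removeMouseMoveObserver(moveObserver)
  path.close()
  draw(path)
}
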
The above example, and as we will argue the observer pattern as defined in [25] in general, violates an impressive line-up of important software engineering principles:

Side-effects
Observers promote side-effects. Since observers are stateless, we often need several of them to simulate a state machine as in the drag example. We have to save the state where it is accessible to all involved observers such as in the variable path above.
Encapsulation
As the state variable path escapes the scope of the observers, the observer pattern breaks encapsulation.
Composability
Multiple observers form a loose collection of objects that deal with a single concern (or multiple, see next point). Since multiple observers are installed at different points at different times, we can’t, for instance, easily dispose of them all together.
Separation of concerns
The above observers not only trace the mouse path but also call a drawing command, or more generally, include two different concerns in the same code location. It is often preferable to separate the concerns of constructing the path and displaying it, e.g., as in the model-view-controller (MVC) [30] pattern.
Scalability
We could achieve a separation of concerns in our example by creating a class for paths that itself publishes events when the path changes. Unfortunately, there is no guarantee for data consistency in the observer pattern. Let us suppose we would create another event publishing object that depends on changes in our original path, e.g., a rectangle that represents the bounds of our path. Also consider an observer listening to changes in both the path and its bounds in order to draw a framed path. This observer would manually need to determine whether the bounds are already updated and, if not, defer the drawing operation. Otherwise the user could observe a frame on the screen that has the wrong size (a glitch).
Uniformity
Different methods to install different observers decrease code uniformity.
Abstraction
There is a low level of abstraction in the example. It relies on a heavyweight interface of a control class that provides more than just specific methods to install mouse event observers. Therefore, we cannot abstract over the precise event sources. For instance, we could let the user abort a drag operation by hitting the escape key or use a different pointer device such as a touch screen or graphics tablet.
Resource management
An observer’s life-time needs to be managed by clients. Because of performance reasons, we want to observe mouse move events only during a drag operation. Therefore, we need to explicitly install and uninstall the mouse move observer and we need to remember the point of installation (control above).
Semantic distance
Ultimately, the example is hard to understand because the control flow is inverted, which results in too much boilerplate code that increases the semantic distance between the programmer’s intention and the actual code.
Mouse dragging, which already comes in large varieties, is just an example of the more general set of input gesture recognition. If we further generalize this to event sequence recognition with (bounded or unbounded) loops, all the problems we mentioned above still remain. Many examples in user interface programming are therefore equally hard to implement with observers, such as selecting a set of items, stepping through a series of dialogs, editing and marking text – essentially every operation where the user goes through a number of steps.
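
The data consistency problem described above (the framed path drawn with stale bounds) can be reproduced in a few lines. The following is only a sketch under stated assumptions: Publisher, PathModel, BoundsModel and run are hypothetical names invented for this illustration, they are not part of the paper or of Scala.React, and the publisher deliberately notifies the newest observer first, since the observer pattern does not prescribe a notification order.

```scala
object GlitchDemo {
  final case class Point(x: Int, y: Int)
  final case class Rect(width: Int, height: Int)

  // Hypothetical minimal publisher. The newest observer is notified first;
  // the observer pattern itself does not prescribe any notification order.
  class Publisher[A] {
    private var observers = List.empty[A => Unit]
    def subscribe(f: A => Unit): Unit = { observers = f :: observers }
    protected def publish(a: A): Unit = observers.foreach(f => f(a))
  }

  // Model object that publishes its points whenever a segment is added.
  class PathModel extends Publisher[List[Point]] {
    private var pts = List.empty[Point]
    def points: List[Point] = pts
    def lineTo(p: Point): Unit = { pts = pts :+ p; publish(pts) }
  }

  // Dependent publisher: recomputes the bounding box when the path changes.
  class BoundsModel(path: PathModel) extends Publisher[Rect] {
    private var rect = Rect(0, 0)
    def current: Rect = rect
    path.subscribe { pts =>
      val xs = pts.map(_.x)
      val ys = pts.map(_.y)
      rect = Rect(xs.max - xs.min, ys.max - ys.min)
      publish(rect)
    }
  }

  // Records (number of points, frame) for every "draw" of a framed-path view.
  def run(): List[(Int, Rect)] = {
    val log = scala.collection.mutable.ListBuffer.empty[(Int, Rect)]
    val path = new PathModel
    val bounds = new BoundsModel(path)
    def draw(): Unit = { log += ((path.points.size, bounds.current)) }
    path.subscribe(_ => draw())   // view listens to the path ...
    bounds.subscribe(_ => draw()) // ... and to its bounds
    path.lineTo(Point(0, 0))
    path.lineTo(Point(4, 3))
    log.toList
  }
}
```

In this sketch the third recorded draw pairs a two-point path with the old frame Rect(0,0), and only the fourth draw shows the correct frame Rect(4,3). The view has no way to tell the two apart without extra bookkeeping.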

1.1 Contributions and Overview

Our contributions are:
  • We show how to integrate composable reactive programming abstractions into a statically typed programming language that solve the problems of the observer pattern. To our knowledge, Scala.React is the first system that provides several API layers allowing programmers to stepwise port observer-based code to a data-flow programming model.
  • We demonstrate how an embedded, extensible data-flow language provides the central foundation for a composable variant of observers. It further allows us to easily express first-class events and time-varying values whose precise behavior changes over time.
  • The embedded data-flow language can make use of the whole range of expressions from our host language without explicit lifting. We show how this can be achieved by the use of delimited continuations in the implementation of our reactive programming DSL.
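
To make "first-class events" and "time-varying values" more concrete, here is a minimal sketch of what such abstractions might look like. It is not Scala.React's actual API: Events, EventSource, Signal, hold and demo are hypothetical names chosen for this illustration, and the sketch ignores glitch freedom and resource disposal entirely. It reuses the scenario from the introduction, where a drag may end either with a mouse-up or with the escape key.

```scala
object FirstClassEvents {
  // Hypothetical first-class event stream that can be passed around and composed.
  class Events[A] {
    private var handlers = Vector.empty[A => Unit]
    def subscribe(h: A => Unit): Unit = { handlers = handlers :+ h }
    protected def fire(a: A): Unit = handlers.foreach(h => h(a))

    def map[B](f: A => B): Events[B] = {
      val out = new EventSource[B]
      subscribe(a => out.emit(f(a)))
      out
    }
    def filter(p: A => Boolean): Events[A] = {
      val out = new EventSource[A]
      subscribe(a => if (p(a)) out.emit(a))
      out
    }
    def merge(that: Events[A]): Events[A] = {
      val out = new EventSource[A]
      this.subscribe(a => out.emit(a))
      that.subscribe(a => out.emit(a))
      out
    }
    // Turns the stream into a time-varying value holding the latest event.
    def hold(initial: A): Signal[A] = {
      val s = new Signal(initial)
      subscribe(a => s.set(a))
      s
    }
  }

  class EventSource[A] extends Events[A] {
    def emit(a: A): Unit = fire(a)
  }

  class Signal[A](initial: A) {
    private var current = initial
    def now: A = current
    private[FirstClassEvents] def set(a: A): Unit = { current = a }
  }

  // A drag ends on mouse-up ("commit") or on the escape key ("abort").
  def demo(): List[String] = {
    val mouseUp = new EventSource[Unit]
    val keys = new EventSource[String]
    val endDrag = mouseUp.map(_ => "commit")
      .merge(keys.filter(_ == "Escape").map(_ => "abort"))
    val status = endDrag.hold("dragging")
    val seen = scala.collection.mutable.ListBuffer(status.now)
    keys.emit("a"); seen += status.now
    keys.emit("Escape"); seen += status.now
    mouseUp.emit(()); seen += status.now
    seen.toList
  }
}
```

Because the event sources are ordinary values, the code that reacts to endDrag does not need to know whether the event came from the mouse or the keyboard.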

In the following, we start with the status quo of handling events with callbacks and gradually introduce and extract abstractions that eventually address all of the observer pattern issues we identified above. Ultimately, we will arrive at a state where we make efficient use of object-oriented, functional, and data-flow programming principles. Our abstractions fit nicely into an extensible inheritance hierarchy, promote the use of immutable data and let clients react to multiple event sources without inversion of control.


7. Related Work

Some production systems, such as Swing and other components of the Java standard libraries, closely follow the observer pattern as formalized in [25]. Others such as Qt and C# go further and integrate uniform event abstractions as language extensions [32, 40, 47]. F# additionally provides first-class events that can be composed through combinators. The Rx.Net framework can lift C# language-level events to first-class event objects (called IObservables) and provides FRP-like reactivity as a LINQ library [36, 46]. Its precise semantics, e.g. whether glitches can occur, is presently unclear to us. Systems such as JavaFX [42], Adobe Flex [2] or JFace Data Binding [45] provide what we categorize as reactive data binding: a variable can be bound to an expression that evaluates to the result of that expression until it is rebound. In general, reactive data binding systems are pragmatic approaches that usually trade data consistency guarantees (glitches) and first-class reactives for a programming model that integrates with existing APIs. Flex allows embedded Actionscript [37] inside XML expressions to create data bindings. JavaFX’s use of reactivity is tightly integrated into the language and transparent in that ordinary object attributes can be bound to expressions. Similar to Scala.React, JFace Data Binding establishes signal dependencies through thread local variables but Java’s lack of closures leads to a more verbose data binding syntax. JFace is targeted towards GUI programming and supports data validation and integrates with the underlying Standard Widget Toolkit (SWT).
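
The idea of reactive data binding mentioned above, where a variable follows an expression until it is rebound, can be sketched as follows. This is an illustration only and does not reproduce the API of JavaFX, Flex or JFace Data Binding: Observable, Var, Bound and demo are hypothetical names, and dependencies are listed explicitly rather than discovered through thread-local variables.

```scala
object BindingSketch {
  trait Observable {
    def onChange(listener: () => Unit): Unit
  }

  // A mutable source value that notifies listeners on every change.
  class Var[A](initial: A) extends Observable {
    private var value = initial
    private var listeners = Vector.empty[() => Unit]
    def apply(): A = value
    def set(a: A): Unit = { value = a; listeners.foreach(l => l()) }
    def onChange(listener: () => Unit): Unit = { listeners = listeners :+ listener }
  }

  // A value bound to an expression; it is re-evaluated when a dependency changes.
  class Bound[A](initial: A) {
    private var value = initial
    private var generation = 0
    def apply(): A = value
    def bind(deps: Observable*)(expr: => A): Unit = {
      generation += 1
      val mine = generation
      value = expr
      // Listeners of an older binding stay installed but are ignored,
      // which is a simplification (they are never uninstalled).
      deps.foreach(_.onChange(() => if (mine == generation) value = expr))
    }
  }

  def demo(): List[Int] = {
    val width = new Var(10)
    val height = new Var(2)
    val area = new Bound(0)
    val seen = scala.collection.mutable.ListBuffer.empty[Int]
    area.bind(width, height)(width() * height()); seen += area()
    width.set(3); seen += area()
    area.bind(height)(height() + 1); seen += area() // rebinding replaces the expression
    width.set(100); seen += area()                  // width no longer affects area
    height.set(9); seen += area()
    seen.toList
  }
}
```

Like the systems discussed above, this sketch makes no consistency guarantees: a listener that reads several bound values during an update may see a mix of old and new results.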

7.1 Functional Reactive Programming

Scala.React’s composable signals and event streams originate in functional reactive programming (FRP) which goes back to Conal Elliott’s Fran [21].

7.5 Actors

Data-flow reactives and reactors share certain similarities with actors [3, 28], which are used as concurrency abstractions.
In summary, while an actor sends messages to certain actors chosen by itself, it reacts to incoming messages from arbitrary actors. For data-flow reactives, the converse is true: they send messages to the public and react to messages from sources they choose themselves. Both actors and data-flow reactives simulate state machines, i.e., they encapsulate internal state. The major difference is that state transitions and data availability are synchronized among reactives, whereas actors behave as independent units of control.

8. Conclusion

We have demonstrated a new method backed by a set of library abstractions that allows a gradual transition from classical event handling with observers to reactive programming. The key idea is to use a layered API that starts with basic event handling and ends in an embedded higher-order dataflow language. In the introduction, we identified many software engineering principles that the observer pattern violates. To summarize, our system addresses those violations as follows:
We see a similar correspondence between imperative and functional programming in Scala and data-flow and functional reactive programming in Scala.React. Imperative programming is close to the (virtual) machine model and used to implement functional collections and combinators. Dataflow programming in our system is a simple extension to Scala’s imperative core and can be readily used to implement reactive abstractions and combinators as we have shown. Programmers can always revert to reactors and low-level observers in case a data-flow oriented or combinatorial solution is not obvious.

Given our previously identified issues of the observer pattern for which we are now providing a gradual path out of the misery, we have to ask: Is the observer pattern becoming an anti-pattern?


In the next article you can find a practical tutorial for learning reactive programming in the browser with Scala and Google Web Toolkit.
