Running with Code Like with scissors, only more dangerous

9 Jan 2012

Revealing Prototype Pattern: Pros and Cons

Posted by Rob Paveza

A short while ago, I wrote a post generally saying good things about the Revealing Prototype Pattern but mostly focused on tearing down the other part that was presented with it, namely the way that variables were declared in a chain separated by the comma operator. This post will discuss some of the pros and cons of using this pattern, and give some examples of when you should or shouldn't use it.

Advantages

As Dan notes in his post, this features a significant improvement over straight prototype assignment (assignment of an object literal to a prototype), in that private visibility is supported. And because you're still assigning an object to the prototype, you are able to take advantage of prototypal inheritance. (Next time: How prototypal inheritance really works and how it can make your head explode).

Except for chaining his variable declarations, I have to admit Dan had me pretty well sold on the Revealing Prototype Pattern. It's very elegant; it provides good protection to its inner variables, and it uses a syntax we've been seeing more and more of ever since jQuery became a popular platform.

Unfortunately, it has some nasty drawbacks.

Disadvantages

To be fair, Dan lists some of the disadvantages about this pattern; however, he doesn't quite list them as such, and I think he was possibly unaware of some of their implications:

There's something interesting that happens with variables though, especially if you plan on creating more than one Calculator object in a page. Looking at the public functions you'll see that the 'this' keyword is used to access the currNumberCtl and eqCtl variables defined in the constructor. This works great since the caller of the public functions will be the Calculator object instance which of course has the two variables defined. However, when one of the public functions calls a private function such as setVal(), the context of 'this' changes and you'll no longer have access to the two variables.

The first time I read through that I glossed over the problems; I didn't quite understand the issue until I wrote some code. So let's do that - we'll implement the Java StringTokenizer class:

function StringTokenizer(srcString, delim)
{
    if (typeof srcString === 'undefined')
        throw new ReferenceError("Parameter 0 'srcString' is required.");
    if (typeof srcString !== 'string')
        srcString = srcString.toString();
    if (typeof delim !== 'string')
        delim = ' ';
    
    if (!(this instanceof StringTokenizer))    // enforce constructor usage
        return new StringTokenizer(srcString, delim);
        
    this.sourceString = srcString;
    this.delimiter = delim;
}
StringTokenizer.prototype = (function()
{
    var that = this;
    
    var _tokens = that.sourceString.split(that.delimiter);
    var _index = 0;
    
    var _countTokens = function() { return _tokens.length; };
    var _hasMoreTokens = function() { return _index < _tokens.length; };
    var _nextToken = function()
    {
        if (!_hasMoreTokens())
            return false;
        
        var next = _tokens[_index];
        _index += 1;
        return next;
    };
    var _reset = function() { _index = 0; };
    
    var resultPrototype = 
    {
        countTokens: _countTokens,
        hasMoreTokens: _hasMoreTokens,
        nextToken: _nextToken,
        reset: _reset
    };
    return resultPrototype;
})();

If you've ever written a jQuery plugin, you'll probably recognize what I did with the prototype assignment function; when writing jQuery plugins, it's common to close over the current instance of the jQuery object by assigning var that = $(this); so that you can write event-handler functions without losing access to the overall context. Unfortunately, what I did in this case is wrong; you may already see why.

var that = this;

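In this context, this is a reference to the global object (or undefined in strict mode), not to the instance of the object, even though the prototype is being set: the function runs exactly once, when the prototype is assigned, before any instance exists. This is a generalization of what Dan said. Rewriting the code to work around the problem results in information leaking: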
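A minimal sketch of why this fails, using a hypothetical Widget constructor that is not part of the original example. The function that builds the prototype runs a single time, when the assignment executes, so whatever it sees as this cannot be any particular instance:

```javascript
// Hypothetical example: Widget and capturedThis are illustrative names only.
var capturedThis = [];

function Widget(name) {
    this.name = name;
}
Widget.prototype = (function () {
    // Runs once, at definition time, before any Widget exists.
    capturedThis.push(this);
    return {
        // Called later as instance.getName(), so `this` is the instance here.
        getName: function () { return this.name; }
    };
})();

var a = new Widget("a");
var b = new Widget("b");

// The prototype-building function ran a single time, and the value it saw
// as `this` is neither instance (it is the global object, or undefined in
// strict mode).
var ranOnce = capturedThis.length === 1;
var sawInstance = capturedThis[0] === a || capturedThis[0] === b;
```

Functions returned from the closure still see the right this, but only because they are invoked later through an instance; nothing inside the closure itself can be per-instance.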

function StringTokenizer(srcString, delim)
{
    if (typeof srcString === 'undefined')
        throw new ReferenceError("Parameter 0 'srcString' is required.");
    if (typeof srcString !== 'string')
        srcString = srcString.toString();
    if (typeof delim !== 'string')
        delim = ' ';
    
    if (!(this instanceof StringTokenizer))    // enforce constructor usage
        return new StringTokenizer(srcString, delim);
        
    this.sourceString = srcString;
    this.delimiter = delim;
    this.tokens = srcString.split(delim);
    this.index = 0;
}
StringTokenizer.prototype = (function()
{
    var _countTokens = function() { return this.tokens.length; };
    var _hasMoreTokens = function() { return this.index < this.tokens.length; };
    var _nextToken = function()
    {
        if (!this.hasMoreTokens())
            return false;
        
        var next = this.tokens[this.index];
        this.index += 1;
        return next;
    };
    var _reset = function() { this.index = 0; };
    
    var resultPrototype = 
    {
        countTokens: _countTokens,
        hasMoreTokens: _hasMoreTokens,
        nextToken: _nextToken,
        reset: _reset
    };
    return resultPrototype;
})();

The code works correctly; but you can see that we have to make public all of the state variables we'll use in the constructor. (The alternatives are to either initialize the state variables in each function, where they would still be public; or to create an init function, which would still cause the variables to be public AND would require the user to know to call the init function before calling anything else).
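To make the leak concrete, here is a cut-down sketch of the same shape (LeakyTokenizer is a hypothetical name, and the argument checking from the full example is omitted). Because the state lives on the instance, any caller can overwrite it:

```javascript
// Hypothetical, trimmed-down version of the tokenizer above.
function LeakyTokenizer(src, delim) {
    this.tokens = String(src).split(delim || " ");
    this.index = 0;
}
LeakyTokenizer.prototype = (function () {
    var hasMoreTokens = function () { return this.index < this.tokens.length; };
    var nextToken = function () {
        if (!this.hasMoreTokens())
            return false;
        var next = this.tokens[this.index];
        this.index += 1;
        return next;
    };
    return { hasMoreTokens: hasMoreTokens, nextToken: nextToken };
})();

var t = new LeakyTokenizer("a b c");
var first = t.nextToken();            // "a"

// Nothing stops outside code from reaching into the "private" state.
t.index = 99;
var afterTamper = t.hasMoreTokens();  // false, although two tokens were never read
```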

Dan also indicated that you needed a workaround for private functions:

There are a few tricks that can be used to deal with this, but to work around the context change I simply pass “this” from the public functions into the private functions.

Personally, I prefer to try to avoid things one might call clever or tricky, because that's code for "so complex you can't understand it". But even a function that is exposed publicly will throw an error if you call it by its private name instead of through this. The error is nonintuitive and could send you on a long bug hunt. Consider this change to the above code:

    var _hasMoreTokens = function() { return this.index < this.tokens.length; };
    var _nextToken = function()
    {
        if (!_hasMoreTokens())   // changed from:   if (!this.hasMoreTokens())
            return false;
        
        var next = this.tokens[this.index];
        this.index += 1;
        return next;
    };

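Simply removing the 'this' reference in the caller is enough to change what 'this' means inside the _hasMoreTokens function: called as a plain function, its 'this' is the global object (or undefined in strict mode) rather than the tokenizer, so this.tokens is undefined and the call throws a TypeError. This is completely unintuitive behavior for developers who grew up in the classical inheritance model.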
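The sketch below reproduces the failure on a trimmed-down tokenizer and shows two possible workarounds: invoking the private function with call, and passing the instance as an argument (roughly the approach Dan describes). Tok, brokenNext, nextViaCall, nextViaArg and hasMoreOf are hypothetical names introduced only for this illustration:

```javascript
// Hypothetical, trimmed-down tokenizer used only to show the `this` problem.
function Tok(src) {
    this.tokens = String(src).split(" ");
    this.index = 0;
}
Tok.prototype = (function () {
    var hasMore = function () { return this.index < this.tokens.length; };
    // Variant that takes the instance explicitly instead of relying on `this`.
    var hasMoreOf = function (self) { return self.index < self.tokens.length; };

    var brokenNext = function () {
        if (!hasMore())                 // plain call: `this` is not the instance
            return false;
        return this.tokens[this.index++];
    };
    var nextViaCall = function () {
        if (!hasMore.call(this))        // explicitly supply `this`
            return false;
        return this.tokens[this.index++];
    };
    var nextViaArg = function () {
        if (!hasMoreOf(this))           // pass the instance as a parameter
            return false;
        return this.tokens[this.index++];
    };
    return { brokenNext: brokenNext, nextViaCall: nextViaCall, nextViaArg: nextViaArg };
})();

var tok = new Tok("x y");

var brokenError = null;
try { tok.brokenNext(); } catch (e) { brokenError = e; }  // a TypeError

var viaCall = tok.nextViaCall();    // "x"
var viaArg = tok.nextViaArg();      // "y"
var exhausted = tok.nextViaCall();  // false
```

Both workarounds function, but each private call site has to remember to do one or the other, which is exactly the kind of trap described above.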

Alternatives

I wouldn't want to raise all of these objections without giving you an alternative. The alternative I present here is one in which the entire object is populated in the constructor:

"use strict";
function StringTokenizer(srcString, delim)
{
    if (typeof srcString === 'undefined')
        throw new ReferenceError("Parameter 0 'srcString' is required.");
    if (typeof srcString !== 'string')
        srcString = srcString.toString();
    if (typeof delim !== 'string')
        delim = ' ';
    
    if (!(this instanceof StringTokenizer))    // enforce constructor usage
        return new StringTokenizer(srcString, delim);
        
    if (typeof Object.defineProperty !== 'undefined')
    {
        Object.defineProperty(this, 'sourceString', { value: srcString });
        Object.defineProperty(this, 'delimiter', { value: delim });
    }
    else
    {
        this.sourceString = srcString;
        this.delimiter = delim;
    }
    
    var _tokens = this.sourceString.split(this.delimiter);
    var _index = 0;
    
    var _countTokens = function() { return _tokens.length; };
    var _hasMoreTokens = function() { return _index < _tokens.length; };
    var _nextToken = function()
    {
        if (!_hasMoreTokens())
            return false;
        
        var next = _tokens[_index];
        _index += 1;
        return next;
    };
    var _reset = function() { _index = 0; };
    
    if (typeof Object.defineProperty !== 'undefined')
    {
        Object.defineProperty(this, 'countTokens', { value: _countTokens });
        Object.defineProperty(this, 'hasMoreTokens', { value: _hasMoreTokens });
        Object.defineProperty(this, 'nextToken', { value: _nextToken });
        Object.defineProperty(this, 'reset', { value: _reset });
    }
    else
    {
        this.countTokens = _countTokens;
        this.hasMoreTokens = _hasMoreTokens;
        this.nextToken = _nextToken;
        this.reset = _reset;
    }
}

The advantage of a structure like this one is that you always have access to this. (Note that this example is unnecessarily large because I've taken the additional step of protecting the properties with Object.defineProperty where it is supported). You always have access to private variables and you always have access to the state. The unfortunate side effect of this strategy is that it doesn't take advantage of prototypal inheritance (it's not that you can't do it with this strategy - more on that in a future post) and that the entire private and public states (including functions) are closed-over, so you use more memory. Although, one may ask: is that really such a big deal in THIS sample?
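The memory trade-off can be seen in a small sketch (PerInstance and Shared are hypothetical constructors, not part of the tokenizer). Functions created inside a constructor are new objects for every instance, while functions placed on the prototype are shared:

```javascript
// Hypothetical constructors contrasting the two strategies.
function PerInstance() {
    // A fresh function object (and closure) is created on every construction.
    this.greet = function () { return "hi"; };
}

function Shared() {}
// One function object, reached by every instance through the prototype chain.
Shared.prototype.greet = function () { return "hi"; };

var p1 = new PerInstance(), p2 = new PerInstance();
var s1 = new Shared(), s2 = new Shared();

var perInstanceIsShared = (p1.greet === p2.greet);  // false: two distinct functions
var prototypeIsShared = (s1.greet === s2.greet);    // true: the same function
```

For a handful of tokenizers the difference is negligible; it starts to matter when many thousands of objects are alive at once.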

Usage Considerations

The Revealing Prototype Pattern can be a good pattern to follow if you're less concerned with maintaining data integrity and state. You have to be careful when accessing non-public data and functions with it, but it's pretty elegant; and if you're working on a lot of objects, you have the opportunity to save on some memory usage by delegating the function definitions into the prototype rather than the specific object definition. It falls short, though, when trying to emulate classical data structures and enforce protection mechanisms. As such, it can require complicated or clever tricks to work around its shortcomings, which can ultimately lead to overly-complex or difficult-to-maintain code.

Like most patterns, your mileage may vary.

15 Mar 2011

The Microsoft Reactive Extensions

Posted by Rob Paveza

The honest truth is that I’m having difficulty establishing exactly what they could be used for, but they’re still really cool.  The Microsoft Reactive Extensions for the .NET Framework are the dual of LINQ: whereas LINQ operates over objects, or you might say pulls objects out of collections, the Reactive Extensions (Rx) handle push notifications.  It is the ultimate generalization of events and event handling within .NET.

Getting There

First, let’s consider the normal interfaces for IEnumerable:

interface IEnumerable<T>
{
    IEnumerator<T> GetEnumerator();
}

interface IEnumerator<T> : IDisposable
{
    T Current { get; }  // throws exception at end of enumeration
    bool MoveNext();
}

These interfaces (okay, really, the non-generic IEnumerable interface, but let’s not split hairs) are the foundation of the foreach C# keyword (and the For Each… In in Visual Basic).  A foreach can also be written, roughly, as:

foreach (string str in myListOfStrings)
    Console.WriteLine(str);
// rewritten:
using (IEnumerator<string> enumStr = myListOfStrings.GetEnumerator())
{
    while (enumStr.MoveNext())
    {
        Console.WriteLine(enumStr.Current);
    }
}

Keep this example in mind for later, because we’ll revisit how this can be used in Rx programming.

Dualism

Dualism is something of a mathematical concept, and I don’t want to get into it because I don’t completely understand it myself, but most nerdy people reading my blog will probably appreciate an example from particle physics.  Consider a proton: its physical dual is the antiproton, because when they meet they annihilate each other.  (It’s not an electron: while the two have opposite charge, they have substantially different mass.)

The core of Rx is the dual of IEnumerable.  That is, IObservable<T> and IObserver<T>.  But let’s deconstruct these piece by piece.  Let’s start at IEnumerator<T>:

interface IObserver<T>
{
    // T Current { get; }
    // That method looks like: T get_Current();
    void OnNext(T next);
    // Current throws an exception if MoveNext() previously returned false, so:
    void OnError(Exception error);

    // bool MoveNext() 
    // returns true while Current is populated, false when we reach the end, so:
    void OnCompleted();
}

You can see that, whereas everything in IEnumerator<T> pulled data, now we’ve transitioned into pushing data.  But the observer isn’t really the cool part; rather, it’s the subject that’s cool:

interface IObservable<T>
{
    // GetEnumerator() returned an object; here we pass one in
    // We still needed to capture the disposable functionality, so we return IDisposable
    IDisposable Subscribe(IObserver<T> observer);
}
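To make the push model concrete, here is a hand-rolled sketch in JavaScript (the language of the earlier examples on this page). It is not the Rx API: createSubject, publish and complete are hypothetical names, and only the shape of subscribe/onNext/onCompleted/dispose mirrors the interfaces above.

```javascript
// Hypothetical minimal subject: keeps a list of observers and pushes values to them.
function createSubject() {
    var observers = [];
    return {
        // Mirrors IObservable<T>.Subscribe: takes an observer, returns a disposable.
        subscribe: function (observer) {
            observers.push(observer);
            return {
                dispose: function () {
                    var i = observers.indexOf(observer);
                    if (i !== -1)
                        observers.splice(i, 1);
                }
            };
        },
        // Push a value to every current observer.
        publish: function (value) {
            observers.slice().forEach(function (o) { o.onNext(value); });
        },
        // Signal the end of the sequence and drop all observers.
        complete: function () {
            observers.slice().forEach(function (o) { o.onCompleted(); });
            observers.length = 0;
        }
    };
}

var log = [];
var subject = createSubject();
var subscription = subject.subscribe({
    onNext: function (value) { log.push("next:" + value); },
    onError: function (error) { log.push("error:" + error); },
    onCompleted: function () { log.push("completed"); }
});

subject.publish("a");      // delivered
subscription.dispose();    // unsubscribe, like disposing the IDisposable
subject.publish("b");      // no longer delivered
```

The consumer never asks for the next value; it simply gets called, and disposing the subscription is how it stops listening.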

Now, if you want to see the specifics about how these were constructed, you can check out the Expert-to-Expert video on Channel 9.  I’ve included some high-level notes, but they’re not really as deep as you can get with these guys.

Creating a Subject

Creating a subject is a bit of a challenge; subjects are event-driven, and those are generally kind of difficult to think about because they usually fit into only one of two buckets: user interaction and system I/O.  For the sake of example, I’ve created a simple Windows Forms project to start with that has a couple of observable Buttons (the class is called ObservableButton, go figure), and an observer, which is the containing form.  You can download the starter project, which requires Visual Studio 2010 and the Rx Framework.

Subjects can be anything, though, and the power you can glean from these is amazing.  For the Red Bull NASCAR team, I created a server for a Twitter feed aggregator using Rx.  It started as reading a socket into HTTP data, then into chunked HTTP data, then into JSON packets, then into POCO objects that were then re-serialized and sent over the wire to N Flash clients.  As you can imagine, network programming, social programming, or any other kind of programming where an event is coming in unpredictably is a great candidate for this.  Why?

Let’s look at the use case I just listed.  As Twitter’s live stream service sends data over the wire, I need to parse it and send it to a lot of listening sockets.  But I don’t want to just say “Oh I just got the data, let me send it out again” – that would possibly slow down processing on other threads, because I might have to wait – my socket might already be in the process of sending data and so it’s in an invalid state to send further data.  If I had tied a server socket directly to the “I’m ready to send” signal, I would have been in trouble.  Rather, I had a utility (an Observer) that aggregated incoming messages until all server sockets were ready to send, at which point it would push those updated messages to the server sockets.

Let’s look at the sample program:

[Image: the sample program]

This isn’t really anything spectacular.  I could have done that with regular event handlers.

Aggregating Subjects

The magic of Rx, from my perspective, lies with what you can do with subjects.  I’m no longer initializing my constructor to require two lines – I’m merging the two buttons into one observable sequence:

        public Form1()
        {
            InitializeComponent();

            observableButton1.Merge(observableButton2).Subscribe(this);
        }

The result is identical – the events get handled and all is good.

Modifying Sequences

Now I’m going to change the class definition slightly:

    public partial class Form1 : Form, IObserver<Timestamped<string>>
    {
        public Form1()
        {
            InitializeComponent();

            observableButton1.Merge(observableButton2).Timestamp().Subscribe(this);
        }

        public void OnNext(Timestamped<string> value)
        {
            this.textBox1.Text += value.Timestamp.ToString("hh:mm tt   ") + value.Value + Environment.NewLine;
        }

        public void OnError(Exception error)
        {
            this.textBox1.Text += "Exception caught: " + Environment.NewLine + error.ToString() + Environment.NewLine;
        }

        public void OnCompleted()
        {
            this.textBox1.Text += "Sequence completed." + Environment.NewLine;
        }
    }

Note that by adding in the .Timestamp() call, I’ve transformed the observable sequence of strings into an observable sequence of timestamped strings.  That’s pretty cool, right?

This is even cooler: the Delay() method:

observableButton1.Merge(observableButton2).Timestamp()
                .Delay(new TimeSpan(0, 0, 1)).ObserveOn(this).Subscribe(this);

The ObserveOn method accepts a Windows Forms control, a Dispatcher (for WPF), or other scheduler implementation that can be used to synchronize the delay.  If I didn’t include it, the delayed merge would be called on a different thread, and we’d get an InvalidOperationException (because you can’t update a window on a thread other than the thread that created it). 

Do you want to avoid repetition?

            observableButton1.Merge(observableButton2).Timestamp()
                .DistinctUntilChanged(ts => ts.Value).Subscribe(this);

This produced output that only emitted one message, no matter how many times I clicked the same button, until I clicked the other button.
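The filtering behaviour described here can be sketched by hand in JavaScript (again, not the Rx implementation; distinctUntilChanged below is a hypothetical helper that wraps an observer, and the button names are made up). A value is forwarded only when its key differs from the key of the previous value:

```javascript
// Hypothetical helper: wraps an observer and drops consecutive values with the same key.
function distinctUntilChanged(observer, keySelector) {
    var hasLast = false;
    var lastKey;
    return {
        onNext: function (value) {
            var key = keySelector(value);
            if (hasLast && key === lastKey)
                return;                 // same key as the previous value: suppress
            hasLast = true;
            lastKey = key;
            observer.onNext(value);
        },
        onError: function (error) { observer.onError(error); },
        onCompleted: function () { observer.onCompleted(); }
    };
}

var seen = [];
var filtered = distinctUntilChanged(
    {
        onNext: function (ts) { seen.push(ts.value); },
        onError: function () {},
        onCompleted: function () {}
    },
    function (ts) { return ts.value; }  // compare on the value, ignore the timestamp
);

// Simulated clicks: the timestamp differs every time, the value only sometimes.
["Button 1", "Button 1", "Button 2", "Button 2", "Button 1"].forEach(function (name, i) {
    filtered.onNext({ timestamp: i, value: name });
});
// seen is now ["Button 1", "Button 2", "Button 1"]
```

The key selector matters: without it every timestamped value would look distinct, because the timestamps always differ.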

So, What Can We Do?

Well, right now it doesn’t seem like there’s a lot of tooling for Rx.  There’s a community wiki around the framework, though, and I think that we can eventually see a lot of good use.

Some ideas:

  • Develop a way to completely repeat ASP.NET requests.  Treat IIS as an IObservable<AspNetRequest>, where AspNetRequest contains all the state data that would otherwise populate these tools, which would immensely help with debugging.  Imagine when your tester only needs to record a series of test cases once, and otherwise is just testing for UI errors.
  • Wrap event-oriented APIs for simplified logging and replaying.  (In JinxBot, an event-oriented chat API named for my cat, I always wanted to capture all the events of the core API and be able to replay them via a subclass, which would have allowed pixel-perfect replay of a chat session).
  • Handle periodic data services like Twitter, SMS, email, or others in a clean and efficient way.

I’d like to see this take off, but it’s a very different way of looking at programming than what most .NET developers are used to.  Enjoy it, take a look, and let’s build it up!

10 Aug 2010

C# 4.0 and Dynamic Programming @ AZDNUG

Posted by Rob Paveza

I’m officially posting my code samples and slides BEFORE the presentation tonight so that I can actually claim that they’re online.  They are downloadable at http://robpaveza.net/pub/dynamics-presentation.zip – includes the project folder, the “building-up-the-code” part (including the IronPython code), and the slides.

The project requires Visual C# 2010 to open.  Executing the Python code will require IronPython 2.6 for .NET 4.0 (available here).  You may need to modify the paths in the Python code to make it work exactly right as it relies on some paths, and in fact, the Python code won’t run until you’ve built an executable for the Cs4Dynamics project.