Running with Code Like with scissors, only more dangerous

9Jan/121

Revealing Prototype Pattern: Pros and Cons

Posted by Rob Paveza

A short while ago, I wrote a post generally saying good things about the Revealing Prototype Pattern but mostly focused tearing down the other part that was presented with it, namely the way that variables were declared in a chain separated by the comma operator. This post will discuss some of the pros and cons of using this pattern, and give some examples about when you should or shouldn't use it.

Advantages

As Dan notes in his post, this features a significant improvement over straight prototype assignment (assignment of an object literal to a prototype), in that private visibility is supported. And because you're still assigning an object to the prototype, you are able to take advantage of prototypal inheritance. (Next time: How prototypal inheritance really works and how it can make your head explode).

Except for chaining his variable declarations, I have to admit Dan had me pretty well sold on the Revealing Prototype Pattern. It's very elegant; it provides good protection to its inner variables, and it uses a syntax we've been seeing more and more of ever since jQuery became a popular platform.

Unfortunately, it has some nasty drawbacks.

Disadvantages

To be fair, Dan lists some of the disadvantages about this pattern; however, he doesn't quite list them as such, and I think he was possibly unaware of some of their implications:

There's something interesting that happens with variables though, especially if you plan on creating more than one Calculator object in a page. Looking at the public functions you'll see that the 'this' keyword is used to access the currNumberCtl and eqCtl variables defined in the constructor. This works great since the caller of the public functions will be the Calculator object instance which of course has the two variables defined. However, when one of the public functions calls a private function such as setVal(), the context of 'this' changes and you'll no longer have access to the two variables.

The first time I read through that I glossed over the problems; I didn't quite understand the issue until I wrote some code. So let's do that - we'll implement the Java StringTokenizer class:

function StringTokenizer(srcString, delim)
{
    if (typeof srcString === 'undefined')
        throw new ReferenceError("Parameter 0 'srcString' is required.");
    if (typeof srcString !== 'string')
        srcString = srcString.toString();
    if (typeof delim !== 'string')
        delim = ' ';
    
    if (!(this instanceof StringTokenizer))    // enforce constructor usage
        return new StringTokenizer(srcString, delim);
        
    this.sourceString = srcString;
    this.delimiter = delim;
}
StringTokenizer.prototype = (function()
{
    var that = this;
    
    var _tokens = that.sourceString.split(that.delimiter);
    var _index = 0;
    
    var _countTokens = function() { return _tokens.length; };
    var _hasMoreTokens = function() { return _index < _tokens.length; };
    var _nextToken = function()
    {
        if (!_hasMoreTokens())
            return false;
        
        var next = _tokens[_index];
        _index += 1;
        return next;
    };
    var _reset = function() { _index = 0; };
    
    var resultPrototype = 
    {
        countTokens: _countTokens,
        hasMoreTokens: _hasMoreTokens,
        nextToken: _nextToken,
        reset: _reset
    };
    return resultPrototype;
})();

If you've ever written a jQuery plugin, you'll probably recognize what I did with the prototype assignment function; when writing jQuery plugins, it's common to close over the current instance of the jQuery object by assigning var that = $(this); so that you can write event-handler functions without losing access to the overall context. Unfortunately, what I did in this case is wrong; you may already see why.

var that = this;

In this context, this is a reference to the global object, not to the instance of the object - even though the prototype is being set. This is a generalization of what Dan said. Rewriting it to overcome it results in information leaking:

function StringTokenizer(srcString, delim)
{
    if (typeof srcString === 'undefined')
        throw new ReferenceError("Parameter 0 'srcString' is required.");
    if (typeof srcString !== 'string')
        srcString = srcString.toString();
    if (typeof delim !== 'string')
        delim = ' ';
    
    if (!(this instanceof StringTokenizer))    // enforce constructor usage
        return new StringTokenizer(srcString, delim);
        
    this.sourceString = srcString;
    this.delimiter = delim;
    this.tokens = srcString.split(delim);
    this.index = 0;
}
StringTokenizer.prototype = (function()
{
    var _countTokens = function() { return this.tokens.length; };
    var _hasMoreTokens = function() { return this.index < this.tokens.length; };
    var _nextToken = function()
    {
        if (!this.hasMoreTokens())
            return false;
        
        var next = this.tokens[this.index];
        this.index += 1;
        return next;
    };
    var _reset = function() { this.index = 0; };
    
    var resultPrototype = 
    {
        countTokens: _countTokens,
        hasMoreTokens: _hasMoreTokens,
        nextToken: _nextToken,
        reset: _reset
    };
    return resultPrototype;
})();

The code works correctly; but you can see that we have to make public all of the state variables we'll use in the constructor. (The alternatives are to either initialize the state variables in each function, where they would still be public; or to create an init function, which would still cause the variables to be public AND would require the user to know to call the init function before calling anything else).

Dan also indicated that you needed a workaround for private functions:

There are a few tricks that can be used to deal with this, but to work around the context change I simply pass “this” from the public functions into the private functions.

Personally, I prefer to try to avoid things one might call clever or tricky, because that's code for "so complex you can't understand it". But even in the case where you have a public function, you'll still get an error if you don't reference it via a public function call. This error is nonintuitive and could otherwise make you go on a bug hunt for a long time. Consider this change to the above code:

    var _hasMoreTokens = function() { return this.index < this.tokens.length; };
    var _nextToken = function()
    {
        if (!_hasMoreTokens())   // changed from:   if (!this.hasMoreTokens())
            return false;
        
        var next = this.tokens[this.index];
        this.index += 1;
        return next;
    };

Simply removing the 'this' reference in the caller is enough to cause 'this' to go out-of-scope in the _hasMoreTokens function. This is completely unintuitive behavior for developers who grew up in the classical inheritance model.

Alternatives

I wouldn't want to give you all of these options without giving you an alternative. The alternative I present here is one in which the entire object is populated in the constructor:

"use strict";
function StringTokenizer(srcString, delim)
{
    if (typeof srcString === 'undefined')
        throw new ReferenceError("Parameter 0 'srcString' is required.");
    if (typeof srcString !== 'string')
        srcString = srcString.toString();
    if (typeof delim !== 'string')
        delim = ' ';
    
    if (!(this instanceof StringTokenizer))    // enforce constructor usage
        return new StringTokenizer(srcString, delim);
        
    if (typeof Object.defineProperty !== 'undefined')
    {
        Object.defineProperty(this, 'sourceString', { value: srcString });
        Object.defineProperty(this, 'delimiter', { value: delim });
    }
    else
    {
        this.sourceString = srcString;
        this.delimiter = delim;
    }
    
    var _tokens = this.sourceString.split(this.delimiter);
    var _index = 0;
    
    var _countTokens = function() { return _tokens.length; };
    var _hasMoreTokens = function() { return _index < _tokens.length; };
    var _nextToken = function()
    {
        if (!_hasMoreTokens())
            return false;
        
        var next = _tokens[_index];
        _index += 1;
        return next;
    };
    var _reset = function() { _index = 0; };
    
    if (typeof Object.defineProperty !== 'undefined')
    {
        Object.defineProperty(this, 'countTokens', { value: _countTokens });
        Object.defineProperty(this, 'hasMoreTokens', { value: _hasMoreTokens });
        Object.defineProperty(this, 'nextToken', { value: _nextToken });
        Object.defineProperty(this, 'reset', { value: _reset });
    }
    else
    {
        this.countTokens = _countTokens;
        this.hasMoreTokens = _hasMoreTokens;
        this.nextToken = _nextToken;
        this.reset = _reset;
    }
}

The advantage of a structure like this one is that you always have access to this. (Note that this example is unnecessarily large because I've taken the additional step of protecting the properties with Object.defineProperty where it is supported). You always have access to private variables and you always have access to the state. The unfortunate side effect of this strategy is that it doesn't take advantage of prototypal inheritance (it's not that you can't do it with this strategy - more of that coming in the future) and that the entire private and public states (including functions) are closed-over, so you use more memory. Although, one may ask: is that really such a big deal in THIS sample?

Usage Considerations

The Revealing Prototype Pattern can be a good pattern to follow if you're less concerned with maintaining data integrity and state. You have to be careful with access non-public data and functions with it, but it's pretty elegant; and if you're working on a lot of objects, you have the opportunity to save on some memory usage by delegating the function definitions into the prototype rather than the specific object definition. It falls short, though, when trying to emulate classical data structures and enforce protection mechanisms. As such, it can require complicated or clever tricks to work around its shortcomings, which can ultimately lead to overly-complex or difficult-to-maintain code.

Like most patterns, your mileage may vary.

8Apr/090

Unsung C# Hero: Closure

Posted by Rob

Today I’m going to talk about a feature of C# that has been around since 2.0 (with the introduction of anonymous delegates) but which gets nearly no lip service and, despite the fact that most C# developers have probably used it, they’ve probably used it without thinking about it.  This feature is called closure, and it refers to the ability of a nested function to make reference to the surrounding function’s variables.

This article will make extensive discussion of how delegates are implemented in C#; a review may be appropriate before diving in.  Also, we’ll be making a lot of use of The Tool Formerly Known as Lutz Roeder’s .NET Reflector, which is now owned by Red Gate Software.

Anonymous Methods without Closure

Supposing that I had a reason to do so, I could assign an event handler as an anonymous method.  I think this is generally bad practice (there is no way to explicitly dissociate the event handler, because it doesn’t have a name), but you can:

    public partial class SampleNoClosure : Form
    {
        public SampleNoClosure()
        {
            InitializeComponent();

            button1.Click += delegate
            {
                MessageBox.Show("I was clicked!  See?");
            };
        }
    }

This will work as expected; on click, a small alert dialog will appear.  Nothing terribly special about that, right?  We could have written that as a lambda expression as well, not that it buys us anything.  It looks like this in Reflector:

Anonymous method with no closure.

We see that the compiler auto-generates a method that matches the appropriate signature.  Nothing here should be completely surprising.

Simple Example of Closure

Here is a sample class that includes closure.  The enclosed variable is sum.  You’ll note that everything just makes sense internally, right? 

    public partial class SimpleClosureExample : Form
    {
        public SimpleClosureExample()
        {
            InitializeComponent();

            int sum = 1;
            for (int i = 1; i <= 10; i++)
                sum += sum * i;

            button1.Click += delegate
            {
                MessageBox.Show("The sum was " + sum.ToString());
            };
        }
    }

So, it only makes sense that sum can be part of that anonymous function, right?  But we need to bear in mind that all C# code must be statically-compiled; we can’t just calculate sum.  Besides, what happens if the value was a parameter to the function?  Something that couldn’t be precompiled?  Well, in order to handle these scenarios, we need to think about how this will work.

In order to keep that method state alive, we need to create another object.  That’s how the state can be maintained regardless of threads and regardless of calls to the function.  We can see it as a nested class here, and the anonymous method looks just like it does in code:

Closure supporting class

A More Advanced Example

Whenever you work with a LINQ expression, chances are you’re using closure and anonymous functions (and lambda expressions) and don’t realize it.  Consider this LINQ-to-SQL query:

            int boardID = 811;
            int perPage = 20;
            int pageIndex = 0;

            var topics = (from topic in dc.Topics
                          orderby topic.IsSticky descending, topic.LastMessage.TimePosted descending
                          where topic.BoardID == boardID
                          select new
                          {
                              topic.TopicID,
                              Subject = topic.FirstMessage.Subject,
                              LatestSubject = topic.LastMessage.Subject,
                              LatestChange = topic.LastMessage.ModifiedTime,
                              NameOfUpdater = topic.LastMessage.PosterName,
                              Updater = topic.LastMessage.User,
                              Starter = topic.FirstMessage.User,
                              NameOfStarter = topic.FirstMessage.PosterName,
                              topic.ReplyCount,
                              topic.ViewCount
                          })
                            .Skip(perPage * pageIndex)
                            .Take(perPage);
            foreach (var topic in topics)
            {
                Console.WriteLine("{0} - {1} {2} {3} {4} by {5}", topic.Subject, topic.NameOfStarter, topic.ReplyCount, topic.ViewCount, topic.LatestChange, topic.NameOfUpdater);
            }

The closure here is happening within the where clause; you may recall that the C# where clause evaluates to the IEnumerable<T> extension method Where(Func<TSource, bool> predicate).

Here, it’s very easy to imagine a case where we wanted to write actual parameters.  This query is used to generate and display a topic list for a message board; all “stickied” posts should be at the top and the rest should be sorted by last time posted.  If I’m making that into a web server control, I’m going to need to not hard-code the board ID, the number of topics per page to display, and which page I’m looking at.

Now, this is kind of a hard thing to conceptualize; when I was going through this project, I expected all three variables to be incorporated into the class.  It turns out that Skip() and Take() don’t evaluate a lambda expression – they take straight values – so we don’t ultimately have to store them for evaluation later.  However, as expected, boardID does have to be stored, and that provides us with an interesting question of why.  And you might be asking why that is even the case; LINQ-to-SQL translates this into SQL for us:

SELECT TOP (20) [t0].[TopicID], [t2].[Subject], [t1].[Subject] AS [LatestSubject], [t1].[ModifiedTime] AS [LatestChange], [t1].[PosterName] AS [NameOfUpdater], [t4].[test], [t4].[UserID], [t4].[Username], [t4].[Email], [t4].[PasswordHash], [t6].[test] AS [test2], [t6].[UserID] AS [UserID2], [t6].[Username] AS [Username2], [t6].[Email] AS [Email2], [t6].[PasswordHash] AS [PasswordHash2], [t2].[PosterName] AS [NameOfStarter], [t0].[ReplyCount], [t0].[ViewCount]
FROM [dbo].[Topics] AS [t0]
LEFT OUTER JOIN [dbo].[Messages] AS [t1] ON [t1].[MessageID] = [t0].[LastMessageID]
LEFT OUTER JOIN [dbo].[Messages] AS [t2] ON [t2].[MessageID] = [t0].[FirstMessageID]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t3].[UserID], [t3].[Username], [t3].[Email], [t3].[PasswordHash]
    FROM [dbo].[Users] AS [t3]
    ) AS [t4] ON [t4].[UserID] = [t1].[UserID]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t5].[UserID], [t5].[Username], [t5].[Email], [t5].[PasswordHash]
    FROM [dbo].[Users] AS [t5]
    ) AS [t6] ON [t6].[UserID] = [t2].[UserID]
WHERE [t0].[BoardID] = @p0
ORDER BY [t0].[IsSticky] DESC, [t1].[TimePosted] DESC

So why, if we already have the SQL generated, do we need to bother with it?  Well, you may recall that LINQ-to-SQL doesn’t support all possible operators.  If we break support for the LINQ-to-SQL query and we have to pull back all of the relevant items, we’ll have to use that class.  At this point though, it goes unused.

Review

A closure is when you take the variables of a function and use them within a function declared inside of it – in C#, this is through anonymous delegates and lambda expressions.  C# typically will accomplish the use of closures by creating an implicit child class to contain the required state of the function as it executes, handing off the actual method to the contained class.

Further Reading