Running with Code Like with scissors, only more dangerous

8Apr/090

Unsung C# Hero: Closure

Posted by Rob

Today I’m going to talk about a feature of C# that has been around since 2.0 (with the introduction of anonymous delegates) but which gets nearly no lip service and, despite the fact that most C# developers have probably used it, they’ve probably used it without thinking about it.  This feature is called closure, and it refers to the ability of a nested function to make reference to the surrounding function’s variables.

This article will make extensive discussion of how delegates are implemented in C#; a review may be appropriate before diving in.  Also, we’ll be making a lot of use of The Tool Formerly Known as Lutz Roeder’s .NET Reflector, which is now owned by Red Gate Software.

Anonymous Methods without Closure

Supposing that I had a reason to do so, I could assign an event handler as an anonymous method.  I think this is generally bad practice (there is no way to explicitly dissociate the event handler, because it doesn’t have a name), but you can:

    public partial class SampleNoClosure : Form
    {
        public SampleNoClosure()
        {
            InitializeComponent();

            button1.Click += delegate
            {
                MessageBox.Show("I was clicked!  See?");
            };
        }
    }

This will work as expected; on click, a small alert dialog will appear.  Nothing terribly special about that, right?  We could have written that as a lambda expression as well, not that it buys us anything.  It looks like this in Reflector:

Anonymous method with no closure.

We see that the compiler auto-generates a method that matches the appropriate signature.  Nothing here should be completely surprising.

Simple Example of Closure

Here is a sample class that includes closure.  The enclosed variable is sum.  You’ll note that everything just makes sense internally, right? 

    public partial class SimpleClosureExample : Form
    {
        public SimpleClosureExample()
        {
            InitializeComponent();

            int sum = 1;
            for (int i = 1; i <= 10; i++)
                sum += sum * i;

            button1.Click += delegate
            {
                MessageBox.Show("The sum was " + sum.ToString());
            };
        }
    }

So, it only makes sense that sum can be part of that anonymous function, right?  But we need to bear in mind that all C# code must be statically-compiled; we can’t just calculate sum.  Besides, what happens if the value was a parameter to the function?  Something that couldn’t be precompiled?  Well, in order to handle these scenarios, we need to think about how this will work.

In order to keep that method state alive, we need to create another object.  That’s how the state can be maintained regardless of threads and regardless of calls to the function.  We can see it as a nested class here, and the anonymous method looks just like it does in code:

Closure supporting class

A More Advanced Example

Whenever you work with a LINQ expression, chances are you’re using closure and anonymous functions (and lambda expressions) and don’t realize it.  Consider this LINQ-to-SQL query:

            int boardID = 811;
            int perPage = 20;
            int pageIndex = 0;

            var topics = (from topic in dc.Topics
                          orderby topic.IsSticky descending, topic.LastMessage.TimePosted descending
                          where topic.BoardID == boardID
                          select new
                          {
                              topic.TopicID,
                              Subject = topic.FirstMessage.Subject,
                              LatestSubject = topic.LastMessage.Subject,
                              LatestChange = topic.LastMessage.ModifiedTime,
                              NameOfUpdater = topic.LastMessage.PosterName,
                              Updater = topic.LastMessage.User,
                              Starter = topic.FirstMessage.User,
                              NameOfStarter = topic.FirstMessage.PosterName,
                              topic.ReplyCount,
                              topic.ViewCount
                          })
                            .Skip(perPage * pageIndex)
                            .Take(perPage);
            foreach (var topic in topics)
            {
                Console.WriteLine("{0} - {1} {2} {3} {4} by {5}", topic.Subject, topic.NameOfStarter, topic.ReplyCount, topic.ViewCount, topic.LatestChange, topic.NameOfUpdater);
            }

The closure here is happening within the where clause; you may recall that the C# where clause evaluates to the IEnumerable<T> extension method Where(Func<TSource, bool> predicate).

Here, it’s very easy to imagine a case where we wanted to write actual parameters.  This query is used to generate and display a topic list for a message board; all “stickied” posts should be at the top and the rest should be sorted by last time posted.  If I’m making that into a web server control, I’m going to need to not hard-code the board ID, the number of topics per page to display, and which page I’m looking at.

Now, this is kind of a hard thing to conceptualize; when I was going through this project, I expected all three variables to be incorporated into the class.  It turns out that Skip() and Take() don’t evaluate a lambda expression – they take straight values – so we don’t ultimately have to store them for evaluation later.  However, as expected, boardID does have to be stored, and that provides us with an interesting question of why.  And you might be asking why that is even the case; LINQ-to-SQL translates this into SQL for us:

SELECT TOP (20) [t0].[TopicID], [t2].[Subject], [t1].[Subject] AS [LatestSubject], [t1].[ModifiedTime] AS [LatestChange], [t1].[PosterName] AS [NameOfUpdater], [t4].[test], [t4].[UserID], [t4].[Username], [t4].[Email], [t4].[PasswordHash], [t6].[test] AS [test2], [t6].[UserID] AS [UserID2], [t6].[Username] AS [Username2], [t6].[Email] AS [Email2], [t6].[PasswordHash] AS [PasswordHash2], [t2].[PosterName] AS [NameOfStarter], [t0].[ReplyCount], [t0].[ViewCount]
FROM [dbo].[Topics] AS [t0]
LEFT OUTER JOIN [dbo].[Messages] AS [t1] ON [t1].[MessageID] = [t0].[LastMessageID]
LEFT OUTER JOIN [dbo].[Messages] AS [t2] ON [t2].[MessageID] = [t0].[FirstMessageID]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t3].[UserID], [t3].[Username], [t3].[Email], [t3].[PasswordHash]
    FROM [dbo].[Users] AS [t3]
    ) AS [t4] ON [t4].[UserID] = [t1].[UserID]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t5].[UserID], [t5].[Username], [t5].[Email], [t5].[PasswordHash]
    FROM [dbo].[Users] AS [t5]
    ) AS [t6] ON [t6].[UserID] = [t2].[UserID]
WHERE [t0].[BoardID] = @p0
ORDER BY [t0].[IsSticky] DESC, [t1].[TimePosted] DESC

So why, if we already have the SQL generated, do we need to bother with it?  Well, you may recall that LINQ-to-SQL doesn’t support all possible operators.  If we break support for the LINQ-to-SQL query and we have to pull back all of the relevant items, we’ll have to use that class.  At this point though, it goes unused.

Review

A closure is when you take the variables of a function and use them within a function declared inside of it – in C#, this is through anonymous delegates and lambda expressions.  C# typically will accomplish the use of closures by creating an implicit child class to contain the required state of the function as it executes, handing off the actual method to the contained class.

Further Reading

2Apr/090

Your Own Transactions with LINQ-to-SQL

Posted by Rob

I’m working on porting an existing forum-based community from SMF to a new .NET-based forum platform that I’m authoring.  I’m excited about it; I love SMF, but it doesn’t have what I want and frankly, it’s a scary beast to try to tackle.  I’d considered using some kind of bridge between it and my code, but I knew I wanted deep integration of the forums with the new community site, and I wanted the community site in .NET.  So I made the decision to write an importer to talk between MySQL and my SQL Server-based solution.  I chose LINQ-to-SQL as my O/R mapper because, quite frankly, I find it much easier and more elegant to work with; so far as I know, I’m not the only one who thinks so.

Because of the nature of the data that I’m importing, I needed to run several SubmitChanges() calls to get the data into the database.  But I wanted to make sure that these submissions only worked if they ALL worked.  So I needed a transaction external to the normal LINQ-to-SQL in-memory object mapper.  Unfortunately, when I began a transaction using the underlying Connection property of the DataContext, I was met with an error:

System.InvalidOperationException: SqlConnection does not support parallel transactions.
   at System.Data.SqlClient.SqlInternalConnection.BeginSqlTransaction(IsolationLevel iso, String transactionName)
   at System.Data.SqlClient.SqlInternalConnection.BeginTransaction(IsolationLevel iso)
   at System.Data.SqlClient.SqlConnection.BeginDbTransaction(IsolationLevel isolationLevel)
   at System.Data.Linq.DataContext.SubmitChanges(ConflictMode failureMode)

The solution was simple: DataContext has a Transaction property!  By setting this to the transaction that I was beginning, I was able to run the complete import in a single transaction:

dc.Connection.Open();
using (DbTransaction transaction = dc.Connection.BeginTransaction(IsolationLevel.ReadCommitted))
{
    dc.Transaction = transaction;
    try
    {
        // do databasey things
        dc.SubmitChanges();

        transaction.Commit();
    }
    catch (Exception ex)
    {
        transaction.Rollback();
        Console.WriteLine("Exception caught; transaction rolled back.");
        Console.WriteLine(ex.ToString());
    }
}

It took about 2 minutes to import 37,000 or so messages, plus all users, categories, forums, private messages, and polls from SMF.  The app ends up taking something in the neighborhood of 120mb of memory (I need to keep objects around to reference them for their children, since I assign new IDs), but it’s still a small one-off price to pay.

Tagged as: , , No Comments