Running with Code Like with scissors, only more dangerous


Speedy C#, Part 2: Optimizing Memory Allocations – Pooling and Reusing Objects

In C#, Visual Basic .NET, C++/CLI, J# – the list goes on – we’re freed from having to worry about our memory management.  Objects take care of themselves, and when the CLR garbage collector detects that an object is no longer in use, it frees the associated memory.  That doesn’t mean that we should run around allocating and deallocating objects all willy-nilly; in fact, since we have less control over memory, we arguably have the opportunity to be more careful with the way we use high-frequency objects.

Memory Regions in .NET

In .NET, and generally in most programming, we can think of two places in which we can access memory: the stack and the heap.  We can think of Stack memory as temporary workspace, or scratch space; when we leave a function, all of our stack goes away.  Way, way down in the machine architecture, the stack also stores the return addresses of functions.  The stack also stores function parameters.  It’s generally very orderly, inexpensive, but its volatile nature makes it a poor candidate for long-term storage.  In .NET, all types that derive from the ValueType class are stored on the stack unless they are boxed into an object reference; this includes types defined with the struct and enum keywords, as well as all of the primitive types except string (including int, double, and bool).

Heap memory is another matter.  The heap is a region of memory reserved for the use of the program and is intended to store objects that aren’t quite so transient.  This might be something like a database connection, a file or buffer, or a window.

The Enemy: Fragmentation

Over time, objects are allocated and eventually released, and because there’s not really any rhyme or reason, the heap becomes chaotic.  Allocations grab the first free block that’s large enough (sometimes larger than necessary) and hold onto it until they go into the abyss.  This leads to fragmentation – all the free memory must be tracked somehow, and here’s the real killer: contiguous blocks of free memory may not always be recognized as such.  Check this out: let’s say we have a heap allocated at memory location 0x4000 that is 32 bytes wide:

An un-used heap

(Yes, my awesome artwork was done with none other than Excel!)

Suppose we allocate an 8-byte object and another 8-byte object, then a 16-byte object.  The first is in red, the second in orange, and the third in gray:

The heap after it was filled

Now I’ll free the first and the third objects; we’ll have 24 bytes of total free memory:

A fragmented heap

Either we need to keep track of every little piece of memory, which might be the fastest algorithm for releasing but slow for allocating (not to mention potentially VERY wasteful), or try to come up with another solution.  This type of memory fragmentation is referred to as external fragmentation.

The Garbage Collector and Compaction

The garbage collector has two components: a reference counter and a compaction engine.  The reference counter is responsible for determining when objects no longer have references to them; this frees programmers from having to explicitly destroy objects (as is the practice in C++ with the delete operator, or in C with the free function).  A lazy thread is then able to release and compact memory as needed, avoiding much of the overhead of external fragmentation and also allowing unused memory to be reclaimed.  The garbage collector in .NET is generational; it checks the newest objects first (what are called "gen-0"), and if the newest objects are still in use, they get moved to gen-1.  If the memory pressure requires, gen-1 objects are evaluated, and if they are still in use, they get moved to gen-2.  Gen-2 objects are considered long-lasting, and are only checked when memory pressure is severe.

Let’s go back to our heap example; supposing I had an 8-byte, another 8-byte, and a 12-byte allocation, here’s my heap graph:

A new heap

Object 1 (in red) has gone out of scope, but objects 2 and three are sticking around.  Using normal memory freeing rules, the largest object that could be allocated would still only be 8 bytes, because that would be the largest contiguous free space.  However, using .NET’s compacting garbage collector, we could expect something along these lines:

A compacted heap graph

Here we can see we’ve dealt with the problem of external fragmentation by compacting the heap.  This convenience doesn’t come without a cost, though; while the garbage collector performed the compaction, all of your application threads were suspended.  The GC can’t guarantee object integrity if memory is getting abused during a garbage collection!

Preventing Compaction: Stop Killing off Objects!

Object pooling is a pattern to use that allows objects to be reused rather than allocated and deallocated, which helps to prevent heap fragmentation as well as costly GC compactions.  A pool can be thought of as an object factory; in essence, the most rudimentary pool could look like this:

   1: public class Pool<T> where T : new()
   2: {
   3:     private Stack<T> _items = new Stack<T>();
   4:     private object _sync = new object(); 
   6:     public T Get()
   7:     {
   8:         lock (_sync)
   9:         {
  10:             if (_items.Count == 0)
  11:             {
  12:                 return new T();
  13:             }
  14:             else
  15:             {
  16:                 return _items.Pop();
  17:             }
  18:         }
  19:     }
  21:     public void Free(T item)
  22:     {
  23:         lock (_sync)
  24:         {
  25:             _items.Push(item);
  26:         }
  27:     }
  28: }

Here, objects are created entirely on-demand and, when freed, are stored in a stack.  The reason we want to use a Stack is the performance characteristic of adding and removing objects; operations are always performed at the end of the list, which makes it highly efficient to add or remove items.  If possible, it may be prudent to pre-create a number of objects for use throughout the lifetime of your application. 

Here’s an example: the project I’ve been discussing lately uses a pool of byte arrays to handle incoming network messages received and sent via a Socket.  When pooling is enabled, over the course of the application’s lifetime, there were 17 Gen-0 collections, 5 Gen-1 collections, and 3 Gen-2 collections; a total of 270 byte[] instances were allocated, of which 44 were eligible for pooling and were pooled.  When pooling is disabled, there were 22 Gen-0 collections, 5 Gen-1 collections, and 3 Gen-2 collections; a total of 11,660 byte[] instances were allocated, of which approximately 10,900 were eligible for pooling.  That’s a lot of memory!

Summary – When and Why

Object pooling is a powerful optimization technique, and if you’re already using factory patterns it shouldn’t be terribly foreign to you.  The .NET Framework includes the ThreadPool class as part of System.Threading.  Other objects you might consider pooling are database connections, any expensive links to unmanaged code, or anything that needs to be allocated frequently and can then be thrown away.  In my example, byte arrays are exceptionally good for this because they can be overwritten easily.

Further Reading

The "Speedy C#" Series:

Comments (2) Trackbacks (0)
  1. Hi Rob,

    I came across your article ( while searching for ways to optimize C# object references and cache eviction algorithms.

    I read your article with interest and was wondering about the Pool class. Over the course of the lifetime of the app domain, unless I’m mistaken, you’re trading time released by less GC overhead with increased memory usage (all those pointers in Pool add up).

    Imagine a standard API where a large number of threads each query the API for objects (i.e. enumerate hierarchical relations etc.). The Pool would ensure that if one thread at some point had freed an object, it’d be available for other threads to pick up. If you’ve got 25 concurrent threads working, each asking for the same objects (which would be typical in large scale website scenarios – disregarding output caching here), then each object could potentially allocate up to additional 25x space for pointers.

    What kind of frequency are we talking about here, that justifies Pool and not just a ConcurrentDictionary?

    Thanks in advance,
    Anders Borum (Denmark)

  2. Hi Anders!

    I moved your comment to the analogous post here. Let me take your comment in two parts.

    First, regarding my approach: you’re correct – in this case I’m trading memory for speed. At any given time we may not be utilizing all of the objects in the Pool. However, I’m circumventing the need to allocate and deallocate these objects, as well as avoiding the need to compact my GC heap.

    Regarding the standard API: the scenario I believe you’re proposing is one in which identical (in meaning) objects are being requested, is this correct? For example, 25 users each ask for the homepage, which causes 25 identical homepage objects to be retrieved and served up? In this scenario, it is indeed wasteful; there are obviously better ways to cache these data than by using a simple pool without knowledge of the use of the objects.

    As with any tool in a programmer’s toolbelt, you have to consider what and how you’re using the tool. In the case that I’m describing, I needed to allocate byte arrays to later read, all internally, for my API, and then to surface objects that corresponded to these byte arrays. I allocated the objects that my API produced, but since I never needed to worry about object identity with the byte arrays (use them then get rid of them), it produced a substantial performance benefit to my application.

Leave a comment

ERROR: si-captcha.php plugin says GD image support not detected in PHP!

Contact your web host and ask them why GD image support is not enabled for PHP.

ERROR: si-captcha.php plugin says imagepng function not detected in PHP!

Contact your web host and ask them why imagepng function is not enabled for PHP.

No trackbacks yet.