Running with Code Like with scissors, only more dangerous


Thoughts on Generics and Best Practices – Building Cohesive Software

Posted by Rob

In general, I don't know whether I internalized this as a best practice because it makes sense to me, or because I read it somewhere.  But I have this notion that there are only certain places in which it is appropriate to expose a generic member to the outside world, such as when the item is clearly generic in scope.  For example, if I wanted to implement a FIFO list to get the behavior of a queue but the performance characteristics of a list, I might call my class ListQueue<T>.  As a library class, it makes sense to have this as a generic type.  But what about when it would save programmer effort only?  I don't think that this is a valid reason to expose the class to the outside world.

Between my last two projects, I have about a dozen classes that look like this:

   1:      [ConfigurationCollection(typeof(SerialPortConfigurationElement), AddItemName = "SerialPort", CollectionType = ConfigurationElementCollectionType.BasicMap)]
   2:      public class SerialPortConfigurationElementCollection : ConfigurationElementCollection, IEnumerable<SerialPortConfigurationElement>
   3:      {
   4:          protected override ConfigurationElement CreateNewElement()
   5:          {
   6:              return new SerialPortConfigurationElement();
   7:          }
   9:          protected override object GetElementKey(ConfigurationElement element)
  10:          {
  11:              return ((SerialPortConfigurationElement)element).PortName; 
  12:          }
  14:          public override ConfigurationElementCollectionType CollectionType
  15:          {
  16:              get
  17:              {
  18:                  return ConfigurationElementCollectionType.BasicMap;
  19:              }
  20:          }
  22:          public SerialPortConfigurationElement this[int index]
  23:          {
  24:              get { return BaseGet(index) as SerialPortConfigurationElement; }
  25:          }
  27:          public SerialPortConfigurationElement this[string index]
  28:          {
  29:              get { return BaseGet(index) as SerialPortConfigurationElement; }
  30:          }
  32:          #region IEnumerable<SerialPortConfigurationElement> Members
  34:          public new IEnumerator<SerialPortConfigurationElement> GetEnumerator()
  35:          {
  36:              return new LightWrapperEnumerator<SerialPortConfigurationElement>(base.GetEnumerator());
  37:          }
  39:          #endregion
  40:      }

This class enables the following type of configuration settings:

   1:      <SerialPorts>
   2:          <SerialPort PortName="COM6" BaudRate="9600" DataBits="8" Parity="None" StopBits="One" />
   3:          <SerialPort PortName="COM8" /> <!-- Only the PortName is required.  The others have defaults. -->
   4:      </SerialPorts>

This is all well and good.  However, as I look at my class, I see that it would be fairly easy to rip out a lot of that functionality and make it a generic class.  LightWrapperEnumerator<T> is already something like that - it's an internal-only class that basically wraps an IEnumerator into an IEnumerator<T>.  That's fairly easy to hide from the outside world, though; I only expose it to calling code using IEnumerator<T> and nobody is the wiser unless they call GetType() on it, which is unlikely.  I realized as I built the second duplicate of this class today involving a Motor configuration element, that I could rip out the guts and implement everything EXCEPT GetElementKey and have it out there:

   1:  abstract class ChildlessConfigurationElementCollection<T> : ConfigurationElementCollection 
   2:      where T : ConfigurationElement, new()
   3:  {
   4:      protected override ConfigurationElement CreateNewElement()
   5:      {
   6:          return new T();
   7:      }
   9:      public override ConfigurationElementCollectionType CollectionType
  10:      {
  11:          get
  12:          {
  13:              return ConfigurationElementCollectionType.BasicMap;
  14:          }
  15:      }
  17:      public T this[int index]
  18:      {
  19:          get { return BaseGet(index) as T; }
  20:      }
  22:      public T this[string index]
  23:      {
  24:          get { return BaseGet(index) as T; }
  25:      }
  27:      #region IEnumerable<SerialPortConfigurationElement> Members
  29:      public new IEnumerator<T> GetEnumerator()
  30:      {
  31:          return new LightWrapperEnumerator<T>(base.GetEnumerator());
  32:      }
  34:      #endregion
  35:  }

Implementing the serial port collection then, would be VERY easy:

   1:  [ConfigurationCollection(typeof(SerialPortConfigurationElement), AddItemName = "SerialPort", CollectionType = ConfigurationElementCollectionType.BasicMap)]
   2:  class SerialPortConfigurationElementCollection : ChildlessConfigurationElementCollection<SerialPortConfigurationElement>
   3:  {
   4:      protected override object GetElementKey(ConfigurationElement element)
   5:      {
   6:          return ((SerialPortConfigurationElement)element).PortName;
   7:      }
   8:  }

However, aside from the fact that they have the same "programmer functionality," there's not much motivating me to do anything else.

Hopefully, as you're looking at these two examples, you're saying "that doesn't look correct."  From an object model perspective, it doesn't make much sense, even in the abstract world (away from objects), to have even a concept of a "Childless Configuration Element Collection."  What does it mean, anyway?  Would it make sense for me to cast a configuration property object to a ChildlessConfigurationElementCollection of T?  I... don't think so.

I've been reading Code Complete 2nd Ed. lately.  I can't remember whether I've blogged about my respect for Steve McConnell before, but this guy is simply an amazing contributor to our craft.  In it, he talks at length about a software quality attribute that we'd never really discussed in school: cohesion.

The Wikipedia article gives it a great treatment, so I won't really delve into the depths of what cohesion means and how it can be measured (buy the book!).  As I look at this type of code, though, I can only see some kind of cross between procedural and logical cohesion - not exactly excellent in the scale.  The real bottom line is that these all operate on different data - how are we to treat them the same?

Incidentally, the way that C++ implements templates (not C++/CLI generics) would make this not be so terrible.  I could define this template class and it would actually generate the C++ code (Java actually would do this in an adequate way, as well).  Because the code is generated at compile-time, I don't need to expose the metadata about the template class, and so I'm saved in that regard.  Unfortunately, if I were to define this class in C# and need to access it in an external assembly, I'd need to mark the base class as public, and so everything else in the world can inherit from or create references to variables of type ChildlessConfigurationElementCollection<T>.  It doesn't make sense to have this type exposed; it only makes sense at the higher level. 

So I am left with the decision: do I duplicate code or expose bad, incomplete objects to the world?  I'll duplicate code, thank you very much.