Running with Code Like with scissors, only more dangerous

1 Aug 2008

My C# 4.0 Wishlist Part 6: Automatic Properties for Enum Variables

Posted by Rob

OK, so I lied; I'm not stopping at 5 parts.

I've been working with enumerations a lot lately. The Battle.net chat protocol is binary, so the values that come over the wire take on different meanings depending on the values that preceded them.  For example, a chat message event can have about a dozen meanings: it can be a server-broadcasted message, a message from another user, or just an announcement that a user joined the channel.  In addition to the standard values identifying things like message type, messages typically carry flags of one form or another. If the event is based on a user, the flags describe the user's status on the server (whether the user is an administrator or has operator privileges in the channel).  Other events, such as channel information updates, describe the chat channel itself, such as whether it is public, silent, or otherwise normal.

The Problem

Dealing with enumerations this often has made me hate code like this:

    if (((UserFlags)e.Flags & UserFlags.ChannelOperator) == UserFlags.ChannelOperator)

This becomes annoying quickly, especially when working with bitwise values (enumerations decorated with the [Flags] attribute), because of the operator precedence rules that C# imposes on the developer.  So much so that classes where I have to do this frequently end up with several TestFlag() methods, but even these are limited.  Consider code like this:

    bool TestFlag(UserFlags test, UserFlags reference) { ... }
    bool TestFlag(ChannelFlags test, ChannelFlags reference) { ... }

Or this:

    bool TestFlag<T>(T test, T reference) {
        // hard to implement since no meaningful type constraint can be placed on T
    }

Or this:

    bool TestFlag(int test, int reference) { ... }

Proposition 1 means implementing n methods, either repeated in each class or collected in a globally defined, internal utility class; that stinks.  Proposition 2 is difficult to implement.  Because every enum is backed by an integral type, an enum type constraint would be ideal here, but C# doesn't allow one; the closest available constraint is struct, which doesn't guarantee operator | or operator &.  In proposition 3, every test requires a cast to int (or long) and loses type information.  I guess that works, but then you worry that you end up with code like this:

    if (TestFlag((int)e.User.Flags, (int)UserFlags.ServerAdministrator))
    {
        // ...
    }
    else if (TestFlag((int)e.User.Flags, (int)UserFlags.ChannelOperator))
    {
        // ...
    } // ...
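
For what it's worth, proposition 2 can be approximated if we accept a run-time check and boxing in place of a real enum constraint. The following is only a sketch: the FlagHelper class name and the UserFlags values are made up for illustration, and it assumes the flag values are non-negative so that they fit in a ulong.

```csharp
using System;

// Illustrative values only; the real Battle.net flag values are not given here.
[Flags]
enum UserFlags
{
    None = 0,
    ChannelOperator = 0x02,
    ServerAdministrator = 0x08
}

static class FlagHelper
{
    // struct is the closest constraint available, so the "is it an enum?"
    // check has to happen at run time instead of at compile time.
    public static bool TestFlag<T>(T test, T reference) where T : struct
    {
        if (!typeof(T).IsEnum)
            throw new ArgumentException("T must be an enumeration type.");

        // Boxing through Convert is the price of not having operator & on T.
        ulong testBits = Convert.ToUInt64(test);
        ulong referenceBits = Convert.ToUInt64(reference);
        return (testBits & referenceBits) == referenceBits;
    }
}
```

A call then reads FlagHelper.TestFlag(e.User.Flags, UserFlags.ChannelOperator), which keeps the type information but pays for two boxing operations on every test.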

No, there's a cleaner solution, and, like the compiler features added to C# 3.0, it doesn't require a new CLR: automatic properties on enumerations.

The Solution

Internally, the CLR treats an enumeration as its underlying numeric type; the variable itself carries around type information, but it's not strong and can be changed by direct casting.  The compiler, however, always knows the static type of a variable and can apply it directly.  So, consider exposing a property of the form IsEnumField on an enumeration variable, one per field.  Given a [Flags] enumeration such as UserFlags, look at the code that uses it in this style:

    if (e.User.Flags.IsNone)
    {   }
    else if (e.User.Flags.IsBlizzardRepresentative || e.User.Flags.IsBattleNetAdministrator)
    {   }
    else if (e.User.Flags.IsChannelOperator)
    {   }
    else if (e.User.Flags.IsNoUDP)
    {   }

We can easily identify the pattern that the compiler supports; prefix "Is" to the field name and perform the underlying logic.

The great part about this solution is that the emitted code is exactly the same as what you or I would write by hand right now.  Through its clever compiler tricks, the compiler can expand the properties to this:

    if (e.User.Flags == UserFlags.None) {   }
    else if ((e.User.Flags & UserFlags.BlizzardRepresentative) == UserFlags.BlizzardRepresentative
           || (e.User.Flags & UserFlags.BattleNetAdministrator) == UserFlags.BattleNetAdministrator) {   }
    else if ((e.User.Flags & UserFlags.ChannelOperator) == UserFlags.ChannelOperator) {   }
    else if ((e.User.Flags & UserFlags.NoUDP) == UserFlags.NoUDP) {   }

In that example, I qualified the type name UserFlags nine times.  Can you say "carpal tunnel"?
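
Until something like this exists, C# 3.0 extension methods get part of the way there, at the cost of parentheses and hand-written boilerplate for every field. A minimal sketch; the UserFlags values and the UserFlagsExtensions class are assumptions for illustration, not the real protocol constants:

```csharp
using System;

// Illustrative values only.
[Flags]
enum UserFlags
{
    None = 0,
    BlizzardRepresentative = 0x01,
    ChannelOperator = 0x02,
    BattleNetAdministrator = 0x08,
    NoUDP = 0x10
}

static class UserFlagsExtensions
{
    // The zero field compares for equality, as in the expansion above.
    public static bool IsNone(this UserFlags flags)
    {
        return flags == UserFlags.None;
    }

    // Every other field masks and compares.
    public static bool IsChannelOperator(this UserFlags flags)
    {
        return (flags & UserFlags.ChannelOperator) == UserFlags.ChannelOperator;
    }

    public static bool IsNoUDP(this UserFlags flags)
    {
        return (flags & UserFlags.NoUDP) == UserFlags.NoUDP;
    }
}
```

The call site becomes e.User.Flags.IsChannelOperator(), which is close to the proposed syntax, but someone still has to write and maintain one method per field per enumeration.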

Future-Proofing

There are some considerations to make about this.  First, there are already going to be some enumerations in the wild with field names that begin with "Is," and it could very easily raise confusion if someone sees code such as user.Flags.IsIsOnline.  Fortunately, the solution is equally simple: create a decorator attribute, just like we did for extension methods:

    namespace System
    {
        [AttributeUsage(AttributeTargets.Enum)]
        public sealed class EnumPropertiesAttribute : Attribute { }
    }

Then, when you create an enumeration that should expose this style of properties, simply decorate the enumeration with this attribute.  IntelliSense knows to show the properties, the compiler knows to translate the properties, and we're free and clear.
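
To make the opt-in concrete, here is a sketch of a decorated enumeration. Nothing here is real framework API: the attribute is the hypothetical one proposed above, declared in a made-up Wishlist namespace so the sample compiles on its own, and the ChannelFlags values are invented. The reflection check only stands in for the lookup a compiler or IntelliSense would perform.

```csharp
using System;

namespace Wishlist
{
    // Hypothetical attribute from the proposal; not part of the framework.
    [AttributeUsage(AttributeTargets.Enum)]
    public sealed class EnumPropertiesAttribute : Attribute { }

    // Illustrative enumeration that opts in to the IsXxx properties.
    [Flags]
    [EnumProperties]
    public enum ChannelFlags
    {
        None = 0,
        Public = 0x01,
        Silent = 0x02
    }

    public static class EnumPropertiesCheck
    {
        // True only for enumerations that carry the opt-in attribute.
        public static bool OptsIn(Type enumType)
        {
            return enumType.IsEnum
                && Attribute.IsDefined(enumType, typeof(EnumPropertiesAttribute));
        }
    }
}
```

Existing enumerations, including ones with fields that already start with "Is", are left alone because they never carry the attribute.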

Wouldn't it be great?


27 Jan 2008

My C# 4.0 Wishlist, Part 5 : The raise Keyword

Posted by Rob

One of the more obscure features of C# is the ability to specify custom overloads for adding and removing event registration similarly to properties, via the add and remove keywords.  Known as "event accessors," they implement the parts of event registration that the C# compiler normally handles.  You didn't think that that += operator was implemented on the type, did you?

    class Test
    {
        public event EventHandler Event1;

        private EventHandler ev2;
        public event EventHandler Event2
        {
            add
            {
                if (ev2 != null)
                    ev2 = (EventHandler)Delegate.Combine(ev2, value);
                else
                    ev2 = value;
            }
            remove
            {
                if (ev2 != null)
                    ev2 = (EventHandler)Delegate.Remove(ev2, value);
            }
        }
        protected virtual void OnEvent2(EventArgs e)
        {
            if (ev2 != null)
                ev2(this, e);
        }
    }

This pattern is actually used extensively throughout the Windows Forms library, where controls add event handlers to base event handler collections implemented within a hashtable.  I can only surmise that this is done to prevent having dozens of event fields cluttering up the classes.
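
The Windows Forms approach can be sketched with System.ComponentModel.EventHandlerList, which stores handlers per key so that a class needs one list rather than one delegate field per event. The Widget class, its Click event, and the PerformClick method are invented for illustration:

```csharp
using System;
using System.ComponentModel;

class Widget
{
    // One key object per event; the key identifies the event's slot in the list.
    private static readonly object EventClick = new object();

    // A single list replaces what would otherwise be one field per event.
    private readonly EventHandlerList events = new EventHandlerList();

    public event EventHandler Click
    {
        add { events.AddHandler(EventClick, value); }
        remove { events.RemoveHandler(EventClick, value); }
    }

    protected virtual void OnClick(EventArgs e)
    {
        // The indexer returns null when nothing is registered for the key.
        EventHandler handler = (EventHandler)events[EventClick];
        if (handler != null)
            handler(this, e);
    }

    public void PerformClick()
    {
        OnClick(EventArgs.Empty);
    }
}
```

An event that nobody subscribes to costs no per-instance storage beyond the shared list, which fits the guess about avoiding dozens of event fields.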

Now, if we were to compile this app and disassemble it in Reflector, we'd get a very similar picture to what we've got.  Reflector would show the compiler-generated add/remove blocks for Event1, though not when the event declaration is selected, and it also indicates that there are compiler directives that show the event accessors are synchronized.

Visual Basic .NET also supports this pattern, but adds one more keyword, RaiseEvent:

    Public Class Test
        Public Event Event1 As EventHandler

        Private ev2 As EventHandler
        Public Custom Event Event2 As EventHandler
            AddHandler(ByVal value As EventHandler)
                If Not ev2 Is Nothing Then
                    ev2 = CType(System.Delegate.Combine(ev2, value), EventHandler)
                Else
                    ev2 = value
                End If
            End AddHandler

            RemoveHandler(ByVal value As EventHandler)
                If Not ev2 Is Nothing Then
                    ev2 = CType(System.Delegate.Remove(ev2, value), EventHandler)
                End If
            End RemoveHandler

            RaiseEvent(ByVal sender As Object, ByVal e As System.EventArgs)
                ev2(sender, e)
            End RaiseEvent
        End Event

        Protected Overridable Sub OnEvent2(ByVal e As EventArgs)
            If Not ev2 Is Nothing Then
                RaiseEvent Event2(Me, e)
            End If
        End Sub
    End Class

In this example, Visual Basic allows you to implement exactly how Event2 is raised.  When I look at this in Reflector to see how C# uses this, here's what I see:

[Image: Reflector view of custom VB event]

Reflector gives C# the raise keyword.  Why haven't the C# language experts done so?

How would this be worthwhile?  Well, suppose that we're building an application that can have plugins.  We don't know that plugins are always going to work correctly, so when they handle an event, they may raise an exception.  The problem is, if an event is invoked and the first event handler causes an exception, none of the successive handlers will be invoked.

Arguably, the "state of the application is undefined after an exception is raised, so we should gracefully exit."  But that's not always the case!  What if the way to gracefully do this is to analyze the stack trace within the application, determine which plugin caused the exception, and unload the plugin?  C# gives us no place in the event declaration itself to do any of this.
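
The closest approximation available is to do the work by hand in the method that raises the event, outside the event declaration. Here is a sketch of what the body of a raise accessor might contain, written as an ordinary method. PluginHost, MessageReceived, and RaiseMessageReceived are invented names, and deciding which plugin to unload is left out:

```csharp
using System;
using System.Collections.Generic;

class PluginHost
{
    private EventHandler messageReceived;

    public event EventHandler MessageReceived
    {
        add { messageReceived += value; }
        remove { messageReceived -= value; }
    }

    // Invokes every handler even if an earlier one throws, and returns
    // the handlers that failed so the host can decide what to unload.
    public List<Delegate> RaiseMessageReceived(EventArgs e)
    {
        List<Delegate> faulted = new List<Delegate>();
        EventHandler handlers = messageReceived;
        if (handlers == null)
            return faulted;

        foreach (EventHandler handler in handlers.GetInvocationList())
        {
            try
            {
                handler(this, e);
            }
            catch (Exception)
            {
                // handler.Target and handler.Method identify the offending code.
                faulted.Add(handler);
            }
        }
        return faulted;
    }
}
```

With a raise accessor, this logic would live next to add and remove, and every place that raises the event would get it automatically.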

Give us the raise keyword!

This is the end of my "C# 4.0 Wishlist" series.  For reference, here are the other articles:

25 Jan 2008

My C# 4.0 Wishlist, Part 4 : Constant typeof() Expressions

Posted by Rob

Along with some of the hacks I introduced into ShinyDesign, there was a problem using a generic parameter as an enum - I couldn't cast it back to an integral type, even System.UInt64, because T was not guaranteed to be an integral value (yet again why we should allow a type constraint, but I digress).

In any case, there have been cases where I'd like to, for instance, switch against a Type, particularly since incorporating generics.  Consider:

    switch (Enum.GetUnderlyingType(typeof(T)))
    {
        case typeof(byte):
        case typeof(sbyte):
            break;
        case typeof(short):
        case typeof(ushort):
            break;
        case typeof(int):
        case typeof(uint):
            break;
        case typeof(long):
        case typeof(ulong):
            break;
    }

This is MUCH cleaner than the alternative, current implementation:

    Type t = Enum.GetUnderlyingType(typeof(T));
    if (t == typeof(byte) || t == typeof(sbyte))
    { }
    else if (t == typeof(short) || t == typeof(ushort))
    { }
    else if (t == typeof(int) || t == typeof(uint))
    { }
    else if (t == typeof(long) || t == typeof(ulong))
    { }

So this is a working example of how the syntax would be cleaner if we could use the result of a typeof expression as a constant value.  In case you've never tried it: the compiler complains.  Given this code:

 155:  switch (t)
 156:  {
 157:      case typeof(int):
 158:      case typeof(uint):
 159:          break;
 160:  }

I get:

EnumTypeConverter.cs(155,21): error CS0151: A value of an integral type expected

EnumTypeConverter.cs(157,22): error CS0150: A constant value is expected

EnumTypeConverter.cs(158,22): error CS0150: A constant value is expected

I'm sure you've switched over a string, though - it's one of the nice syntactical features of C#.  You might be wondering: if switching over a string is possible, why not a Type?

Switching on a string doesn't really switch on a string - the compiler shoots the strings into a Dictionary<string, int>, stores the offsets, and then uses a jump table with the IL switch instruction.
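
Written out by hand, that lowering looks roughly like the following. This is a sketch of the idea, not the compiler's actual output, and CommandParser and its command strings are invented:

```csharp
using System;
using System.Collections.Generic;

static class CommandParser
{
    // Built once; maps each case label to a small integer.
    private static readonly Dictionary<string, int> caseIndex = CreateIndex();

    private static Dictionary<string, int> CreateIndex()
    {
        Dictionary<string, int> index = new Dictionary<string, int>();
        index.Add("join", 0);
        index.Add("leave", 1);
        index.Add("say", 2);
        return index;
    }

    public static string Describe(string command)
    {
        int index;
        if (command != null && caseIndex.TryGetValue(command, out index))
        {
            // An integer switch over dense values is what can become a jump table.
            switch (index)
            {
                case 0: return "joining";
                case 1: return "leaving";
                case 2: return "talking";
            }
        }
        return "unknown";
    }
}
```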

Yeah, obviously there's a lot of opportunity to misuse the typeof expressions.  But there are going to be legit uses, too, and honestly - if C# can have a compiler trick for strings, it can have a compiler trick for types.  And let's be honest - typeof() expressions aren't ever going to return different values for the same app (that's why people were locking types to synchronize across an AppDomain). 

This - like the inability to constrain a type parameter to an enum - is an artificial constraint that really shouldn't be there.
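
The same trick can be written by hand for types: a Dictionary<Type, int> lookup feeding an integer switch. The sketch below shows what a compiler could plausibly generate for a switch over typeof() labels; UnderlyingSize and its Of method are invented names.

```csharp
using System;
using System.Collections.Generic;

static class UnderlyingSize
{
    // Each group of case labels shares one index, like stacked case statements.
    private static readonly Dictionary<Type, int> typeIndex = CreateIndex();

    private static Dictionary<Type, int> CreateIndex()
    {
        Dictionary<Type, int> index = new Dictionary<Type, int>();
        index.Add(typeof(byte), 0);
        index.Add(typeof(sbyte), 0);
        index.Add(typeof(short), 1);
        index.Add(typeof(ushort), 1);
        index.Add(typeof(int), 2);
        index.Add(typeof(uint), 2);
        index.Add(typeof(long), 3);
        index.Add(typeof(ulong), 3);
        return index;
    }

    // Returns the size in bytes of an enumeration's underlying type.
    public static int Of(Type enumType)
    {
        Type underlying = Enum.GetUnderlyingType(enumType);
        int index;
        if (typeIndex.TryGetValue(underlying, out index))
        {
            switch (index)
            {
                case 0: return 1;
                case 1: return 2;
                case 2: return 4;
                case 3: return 8;
            }
        }
        throw new NotSupportedException("Unexpected underlying type: " + underlying);
    }
}
```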

17 Jan 2008

My C# 4.0 Wishlist, Part 3: The Return of Const-ness

Posted by Rob

In C++, I can decorate member functions with the const modifier, which indicates that calling the member function will not modify the internal state of the object.  Here's a sample class definition:

Test.h:

    class CTest
    {
    private:
        int m_nVal;

    public:
        CTest(void);
        ~CTest(void);
        int GetValue() const;
        void SetValue(int value);
        int Add(int value) const;
    };

Test.cx:

    #include "Test.h"

    CTest::CTest(void)
    {
    }

    CTest::~CTest(void)
    {
    }

    int CTest::Add(int value) const
    {
        return value + m_nVal;
    }

    int CTest::GetValue() const
    {
        return m_nVal;
    }

    void CTest::SetValue(int value)
    {
        m_nVal = value;
    }

This example demonstrates wrapping an integer value, and shows how GetValue() and Add() can be const by not modifying any internal values.  Now, if I change the Add method to a void type, and add the value to the internal state, I get a compiler error.  Here's the updated method:

    void CTest::Add(int value) const
    {
        return SetValue(value + m_nVal);
    }

Error:

error C2662: 'CTest::SetValue' : cannot convert 'this' pointer from 'const CTest' to 'CTest &'

I get a similar error (about lvalue type casting) if I just set the value within the Add method.

So how should this apply in C#?  Realistically, I think I'd like it to just apply to member functions and properties.  There are a lot of ways to use const in C and C++ - it's almost scary, actually (could you imagine using one parameter and having three const modifiers?).  In C#, I'd just like it to be part of the method contract:

    public class Class1
    {
        private string m_firstName, m_lastName;
        private int m_val;

        public int Value
        {
            get const
            {
                return m_val;
            }
            set
            {
                m_val = value;
            }
        }

        public string GetName() const
        {
            return string.Format("{0}, {1}", m_lastName, m_firstName);
        }
    }

In both of these examples, we can tell that the internal state of the object itself isn't modified (note that the const modifier only applies to the get method of the Value property).  It provides the user of the class additional information, and it helps to enforce the contract on the side of the class author.

Implementation in the compiler: add a System.Runtime.CompilerServices.ConstMethodAttribute and apply it to the methods as marked.  Add a static code analysis rule that checks to see if a method could be marked as const, and if so, flag a warning.
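
As a rough sketch of the metadata side, the attribute below is a hypothetical stand-in for the proposed System.Runtime.CompilerServices.ConstMethodAttribute, declared in a made-up Wishlist namespace so the sample compiles on its own. Nothing here enforces const-ness; it only shows how a compiler or analysis tool could record and discover the marking. Person and ConstInspector are invented for illustration.

```csharp
using System;
using System.Reflection;

namespace Wishlist
{
    // Hypothetical attribute; the proposal would have the compiler emit it
    // for methods declared with the const modifier.
    [AttributeUsage(AttributeTargets.Method)]
    public sealed class ConstMethodAttribute : Attribute { }

    public class Person
    {
        private string firstName, lastName;

        public Person(string firstName, string lastName)
        {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        // Reads state only, so it is marked as const.
        [ConstMethod]
        public string GetName()
        {
            return string.Format("{0}, {1}", lastName, firstName);
        }

        // Mutates state, so it is left unmarked.
        public void Rename(string newFirstName, string newLastName)
        {
            firstName = newFirstName;
            lastName = newLastName;
        }
    }

    public static class ConstInspector
    {
        // What a tool would ask before calling a method it must not let mutate state.
        public static bool IsConst(MethodInfo method)
        {
            return Attribute.IsDefined(method, typeof(ConstMethodAttribute));
        }
    }
}
```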

I don't know that there are compiler optimizations that can be made, but one way or another, I think that it's a good method with which to give additional information about method implementations.  Sometimes we don't want to call properties or methods if we know that it can cause side effects, because let's be honest: the base class library's documentation isn't always 100% clear.  That's why we need tools like .NET Reflector.  One more tool to help our code be self-documenting is one more good thing.

15 Jan 2008

My C# 4.0 Wishlist, Part 1 : Eliminate Type Constraint Constraints

Posted by Rob

C# 2.0 introduced a great new feature to the .NET Type system: generics.  Generics are really cool in that they allow you to define template classes; I can use a single class definition to provide a strongly typed collection, for example.  They enable some other tricks that I would tend to consider something of a "hack" as well; for example, this expression evaluates to true:

    typeof(IEnumerable<int>) != typeof(IEnumerable<double>)

This expression is nice because there are some odd class design decisions in places like the PropertyGrid's type structure.  For example, in order to add a PropertyTab to the list of the PropertyGrid's tabs, you need to add a Type to the PropertyTabCollection exposed by the PropertyGrid's PropertyTabs property.  The PropertyGrid caches the tabs that it creates, and so you can't add a single Type to create two property tabs.  Consequently, even if you override the CreateTab method, you can't expect to add two tabs with the same Type.

My solution, then, was to create ExtensionPropertyTab<T>.  This class's type parameter is utterly useless; I create an arbitrary Type using Reflection Emit, close my ExtensionPropertyTab generic type definition with it, and then add the PropertyTab with that closed type.  Works great!  This stuff will be in an upcoming blog post about my PropertyGridEx project.

All of that is leading up to my next hack and, ultimately, my wishlist item #1 for C# 4.0.

There's a simple design-time class called EnumConverter.  This class is the default type converter for all enumeration types; EnumConverter is what displays the Enum names in the property grid when you're choosing items.  I'm creating a type surrogate class that allows you to customize the names of properties, and I've also been working on displaying better values for enumerations.  To this end, I created the EnumTypeConverter<T> class - this class provides enumeration names, but also retrieves friendly names from attributes on each enum entry.  Using generics, I'm able to cache the friendly names so that reflection only needs to be invoked once; System.Enum does something similar.
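
The caching idea can be sketched with a generic static class, since each closed generic type gets its own static fields. This is not the actual EnumTypeConverter<T> code: FriendlyNames<T> and ChannelKind are invented, and DescriptionAttribute is assumed as the attribute that carries the friendly name.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Reflection;

// Illustrative enumeration with friendly names attached to some entries.
enum ChannelKind
{
    [Description("Public channel")]
    Public,
    [Description("Silent channel")]
    Silent,
    Restricted // no attribute: falls back to the field name
}

static class FriendlyNames<T> where T : struct
{
    // One cache per closed type, filled on first use, so reflection runs once.
    private static readonly Dictionary<T, string> names = Build();

    private static Dictionary<T, string> Build()
    {
        if (!typeof(T).IsEnum)
            throw new InvalidOperationException("T must be an enumeration type.");

        Dictionary<T, string> map = new Dictionary<T, string>();
        foreach (FieldInfo field in typeof(T).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            DescriptionAttribute attribute = (DescriptionAttribute)
                Attribute.GetCustomAttribute(field, typeof(DescriptionAttribute));
            T value = (T)Enum.Parse(typeof(T), field.Name);
            map[value] = attribute != null ? attribute.Description : field.Name;
        }
        return map;
    }

    public static string Get(T value)
    {
        string name;
        return names.TryGetValue(value, out name) ? name : value.ToString();
    }
}
```

Note the run-time IsEnum check: it is exactly the kind of code an enum type constraint would make unnecessary.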

What I'd like to do, however, is say this:

    public class EnumTypeConverter<T> : TypeConverter where T : enum

C# doesn't allow me to do this.  I get two errors:

error CS1031: Type expected

error CS1001: Identifier expected

So I try using the type name instead:

    public class EnumTypeConverter<T> : TypeConverter where T : Enum

C# doesn't like this either:

error CS0702: Constraint cannot be special class 'System.Enum'

Why?  I get the same error with System.ValueType, even though ultimately it means the same thing as "struct" (though I could understand this difference).  But I can't do this with System.Delegate either (how about calling Invoke() or BeginInvoke() on T?).

Being able to specify Enum as a base would allow me to:

  • Explicitly cast between T and integral numeric types.
  • Specify 0 as the default value of T rather than using default(T).
  • Perform bitwise operations on values of T.

There's really no reason to have a constraint like "Constraint cannot be special class 'System.Enum'."  Let's eliminate this artificial barrier - it shouldn't require any changes to the CLR.
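
For comparison, here is roughly what those operations cost without the constraint: a struct constraint, a run-time check, and conversions through object. The EnumBits<T> helper is invented for illustration and assumes non-negative values.

```csharp
using System;

static class EnumBits<T> where T : struct
{
    static EnumBits()
    {
        // The check an enum constraint would move to compile time.
        if (!typeof(T).IsEnum)
            throw new InvalidOperationException("T must be an enumeration type.");
    }

    // Stands in for an explicit cast from an integral value to T.
    public static T FromUInt64(ulong value)
    {
        return (T)Enum.ToObject(typeof(T), value);
    }

    // Stands in for an explicit cast from T to an integral value (boxes).
    public static ulong ToUInt64(T value)
    {
        return Convert.ToUInt64(value);
    }

    // Stands in for operator | on T.
    public static T Or(T left, T right)
    {
        return FromUInt64(ToUInt64(left) | ToUInt64(right));
    }
}
```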

15 Jan 2008

My C# 4.0 Wishlist, Part 2 : Default/Optional Parameters

Posted by Rob

When I was first getting into C# (around .NET 1.0 Beta 2), I saw that it didn't support optional parameters.  The explanation was simple enough: method overloads offered an alternative to default or optional parameters.  I thought that it was probably a useful choice.  But check this out:

    public static class MessageBox
    {
        static void Show(string message)
        {
            Show(message, null, MessageBoxButtons.OK, MessageBoxIcon.Information);
        }

        static void Show(string message, string title)
        {
            Show(message, title, MessageBoxButtons.OK, MessageBoxIcon.Information);
        }

        static void Show(string message, string title, MessageBoxButtons buttons, MessageBoxIcon icon)
        {
            // actually perform the showing
        }

        // and more overloads
    }

This is pretty lame, isn't it?  Why can't I just do everything with a single method?

    public static class MessageBox
    {
        static void Show(string message, string title = "", MessageBoxButtons buttons = MessageBoxButtons.OK,
            MessageBoxIcon icon = MessageBoxIcon.Information)
        {
            // actually perform the showing
        }
    }

So, the question is, how precisely could this work?  Well, let's take a look at how this works in Visual Basic.

    Public Class MessageBox
        Public Shared Sub Show(ByVal message As String, Optional ByVal title As String = "", _
                               Optional ByVal buttons As MessageBoxButtons = MessageBoxButtons.OK, _
                               Optional ByVal icon As MessageBoxIcon = MessageBoxIcon.Information)
        End Sub
    End Class

Visual Basic turns this into a method and annotates the parameter list with attributes.  In C# we could express it like this:

    public static class MessageBox
    {
        static void Show(string message,
            [Optional]
            [DefaultParameterValue("")]
            string title,
            [Optional]
            [DefaultParameterValue(MessageBoxButtons.OK)]
            MessageBoxButtons buttons,
            [Optional]
            [DefaultParameterValue(MessageBoxIcon.Information)]
            MessageBoxIcon icon)
        {

        }
    }

In languages that support optional parameters, the compiler provides parameters, so that a call that leaves off optional parameters looks (in IL) like a call that included the parameters.
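
The annotation is visible through reflection, which is how a compiler or tool consuming the library would discover it. A small sketch, using a made-up Notifier class rather than the real MessageBox so that it stands alone; OptionalInspector is likewise invented:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.InteropServices;

static class Notifier
{
    // title is annotated the same way the Visual Basic compiler annotates
    // an Optional parameter.
    public static string Show(string message,
        [Optional, DefaultParameterValue("Notice")] string title)
    {
        return title + ": " + message;
    }
}

static class OptionalInspector
{
    // Lists the parameters a caller is allowed to leave off.
    public static List<string> OptionalParameterNames(MethodInfo method)
    {
        List<string> names = new List<string>();
        foreach (ParameterInfo parameter in method.GetParameters())
        {
            if (parameter.IsOptional)
                names.Add(parameter.Name);
        }
        return names;
    }
}
```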

I suggest that we use a compiler trick - dump the attributes and actually implement the overloads.  This has the awesome benefit of being entirely compiler-dependent and entirely backwards-compatible even to .NET 2.0.  We can even include one overload with the attributes included, so that development tools and compilers that use the attributes can tell the user about the optional parameter information, and the existing compilers can compile against a library using the new methods.

Consider this overload based on the above demonstrated Show method.

    public static class MessageBox
    {
        static void Show(string message, MessageBoxIcon icon)
        {
            Show(message, "", MessageBoxButtons.OK, icon);
        }
    }

This function has the advantage of being inline-able, even if it means a slight (and I mean ever-so-slight) hit to file size.  I'm not saying it'd be good or effective to have 10 optional parameters - just that it wouldn't be bad to have a few.

Finally, my suggestion for the syntax of how this all should work out - use the default keyword for each item when you want to specify options:

    void Go()
    {
        MessageBox.Show("This is a test.", default, MessageBoxButtons.OK);
    }

In the CLI implementation, an attribute on the actual implementing method records the real default value.  The compiler then has a choice: substitute the actual value at the call site (as it's implemented in VB now) or call the matching overload (which is what the current C# compiler would do).