C# is not type-safe

C# is usually touted as a type-safe language.  However, it is not actually fully type-safe!

To examine this claim, we must first provide a strict definition of type-safety.  Wikipedia says:

In computer science, type safety is the extent to which a programming language discourages or prevents type errors. A type error is erroneous or undesirable program behavior caused by a discrepancy between differing data types.

To translate this to C#, full type-safety means that any expression that compiles is guaranteed to work at runtime, without causing any invalid cast errors.

Obviously, the cast (and as) operators are an escape hatch from type safety.  A cast tells the compiler that “I expect this value to actually be of this type, even though you can’t prove it.  If I’m wrong, I’ll live with that”.  Therefore, to be fully type-safe, it must be impossible to get an InvalidCastException at runtime in C# code that does not contain an explicit cast.

Note that parsing or conversion errors (such as any exception from the Convert class) don’t count.  Parsing errors aren’t actually invalid cast errors (instead, they come from unexpected strings), and conversion errors come from cast operations inside the Convert class.  Also, null reference exceptions aren’t cast errors.

So, why isn’t C# type-safe?

MSDN says that InvalidCastException is thrown in two conditions:

  • For a conversion from a Single or a Double to a Decimal, the source value is infinity, Not-a-Number (NaN), or too large to be represented as the destination type.

  • A failure occurs during an explicit reference conversion.

Both of these conditions can only occur from a cast operation, so it looks like C# is in fact type safe.

Or is it?

IEnumerable numbers = new int[] { 1, 2, 3 };

foreach(string x in numbers) 
    ;

This code compiles (!). Running it results in

InvalidCastException: Unable to cast object of type 'System.Int32' to type 'System.String'.

On the foreach line.

Since we don’t have any explicit cast operations (the conversion from int[] to IEnumerable is implicit, and is guaranteed to succeed), this proves that C# is not type-safe.

What happened?

The foreach construct comes from C# 1.0, before generics existed.  It worked with untyped collections such as ArrayList or IEnumerable.  Therefore, the IEnumerator.Current property that gets assigned to the loop variable would usually be of type object.   (In fact, the foreach statement is duck-typed to allow the enumerator to provide a typed Current property, particularly to avoid boxing). 

Therefore, you would expect that almost all (non-generic) foreach loops would need to have the loop variable declared as object, since that’s the compile-time type of the items in the collection.  Since that would be extremely annoying, the compiler allows you to use any type you want, and will implicitly cast the Current values to the type you declared.  Thus, mis-declaring the type results in an InvalidCastException.

Note that if the foreach type isn’t compatible at all with the type of the Current property, you will get a compile-time error (just like (string)42 doesn’t compile).  Therefore, if you stick with generic collections, you won’t get these runtime errors (unless you declare the loop variable as a subtype of the item type).
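To make the hidden cast visible, here is a rough hand-written equivalent of the loop above.  This is a sketch of the general shape of the expansion, not the compiler’s exact output; the class name is made up for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class ForeachCastDemo
{
    static void Main()
    {
        IEnumerable numbers = new int[] { 1, 2, 3 };

        // Roughly what "foreach (string x in numbers) ;" turns into.
        IEnumerator e = numbers.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                // Current is typed as object, so the compiler inserts this
                // explicit cast on our behalf.
                string x = (string)e.Current;
            }
        }
        catch (InvalidCastException ex)
        {
            Console.WriteLine(ex.GetType().Name); // prints "InvalidCastException"
        }
        finally
        {
            // Dispose the enumerator if it happens to be disposable.
            (e as IDisposable)?.Dispose();
        }

        // With a generic collection, the element type is known to be int,
        // and int can never be cast to string, so this does not compile:
        // foreach (string s in new List<int> { 1, 2, 3 }) { }
    }
}
```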

C# also isn’t type-safe because of array covariance.

string[] strings = new string[1];
object[] arr = strings;
arr[0] = 7;

This code compiles, but throws “ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array.” at run-time.

As Eric Lippert explains, this feature was added in order to be more compatible with Java.
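For comparison, here is a small sketch that catches the failed store, and shows that the generic collection types do not have this hole.  The class name is made up for the example.

```csharp
using System;
using System.Collections.Generic;

class ArrayCovarianceDemo
{
    static void Main()
    {
        // Allowed at compile time: arrays of reference types are covariant.
        object[] arr = new string[1];
        try
        {
            // Every store into a reference-type array is checked at run time;
            // a boxed int is not a string, so this store is rejected.
            arr[0] = 7;
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("store rejected at run time");
        }

        List<string> list = new List<string> { "a" };

        // List<T> is invariant, so the equivalent mistake does not compile:
        // List<object> objects = list;

        // IEnumerable<out T> (C# 4 and later) is covariant, but it is read-only,
        // so there is nothing unsafe that can be stored through it.
        IEnumerable<object> readOnlyView = list;
        foreach (object item in readOnlyView)
            Console.WriteLine(item); // prints "a"
    }
}
```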

Delegates vs. Function Pointers, Addendum: Multicast Delegates

Until now, I've been focusing on only one of the differences between delegates and function pointers; namely, associated state.
Delegates have one other capability that function pointers do not.  A single function pointer can only point to one function.  .Net, on the other hand, supports multicast delegates – delegates that point to multiple functions.  You can combine two existing delegates using the + operator (or by calling Delegate.Combine) to create a single new delegate instance that points to all of the methods in the original two delegates.  This new delegate stores all of the methods from the original two delegates in a private array of delegates called InvocationList (the delegates in this array are ordinary non-multicast delegates that each only point to a single method).

Note that delegates, like strings, are immutable.  Adding two delegates together creates a third delegate containing the methods from the first two; the original delegate instances are not affected.  For example, writing delegateField += SomeMethod creates a new delegate instance containing the methods originally in delegateField as well as SomeMethod, then stores this new instance in delegateField.

Similarly, the - operator (or Delegate.Remove) will remove the second operand from the first one (again, returning a new delegate instance).  If the second operand has multiple methods, they are removed as a group: its invocation list must appear as a contiguous run, in the same order, within the first operand’s list; otherwise, nothing is removed.  If that run appears multiple times in the original delegate, only the last occurrence will be removed (the one most recently added).  The RemoveAll method will remove all occurrences.  If all of the methods were removed, it will return null; there is no such thing as an empty delegate instance.
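Here is a small sketch of these rules in action, using single-method delegates only.  The class and method names are made up for the example.

```csharp
using System;

class MulticastDemo
{
    static void A() { Console.WriteLine("A"); }
    static void B() { Console.WriteLine("B"); }

    static void Main()
    {
        Action a = A, b = B;

        // Combining creates a new instance; the operands are untouched.
        Action both = a + b;
        Console.WriteLine(both.GetInvocationList().Length); // prints 2
        Console.WriteLine(a.GetInvocationList().Length);    // prints 1

        // A appears twice; subtraction removes only the last occurrence.
        Action twice = a + b + a;
        Action removed = twice - a;
        removed(); // prints "A" then "B"

        // Removing the only method yields null, not an empty delegate.
        Action none = a - a;
        Console.WriteLine(none == null); // prints True
    }
}
```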

Multicast delegates are not intended to be used with delegates that return values.  If you call a non-void delegate that contains multiple methods, it will return the return value of the last method in the delegate.  If you want to see the return values of all of the methods, you’ll need to loop over GetInvocationList() and call each delegate individually.
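A minimal sketch of both behaviors follows; the class name is made up for the example.  Note that GetInvocationList() returns Delegate[], so the foreach below relies on the same hidden cast to Func<int> that the type-safety post describes.

```csharp
using System;

class ReturnValuesDemo
{
    static void Main()
    {
        Func<int> f = () => 1;
        f += () => 2;
        f += () => 3;

        // Calling the multicast delegate runs all three methods,
        // but only the last return value survives.
        Console.WriteLine(f()); // prints 3

        // To see every return value, call each single-method delegate.
        foreach (Func<int> single in f.GetInvocationList())
            Console.WriteLine(single()); // prints 1, then 2, then 3
    }
}
```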

Multicast delegates also don’t play well with the new covariant and contravariant generic delegates in .Net 4.0.  You cannot combine two delegates unless their types match exactly, including variant generic parameters.
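The following sketch shows the problem and one possible workaround (wrapping the converted delegate in a lambda of the exact type).  The names are made up for the example, and I have hedged the first half with a try block, since the exact failure behavior is up to the runtime.

```csharp
using System;

class VarianceCombineDemo
{
    static void Main()
    {
        Action<object> logObject = o => Console.WriteLine("object: " + o);
        Action<string> logString = s => Console.WriteLine("string: " + s);

        // Contravariance makes this a legal reference conversion, but the
        // instance's runtime type is still Action<object>.
        Action<string> converted = logObject;

        try
        {
            // Compiles, because both operands are statically Action<string>.
            // Delegate.Combine is expected to reject it with an
            // ArgumentException, because the runtime types differ.
            Action<string> combined = converted + logString;
            combined("hello");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Combine failed: " + ex.Message);
        }

        // Workaround: the lambda creates a genuine Action<string> instance.
        Action<string> wrapped = s => logObject(s);
        Action<string> both = wrapped + logString;
        both("hello"); // prints "object: hello" then "string: hello"
    }
}
```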

Function pointers cannot easily be combined the way multicast delegates can.  The only way to combine function pointers without cooperation from the code that calls the pointer is to make a function that uses a closure to call all of the function pointers you want to call.

In Javascript, that would look like this:

function combine() {
    var methods = arguments;

    return function() { 
        var retVal;
        for(var i = 0; i < methods.length; i++) 
            retVal = methods[i].apply(this, arguments);
        return retVal;
    };
}

Tracking Event Handler Registrations

When working with large .Net applications, it can be useful to find out where event handlers are being registered, especially in an unfamiliar codebase.

In simple cases, you can do this by right-clicking the event definition and clicking Find All References (Shift+F12).  This will show you every line of code that adds or removes a handler from the event by name.  For field-like (ordinary) events, this will also show you every line of code that raises the event.

However, this isn’t always good enough.  Sometimes, event handlers are not added by name.  The .Net data-binding infrastructure, as well as the CompositeUI Event Broker service, will add and remove event handlers using reflection, so they won’t be found by Find All References.  Similarly, if an event handler is added by an external DLL, Find All References won’t find it.
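To see why Find All References misses these, here is a minimal sketch of a reflection-based subscription.  Publisher and its Changed event are hypothetical; the event is only ever mentioned as a string.

```csharp
using System;
using System.Reflection;

// Hypothetical class with an ordinary field-like event.
class Publisher
{
    public event EventHandler Changed;

    public void RaiseChanged()
    {
        EventHandler handler = Changed;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

class ReflectionSubscribeDemo
{
    static void Main()
    {
        var publisher = new Publisher();

        // The event is looked up by name, so no source reference to
        // Publisher.Changed exists for Find All References to discover.
        EventInfo evt = typeof(Publisher).GetEvent("Changed");
        EventHandler handler = (sender, e) => Console.WriteLine("handled");

        // AddEventHandler invokes the event's add accessor.
        evt.AddEventHandler(publisher, handler);
        Console.WriteLine(evt.GetAddMethod().Name); // prints "add_Changed"

        publisher.RaiseChanged(); // prints "handled"
    }
}
```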

For these scenarios, you can use a less-obvious trick.  As I described last time, adding or removing an event handler actually executes code inside of an accessor method. Like any other code, we can set a breakpoint to see where the code is executed.

For custom events, this is easy.  Just add a breakpoint in the add and/or remove accessors and run your program.  Whenever a handler is added or removed, the debugger will break into the accessor, and you can look at the callstack to determine where it’s coming from.

However, most events are field-like, and don’t have actual source code in their accessor methods.  To set a breakpoint in a field-like event, you need to use a lesser-known feature: function breakpoints.  (Unfortunately, this feature is not available in Visual Studio Express.)  You can click Debug, New Breakpoint, Break at Function (Ctrl+D, N) to tell the debugger to pause whenever a specific managed function is executed.

To add a breakpoint at an event accessor, type Namespace.ClassName.add_EventName.  To ensure that you entered it correctly, open the Debug, Breakpoints window (Ctrl+D, B) and check that the new breakpoint says break always (currently 0) in the Hit Count column.  If it doesn’t say (currently 0), then either the assembly has not been loaded yet or you made a typo in the location (right-click the breakpoint and click Location).

About .Net Events

A .Net event actually consists of a pair of accessor methods named add_EventName and remove_EventName.  These functions each take a handler delegate, and are expected to add or remove that delegate from the list of event handlers. 

In C#, writing public event EventHandler EventName; creates a field-like event.  The compiler will automatically generate a private backing field (also a delegate), along with thread-safe accessor methods that add and remove handlers from the backing field (like an auto-implemented property).  Within the class that declared the event, EventName refers to this private backing field.  Thus, writing EventName(...) in the class calls this field and raises the event (if no handlers have been added, the field will be null).

You can also write custom event accessors to gain full control over how handlers are added to your events.   For example, this event will store and trigger handlers in reverse order:

void Main()
{
    ReversedEvent += delegate { Console.WriteLine(1); };
    ReversedEvent += delegate { Console.WriteLine(2); };
    ReversedEvent += delegate { Console.WriteLine(3); };

    OnReversedEvent();
}

protected void OnReversedEvent() {
    if (reversedEvent != null)
        reversedEvent(this, EventArgs.Empty);
}

private EventHandler reversedEvent;
public event EventHandler ReversedEvent {
    add {
        reversedEvent = value + reversedEvent;
    }
    remove {
        reversedEvent -= value;
    }
}

This add accessor uses the non-commutative delegate addition operator to prepend each new handler to the delegate field containing the existing handlers.  The raiser method simply calls the combined delegate in the private field (which is null if there aren’t any handlers).

Note that this code is not thread-safe.  If two threads add a handler at the same time, both of them will read the original storage field, add their respective handlers to create a new delegate instance, then write this new delegate back to the field.  The thread that writes back to the field last will overwrite the changes made by the other thread, since it never saw the other thread’s handler (this is the same reason that x += y is not thread-safe).  The accessors generated by the compiler are thread-safe, either by using lock(this) (C# 3 or earlier) or a lock-free thread-safe implementation (C# 4).  For more details, see this series of blog posts.
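If you did want the reversed event to be thread-safe, one option is a compare-and-swap retry loop.  The following is a sketch in the spirit of the lock-free approach, not the compiler’s actual generated code; the class names are made up for the example.

```csharp
using System;
using System.Threading;

class ReversedEventSource
{
    private EventHandler reversedEvent;

    public event EventHandler ReversedEvent
    {
        add
        {
            EventHandler seen, current = reversedEvent;
            do
            {
                seen = current;
                EventHandler updated = value + seen; // prepend the new handler
                // Write the new delegate only if nobody changed the field
                // since we read it; otherwise retry with the fresh value.
                current = Interlocked.CompareExchange(ref reversedEvent, updated, seen);
            } while ((object)current != (object)seen); // reference comparison
        }
        remove
        {
            EventHandler seen, current = reversedEvent;
            do
            {
                seen = current;
                EventHandler updated = seen - value;
                current = Interlocked.CompareExchange(ref reversedEvent, updated, seen);
            } while ((object)current != (object)seen);
        }
    }

    public void OnReversedEvent()
    {
        // Copy the field first so a concurrent remove cannot null it
        // between the check and the call.
        EventHandler handler = reversedEvent;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

class Program
{
    static void Main()
    {
        var source = new ReversedEventSource();
        source.ReversedEvent += delegate { Console.WriteLine(1); };
        source.ReversedEvent += delegate { Console.WriteLine(2); };
        source.ReversedEvent += delegate { Console.WriteLine(3); };

        source.OnReversedEvent(); // prints 3, then 2, then 1
    }
}
```

The casts to object matter: delegates overload == to compare by value, while the retry loop needs to know whether the field still held the exact instance it read.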

This example is rather useless.  However, there are better reasons to create custom event accessors. WinForms controls store their events in a special EventHandlerList class to save memory.  WPF controls create events using the Routed Event system, and store handlers in special storage in DependencyObject.  Custom event accessors can also be used to perform validation or logging.

Creating Local Extension Methods

Sometimes, it can be useful to make an extension method specifically for a single block of code.  Unfortunately, since extension methods cannot appear in nested classes, there is no obvious way to do that.

Instead, you can create a child namespace containing the extension method.  In order to limit the extension method’s visibility to a single method, you can put that method in a separate namespace block.  This way, you can add a using directive for the child namespace to that block alone.

For example:

namespace Company.Project {
    partial class MyClass {
        ...
    }
}
namespace Company.Project {
    using MyClassExtensions;
    namespace MyClassExtensions {
        static class Extensions {
            public static string Name<T>(this T obj) {
                if (default(T) == null && Equals(obj, default(T)))
                    return "(null " + typeof(T) + ")";
                return obj.GetType() + ": " + obj.ToString() 
                     + "{declared as " + typeof(T) + "}";
            }
        }
    }
    partial class MyClass {
        void DoSomething() {
            object x = new DateTime();
            string name = x.Name();
        }
    }
}

Since the using MyClassExtensions directive appears inside the second namespace block, the extension methods are only visible within that block.  Code that uses these extension methods can appear in this second block, while the rest of the class can go in the original namespace block without the extension methods.

This technique should be avoided where possible, since it leads to confusing and non-obvious code.  However, there are situations in which this can make some code much more readable.

Don’t modify other controls during a WPF layout pass

Unlike WinForms or native Win32 development, WPF provides a rich layout model which allows developers to easily create complicated UIs that resize to fit their contents or the parent window.

However, when developing custom controls, it can be necessary to lay out child controls manually by overriding the MeasureOverride and ArrangeOverride methods.  To quote MSDN:

Measure allows a component to determine how much size it would like to take. This is a separate phase from Arrange because there are many situations where a parent element will ask a child to measure several times to determine its optimal position and size. The fact that parent elements ask child elements to measure demonstrates another key philosophy of WPF – size to content. All controls in WPF support the ability to size to the natural size of their content. This makes localization much easier, and allows for dynamic layout of elements as things resize. The Arrange phase allows a parent to position and determine the final size of each child.

Overriding these methods gives your custom control full power over the layout of its child element(s).

Be careful what you do when overriding these methods.  Any code in MeasureOverride or ArrangeOverride runs during the WPF layout passes.  In these methods, you should not modify any part of the visual tree outside of the control whose methods you are overriding.  If you do, you’ll be changing the visuals between Measure() and Arrange(), which will have unexpected results.

It is safe to modify your own child controls during the layout pass.  Before you call Measure() on a child control, its layout pass has not started.  Therefore, any changes will be seen by the child’s layout code.  Similarly, after you Arrange() a child control, its layout pass is finished, so it is safe to modify again (although you may end up triggering another layout pass to see the changes).

If you do need to modify an outside control during the layout pass, you should call Dispatcher.BeginInvoke() to run code asynchronously during the next message loop.  This way, your code will run after the layout pass finishes, and it will be able to safely modify whatever it wants.
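As a rough sketch of that pattern, the hypothetical control below reports its arranged size to an outside TextBlock.  Both ReportingDecorator and its StatusText property are invented for this example, and the code needs to be compiled in a WPF project.

```csharp
using System;
using System.Windows;
using System.Windows.Controls;

// Hypothetical control, for illustration only.
public class ReportingDecorator : Decorator
{
    // Hypothetical: an element elsewhere in the visual tree.
    public TextBlock StatusText { get; set; }

    protected override Size ArrangeOverride(Size arrangeSize)
    {
        Size result = base.ArrangeOverride(arrangeSize);

        TextBlock status = StatusText;
        if (status != null)
        {
            string text = result.Width + " x " + result.Height;

            // Do not touch the outside control here; we are still inside
            // the layout pass.  Queue the change on the dispatcher instead,
            // so it runs after the current layout work has finished.
            Dispatcher.BeginInvoke(new Action(() =>
            {
                // Skip redundant writes so we do not keep re-triggering layout.
                if (status.Text != text)
                    status.Text = text;
            }));
        }

        return result;
    }
}
```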

Note that Measure() can be called multiple times during a single layout pass (if a parent needs to iteratively determine the best fit for a child).

Delegates vs. Function Pointers, part 5: Javascript

This is part 5 in a series about state and function pointers; part 1 is here.

Last time, we saw how C# 2 supports closures by compiling anonymous functions into member functions of a special class that holds local state from the outer function. 

Unlike the languages we’ve looked at before, Javascript has had closures baked into the language since its inception.  My standard example can be achieved very simply in Javascript:

var x = 2;
var numbers = [ 1, 2, 3, 4 ];
var hugeNumbers = numbers.filter(function(n) { return n > x; });

This code uses the Array.filter method, new to Javascript 1.6, to create a new array with those elements from the first array that pass a callback.  The function expression passed to filter captures the x variable for use inside the callback.

This looks extremely similar to the C# 2.0 version from last time.  However, under the covers, it’s rather different.

Like .Net managed instance methods, all Javascript functions take a hidden this parameter.  However, unlike .Net, Javascript does not have delegates.  There is no (intrinsic) way to bind an object to the this parameter the way a .Net closed delegate does.  Instead, the this parameter comes from the callsite, depending on how the function was called.  Therefore, we cannot pass state in the this parameter the way we did in C#.

Instead, all Javascript function expressions capture the variable environment of the scope that they are declared in as a hidden property of the function.  Therefore, a function can reference local variables from its declaring scope.  Unlike C#, which binds functions to their parent scopes using a field in a separate delegate object that points to the function, Javascript functions have their parent scopes baked in to the functions themselves. 

Javascript doesn’t have separate delegate objects that can hold a function and a this parameter.  Instead, the value of the this parameter is determined at the call-site, depending on how the function was called.  This is a common source of confusion to inexperienced Javascript developers.

To simulate closed delegates, we can make a method that takes a function as well as a target object to call it on, and returns a new function which calls the original function with this equal to the target parameter.  That sounds overwhelmingly complicated, but it’s actually not that hard:

function createDelegate(func, target) {
    return function() { 
        return func.apply(target, arguments);
    };
}

var myObject = { name: "Target!"};
function myMethod() {
    return this.name;
}

var delegate = createDelegate(myMethod, myObject);
alert(delegate());

This createDelegate method returns a function expression that captures the func and target parameters, and calls func in the context of target.  Instead of storing the target in a property of a Delegate object (like .Net does), this code stores it in the inner function expression’s closure.
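For comparison, here is a sketch of the same example on the .Net side, where the target really is stored on the delegate object.  The Target class is made up to mirror myObject above.

```csharp
using System;

// Hypothetical class mirroring the Javascript myObject.
class Target
{
    public string Name = "Target!";
    public string GetName() { return Name; }
}

class ClosedDelegateDemo
{
    static void Main()
    {
        var myObject = new Target();

        // A closed delegate: the method plus the object to call it on.
        Func<string> del = myObject.GetName;

        // The bound "this" is exposed through the Target property.
        Console.WriteLine(ReferenceEquals(del.Target, myObject)); // prints True
        Console.WriteLine(del.Method.Name);                       // prints "GetName"
        Console.WriteLine(del());                                 // prints "Target!"
    }
}
```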

Javascript 1.8.5 provides the Function.bind method, which is equivalent to this createDelegate method, with additional capabilities as well.  In Chrome, Firefox 4, and IE9, you can write

var myObject = { name: "Target!"};
function myMethod() {
    return this.name;
}

var delegate = myMethod.bind(myObject);
alert(delegate());

For more information, see the MDN documentation.