Subtleties of the new Caller Info Attributes in C# 5

UPDATE: Now that the Visual Studio 11 beta has shipped with this feature implemented, I wrote a separate blog post exploring how it actually behaves in these corner cases.

C# 5 is all about asynchronous programming.  However, in additional to the new async features, the C# team managed to slip in a much simpler feature: Caller Info Attributes.

Since C#’s inception, developers have asked for __LINE__ and __FILE__ macros like those in C and C++.  Since C# intentionally does not support macros, these requests have not been answered.  Until now.

C# 5 adds these features using attributes and optional parameters.  As Anders Hejlsberg presented at //Build/, you can write

public static void Log(string message,
    [CallerFilePath] string file = "",
    [CallerLineNumber] int line = 0,
    [CallerMemberName] string member = "")
{
    Console.WriteLine("{0}:{1} – {2}: {3}", 
                      file, line, member, message);

}

If this method is called in C# 5 without specifying the optional parameters, the compiler will insert the file name, line number, and containing member name instead of the default values.  If it’s called in older languages that don’t support optional parameters, those languages will pass the default values, like any other optional method.

These features look trivial at first glance.  However, like most features, they are actually more complicated to design in a solid and robust fashion.  Here are some of the less obvious issues that the C# team needed to deal with when creating this feature:

Disclaimer: I am basing these posts entirely on  logical deduction.  I do not have access to a specification or implementation of this feature; all I know is what Anders announced in his //Build/ presentation.  However, the C# team would have needed to somehow deal with each of thee issues.  Since these attributes are not yet supported by any public CTP, I can’t test my assumptions

To start with, they needed to create new compiler errors if the attribute is applied to an incorrectly-typed parameter, or a non-optional parameter.  Creating compiler errors is expensive; they need to be documented, tested, and localized into every language supported by C#.

The attributes should perhaps support nullable types, or parameter types with custom implicit conversions to int or string.  (especially long or short)

If a method with these attributes is called in an expression tree literal (a lambda expression converted to an Expression<TDelegate>), the compiler would need to insert ConstantExpressions with the actual line number or other info to pass as the parameter.

In addition to being supported in methods, the attributes should also be supported on parameters  for delegate types, allowing you to write

delegate void LinePrinter([CallerLineNumber] int line = 0);

LinePrinter printer = Console.WriteLine;
printer();

Each of the individual attributes also has subtle issues.

[CallerFilePath] seems fairly simple.  Any function call must happen in source code, in a source file that has a path.  However, it needs to take into account #line directives, so that, for example, it will work as expected in Razor views.  I don’t know what it does inside #line hidden, in which there isn’t a source file.

Next time: [CallerLineNumber]

Using a default controller in ASP.Net MVC

One common question about ASP.Net MVC is how to make “default” controller.

Most websites will have a Home controller with actions like About, FAQ, Privacy, or similar pages.  Ordinarily, these actions can only be accessed through URLs like ~/Home/About.  Most people would prefer to put these URLs directly off the root: ~/About, etc.

Unfortunately, there is no obvious way to do that in ASP.Net MVC without making a separate route or controller for each action.

You cannot simply create a route matching "/{action}" and map it to the Home controller, since such a route would match any URL with exactly one term, including URLs meant for other controllers.  Since the routing engine is not aware of MVC actions, it doesn’t know that this route should only match actions that actually exist on the controller.

To make it work, we can add a custom route constraint that forces this route to only match URLs that correspond to actual methods on the controller.

To this end, I wrote an extension method that scans a controller for all action methods and adds a route that matches actions in that controller. The code is available at gist.github.com/1225676.  It can be used like this:

routes.MapDefaultController<Controllers.HomeController>();

This maps the route "/{action}/{id}" (with id optional) to all actions defined in HomeController.   Note that this code ignores custom ActionNameSelectorAttributes (The built-in [ActionName(…)] is supported).

For additional flexibility, you can also create custom routes that will only match actions in a specific controller.  This is useful if you have a single controller with a number of actions that has special route requirements that differ from the rest of your site.

For example:

routes.MapControllerActions<UsersController>(
    name: "User routes",
    url:  "{userName}/{action}"
    defaults: new { action = "Index" }
);

(Note that this example will also match URLs intended for other controllers with the same actions; plan your routes carefully)

XRegExp breaks jQuery Animations

XRegExp is an open source JavaScript library that provides an augmented, extensible, cross-browser implementation of regular expressions, including support for additional syntax, flags, and methods.

It’s used by the popular SyntaxHighlighter script, which is in turn used by many websites (including this blog) to display syntax-highlighted source code on the client.  Thus, XRegExp has a rather wide usage base.

However, XRegExp conflicts with jQuery.  In IE, any page that includes XRegExp and runs a numeric jQuery animation will result in “TypeError: Object doesn't support this property or method”.

Demo (only fails in IE)

This bug is caused by an XRegExp fix for an IE bug in which the exec method doesn't consistently return undefined for nonparticipating capturing groups.  The fix, on line 271, assumes that the parameter passed to exec is a string.  This behavior violates the ECMAScript standard (section 15.10.6.2), which states that the parameter to exec should be converted to a string before proceeding.

jQuery relies on this behavior in its animate method, which parses a number using a regex to get the decimal portion.  (source)

Thus, calling animate with a number after loading XRegExp will fail in IE when XRegExp tries to call slice on a number.

Fortunately, it is very simple to fix XRegExp to convert its argument to a string first:

if (XRegExp) {
    var xExec = RegExp.prototype.exec;
    RegExp.prototype.exec = function(str) {
        if (!str.slice) 
            str = String(str);
        return xExec.call(this, str);
    };
}
Here is an updated demo that uses this fix and works even in IE.

Clarifying Boolean Parameters, part 2

Part 1 is here

Some languages have better ways to pass boolean parameters.  C# 4.0, and all versions of VB, allow parameters to be passed by name.  This allows us to write much clearer code:

//C# 4.0:
UpdateLayout(doFullLayout: false) 
'VB.Net:
UpdateLayout(doFullLayout:=False) 

Without requiring any changes to the function definition, this makes the meaning of the true / false abundantly clear at the call-site.

Javascript offers another interesting alternative.  In Javascript, booleans conditions actually check for “truthyness”.  The statement if(x) will trigger  not just if x is true, but also if x is any “truthy” value, including any object, non-empty string, or non-zero number. Similarly, the expression !x will return false if x is “truthy” and true if x “falsy”.

This means that we can actually use any non-empty string instead of true in Javascript.  Note that this will only work if the function checks the value for “truthyness”; it won’t work for code like if (x === true).

Thus, instead of passing true as a boolean, you can pass a string that describes what you’re actually indicating.

For example:

function updatePosition(animate) {
    //Calculate position
    if (animate)
        //...
    else
        //...
}

$(window).resize(function() {
    updatePosition();
});

updatePosition("With animation");

Although this results in much more readable code, it can be difficult to understand for people who aren’t familiar with this trick.  If the meaning of the parameter changes, you’ll need to hunt down every place that the function is called and change the string to reflect the new meaning.

Finally, unlike an enum, this does not scale to multiple options.  If you need to have more than two options, you should use global variables or objects to simulate an enum, not strings.

Clarifying Boolean Parameters, part 1

Have you ever written code like this:

public void UpdateLayout(bool doFullLayout) {
    //Code
    if (doFullLayout) {
        //Expensive code
    }
    //More code
}

This pattern is commonly used when some operation has a “cheap” mode and an “expensive” mode.  Other code will have calls like UpdateLayout(false) and UpdateLayout(true) scattered throughout.

The problem is that this isn’t very obvious for people who aren’t familiar with the codebase.  If you take a look at a file you’ve never seen before and see calls like UpdateLayout(false) and UpdateLayout(true) scattered, you’ll have no idea what the true / false means.

The simplest solution is to break it out into two methods: UpdateComplexLayout() and UpdateBasicLayout().  However,  if the two different layout modes have intertwined code paths (eg, the code before and after the if above), this either won’t be possible or will lead to ugly duplication of code.

One alternative is to use enums:

public enum LayoutUpdateType {
    Basic,
    Full
}

public void UpdateLayout(LayoutUpdateType type) {
    //Code
    if (type == LayoutUpdateType.Full) {
        //Expensive code
    }
    //More code
}

This way, the callsites are much more descriptive: UpdateLayout(LayoutUpdateType.Full).  This also makes it easy to add more update modes in the future should the need arise.  However, it makes the callsites much more verbose.  When used frequently, this pattern can lead a vast proliferation of enum types that are each only used by one method, polluting the namespace and making more important enums harder to notice.

Next time: Cleverer alternatives

C# is not type-safe

C# is usually touted as a type-safe language.  However, it is not actually fully type-safe!

To examine this claim, we must first provide a strict definition of type-safety  Wikipedia says:

In computer science, type safety is the extent to which a programming language discourages or prevents type errors. A type error is erroneous or undesirable program behavior caused by a discrepancy between differing data types.

To translate this to C#, full type-safety means that any expression that compiles is guaranteed to work at runtime, without causing any invalid cast errors.

Obviously, the cast (and as) operator is an escape hatch from type safety.  It tells the compiler that “I expect this value to actually be of this type, even though you can’t prove it.  If I’m wrong, I’ll live with that”.  Therefore, to be fully type-safe, it must be impossible to get an InvalidCastException at runtime in C# code that does not contain an explicit cast.

Note that parsing or conversion errors (such as any exception from the Convert class) don’t count.  Parsing errors aren’t actually invalid cast errors (instead, they come from unexpected strings), and conversion errors from from cast operations inside the Convert class.  Also, null reference exceptions aren’t cast errors. 

So, why isn’t C# type-safe?

MSDN says that InvalidCastException is thrown in two conditions:

  • For a conversion from a Single or a Double to a Decimal, the source value is infinity, Not-a-Number (NaN), or too large to be represented as the destination type.

  • A failure occurs during an explicit reference conversion.

Both of these conditions can only occur from a cast operation, so it looks like C# is in fact type safe.

Or is it?

IEnumerable numbers = new int[] { 1, 2, 3 };

foreach(string x in numbers) 
    ;

This code compiles (!). Running it results in

InvalidCastException: Unable to cast object of type 'System.Int32' to type 'System.String'.

On the foreach line.

Since we don’t have any explicit cast operations (The implicit conversion from int[] to IEnumerable is an implicit conversion, which is guaranteed to succeed) , this proves that C# is not type-safe.

What happened?

The foreach construct comes from C# 1.0, before generics existed.  It worked with untyped collections such as ArrayList or IEnumerable.  Therefore, the IEnumerator.Current property that gets assigned to the loop variable would usually be of type object.   (In fact, the foreach statement is duck-typed to allow the enumerator to provide a typed Current property, particularly to avoid boxing). 

Therefore, you would expect that almost all (non-generic) foreach loops would need to have the loop variable declared as object, since that’s the compile-time type of the items in the collection.  Since that would be extremely annoying, the compiler allows you to use any type you want, and will implicitly cast the Current values to the type you declared.  Thus, mis-declaring the type results in an InvalidCastException.

Note that if the foreach type isn’t compatible at all with the type of the Current property, you will get a compile-time error (just like (string)42 doesn’t compile).  Therefore, if you stick with generic collections, you’re won’t get these runtime errors (unless you declare the foreach as a subtype of the item type).

C# also isn’t type-safe because of array covariance.

string[] strings = new string[1];
object[] arr = strings;
arr[0] = 7;

This code compiles, but throws “ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array.” at run-time.

As Eric Lippert explains, this feature was added in order to be more compatible with Java.

Delegates vs. Function Pointers, Addendum: Multicast Delegates

Until now, I've been focusing on only one of the differences between delegates and function pointers; namely, associated state.
Delegates have one other capability that function pointers do not.  A single function pointer can only point to one function.  .Net, on the other hand, supports multicast delegates – delegates that point to multiple functions.  You can combine two existing delegates using the + operator (or by calling Delegate.Combine) to create a single new delegate instance that points two all of the methods in the original two delegates.  This new delegate stores all of the methods from the original two delegates in a private array of delegates called InvocationList (the delegates in this array are ordinary non-multicast delegates that each only point to a single method). 

Note that delegates, like strings, are immutable.  Adding two delegates together creates a third delegate containing the methods from the first two; the original delegate instances are not affected.  For example, writing delegateField += SomeMethod creates a new delegate instance containing the methods originally in delegateField as well as SomeMethod, then stores this new instance in delegateField.

Similarly, the - operator (or Delegate.Remove) will remove the second operand from the first one (again, returning a new delegate instance).  If the second operand has multiple methods, all of them will be removed from the final delegate.  If some of the methods in the second operand appear multiple times in the original delegate, only the last occurrence of each one will be removed (the one most recently added).  The RemoveAll method will remove all occurrences.  If all of the methods were removed, it will return null; there is no such thing as an empty delegate instance.

Multicast delegates are not intended to be used with delegates that return values.  If you call a non-void delegate that contains multiple methods, it will return the return value of the last method in the delegate.  If you want to see the return values of all of the methods, you’ll need to loop over GetInvocationList() and call each delegate individually.

Multicast delegates also don’t play well with the new covariant and contravariant generic delegates in .Net 4.0.  You cannot combine two delegates unless their types match exactly, including variant generic parameters.

Function pointers cannot easily be combined the way multicast delegates can.  The only way to combine function pointers without cooperation from the code that calls the pointer is to make a function that uses a closure to call all of the function pointers you want to call.

In Javascript, that would look like this:

function combine() {
    var methods = arguments;

    return function() { 
        var retVal;
        for(var i = 0; i < methods.length; i++) 
            retVal = methods[i].apply(this, arguments);
        return retVal;
    };
}