Visual Studio 2012 and webpages:Version

If you open an older ASP.Net MVC3 project in Visual Studio 2012, you may see lots of errors in the Razor views, along the lines of “The name 'model' does not exist in the current context”, and similar errors whenever you try to use MVC features like HTML helpers or ViewContext (eg, “System.Web.WebPages.Html.HtmlHelper does not contain a definition for TextBoxFor”).

This happens if there is no <add key="webpages:Version" value="1.0" /> in the <appSettings> element in Web.config.

Without this element, Visual Studio will assume that you’re using the latest version of Razor and the WebPages framework.  Until VS2012, this wasn’t a problem, since there was only one version.  However, since VS2012 ships with ASP.Net WebPages 2.0, the IDE will load this version by default.  Since you’ve specified the MVC integration & <configSection> for 1.0 (in Views/Web.config, since all of the assembly references specify Version=1.0.0.0), the language services won’t load the MVC settings.  Therefore, the @model directive will not work, and the view will inherit the standard WebPage base class rather than the MVC WebViewPage base class (which contains the MVC HTML helpers)

This issue has no effect at runtime because the server won’t load any version 2.0 assemblies.

Exploring Caller Info Attributes

Last year, Microsoft announced a simple new feature in C# 5: Caller Info Attributes.  These attributes let you to create methods with optional parameters and tell the compiler to pass the caller’s filepath, line number, or member name instead of the parameter’s default value.  This allows you to create logging methods that automatically know where they’re being called.

When the feature was announced, I wrote a couple of blog posts that delved into some of the corner cases of the new feature.  At the time, there was no public implementation, so they were pure conjecture.

This morning, Microsoft released the beta of Visual Studio 11, which is the first public build supporting these attributes.  Now, I can finally test my theories.  Here are the results:

Although these classes are new to the .Net Framework 4.5, you can still use this feature against older framework versions by creating your own classes in the System.Runtime.CompilerServices namespace.  However, the feature will only work if the code calling the method is compiled with the C# 5 compiler; older compilers will ignore the attributes and simply pass the parameters’ default values.

All of the attributes can only be applied to arguments of types that have standard (not custom) implicit conversions to int or string.  This means that it isn’t practical to overflow [CallerLineNumber] (the compiler ran out of memory first), so I can’t test how that behaves.

Using [CallerMemberName] on field initializers passes the field name, and on static or instances constructors passes the string ".cctor" or ".ctor" (as documented)  In indexers, it passes "Item".

If a class has a constructor that takes only caller info attribute parameters, and you create another class that inherits it and does not declare a constructor (thus implicitly passing optional parameters), it passes the line number and file name of the class keyword in the derived class, but leaves the declared default for the member name (I suspect that’s a bug). 

If you do declare a constructor, it passes the string ".ctor" as the member name for the implicit base() call (just like a normal method call from inside a constructor) and the line number of the beginning of the constructor declaration.  If you actually write a base() call, it passes the line number of the base keyword.

If a call spans multiple lines, [CallerLineNumber] passes the line containing the openning parenthesis.

Delegates are fully supported; if you call a delegate that has an argumented annotated with a caller info attribute, the compiler will insert the correct value, regardless of the method you’re actually calling (which the compiler doesn’t even know).

LINQ query comprehension syntax is not supported at all; if you create a (for example) Select() method that contains a caller info attribute, then call it from a LINQ query (not lambda syntax), the compiler will crash (!).  (they will fix that)

Expression trees do not support optional parameters at all, so that corner case is irrelevant.

Attributes are the most interesting story.  What should happen if you declare a custom attribute that takes parameters with caller info attributes, then apply that attribute in various cases?  This could potentially be very useful, since there is currently no way for an attribute to know what it’s being applied to. (I hadn’t thought of this usage when I wrote the original blog post)

The documentation says that this will work in all cases, and that [CallerMemberName] will pass whatever the attribute is being applied to.  However, in the beta build, this doesn’t always work.

Attributes applied to method arguments or return values do not pass any caller info at all.  Attributes applied to types or generic type arguments do not pass member names (this is very disappointing)

Hopefully, those will be fixed before release.

ASP.Net MVC Unobtrusive Validation Bug

If you use the ASP.Net MVC 3 [Compare] validation attribute on a model property, then include that model as a property in a parent model (so that the field name becomes Parent.ChildProperty), the built-in unobtrusive client validation will choke, and will always report the field as having an error.

This is due to a bug on line 288 of jquery.validate.unobtrusive.js:

adapters.add("equalto", ["other"], function (options) {
    var prefix = getModelPrefix(options.element.name),
        other = options.params.other,
        fullOtherName = appendModelPrefix(other, prefix),
        element = $(options.form).find(":input[name=" + fullOtherName + "]")[0];

    setValidationValues(options, "equalTo", element);
});

Because the value of the name attribute selector is not quoted, this fails if the name contains a ..

The simplest fix is to add quotes around the concatenated value.  However, the jQuery selector there is overkill.  HTML form elements have properties for each named input, so you can do this instead:

adapters.add("equalto", ["other"], function (options) {
    var prefix = getModelPrefix(options.element.name),
        other = options.params.other,
        fullOtherName = appendModelPrefix(other, prefix),
        element = options.form[fullOtherName];
    if (!element)
        throw new Error(fullOtherName + " not found");
    //If there are multiple inputs with that name, get the first one
    if (element.length && element[0])        
        element = element[0];
    setValidationValues(options, "equalTo", element);
});

Protecting against CSRF attacks in ASP.Net MVC

CSRF attacks are one of the many security issues that web developers must defend against.  Fortunately, ASP.Net MVC makes it easy to defend against CSRF attacks.  Simply slap on [ValidateAntiForgeryToken] to every POST action and include @Html.AntiForgeryToken() in every form, and your forms will be secure against CSRF.

However, it is easy to forget to apply [ValidateAntiForgeryToken] to every action.  To prevent such mistakes, you can create a unit test that loops through all of your controller actions and makes sure that every [HttpPost] action also has [ValidateAntiForgeryToken]. 

Since there may be some POST actions that should not be protected against CSRF, you’ll probably also want a marker attribute to tell the test to ignore some actions.

This can be implemented like this:

First, define the marker attribute in the MVC web project.  This attribute can be applied to a single action, or to a controller to allow every action in the controller.

///<summary>Indicates that an action or controller deliberately 
/// allows CSRF attacks.</summary>
///<remarks>All [HttpPost] actions must have 
/// [ValidateAntiForgeryToken]; any deliberately unprotected 
/// actions must be marked with this attribute.
/// This rule is enforced by a unit test.</remarks>
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class AllowCsrfAttacksAttribute : Attribute { }

Then, add the following unit test:

[TestMethod]
public void CheckForCsrfProtection() {
    var controllers = typeof(MvcApplication).Assembly.GetTypes().Where(typeof(IController).IsAssignableFrom);
    foreach (var type in controllers.Where(t => !t.IsDefined(typeof(AllowCsrfAttacksAttribute), true))) {
        var postActions = type.GetMethods()
                                .Where(m => !m.ContainsGenericParameters)
                                .Where(m => !m.IsDefined(typeof(ChildActionOnlyAttribute), true))
                                .Where(m => !m.IsDefined(typeof(NonActionAttribute), true))
                                .Where(m => !m.GetParameters().Any(p => p.IsOut || p.ParameterType.IsByRef))
                                .Where(m => m.IsDefined(typeof(HttpPostAttribute), true));

        foreach (var action in postActions) {
            //CSRF XOR AntiForgery
            Assert.IsTrue(action.IsDefined(typeof(AllowCsrfAttacksAttribute), true) != action.IsDefined(typeof(ValidateAntiForgeryTokenAttribute), true),
                            action.Name + " is [HttpPost] but not [ValidateAntiForgeryToken]");
        }
    }
}
typeof(MvcApplication) must be any type in the assembly that contains your controllers.  If your controllers are defined in multiple assemblies, you’ll need to include those assemblies too.

The Dark Side of Covariance

What’s wrong with the following code?

var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
...
if (names.Contains(sqlCommand.ExecuteScalar())

This  code is intended to check whether the result of a SQL query is contained in a case-insensitive collection of names.  However, if you run this code, the resulting check will be case-sensitive.  Why?

As you may have guessed from the title, this is caused by covariance.  In fact, this code will not compile at all against .Net 3.5. 

The problem is that ExecuteScalar() returns object, not string.  Therefore, it doesn’t call HashSet<string>.Contains(string), which is what it’s intending to call (and which uses the HashSet’s comparer).  Instead, on .Net 4.0, this calls the  Enumerable.Contains<object>(IEnumerable<object>, string) extension method, using the covariant conversion from IEnumerable<string> to IEnumerable<object>.  Covariance allows us to pass object to the Contains method of any strongly-typed collection (of reference types).

Still, why is it case-sensitive?  As Jon Skeet points out, the LINQ Contains() method is supposed to call any built-in Contains() method from ICollection<T>, so it should still use the HashSet’s case-insensitive Contains().

The reason is that although HashSet<String> implements ICollection<string>, it does not implement ICollection<object>.  Since we’re calling Enumerable.Contains<object>, it checks whether the sequence implements ICollection<object>, which it doesn’t.  (ICollection<T> is not covariant, since it allows write access)

Fortunately, there’s a simple fix: just cast the return value back to string (and add a comment explaining the problem).  This allows the compiler to call HashSet<string>.Contains(string), as was originally intended.

//Call HashSet<string>.Contains(string), not the
//covariant Enumerable.Contains(IEnumerable<object>, object)
//http://blog.slaks.net/2011/12/dark-side-of-covariance.html
if (names.Contains((string)sqlCommand.ExecuteScalar())
(I discovered this issue in my StringListConstraint for ASP.Net MVC)

CAPTCHAs do not mitigate XSS worms

One common misconception about web security is that protecting important actions with CAPTCHAs can prevent XSS attacks from doing real damage.  By preventing malicious code from scripting critical tasks, the idea goes, XSS injections won’t be able to accomplish much.

This idea is dangerously wrong. 

First of all, this should not even be considered except as a defense-in-depth mechanism.  Regardless of whether the actions you care about are protected by CAPTCHAs, XSS attacks can create arbitrary UI on your pages, and can thus make “perfect” phishing attacks.

Also, even with CAPTCHAs, an XSS injection can wait until the user performs the critical action, then change the submitted data to the attacker’s whim.

For example, if Twitter took this approach to prevent XSS injections from sending spammy tweets, the attacker could simply wait until the user sends a real tweet, then silently append advertising to the tweet as the user submits it and fills out the CAPTCHA.

However, there is also a more fundamental issue.  Since the injected Javascript is running in the user’s browser, it simply display the CAPTCHA to the user and block all page functionality until the user solves the CAPTCHA.  The attacker can even put his own text around the CAPTCHA to look like a legitimate security precaution, so that the (typical) user will not realize that the site has been compromised.  (that could be prevented by integrating a description of the action being performed into the CAPTCHA itself in a way that the attacker can’t hide)

I haven’t even mentioned the inconvenience of forcing all legitimate, uncompromised users to fill out CAPTCHAs every time they do anything significant.

In summary, CAPTCHAs should only be used to prevent programs from automatically performing actions (eg, bulk-registering Google accounts), and as a rate-limiter if a user sends too many requests too quickly (eg, getting a password wrong too many times in a row).

XSS can only be stopped by properly encoding all user-generated content that gets concatenated into markup (whether HTML, Javascript, or CSS)

About Concurrent Collections

One of the most useful additions to the .Net 4.0 base class library is the System.Collections.Concurrent namespace, which contains an all-new set of lock-free thread.

However, these collections are noticeably different from their classical counterparts.  There is no simple ConcurrentList<T> that you can drop into your code so that it will become thread-safe.  Instead, the new namespace has a queue, a stack, and some new thing called a bag, as well as ConcurrentDictionary<TKey, TValue> that largely resembles classical dictionaries.  It also has a BlockingCollection<T> class that wraps a concurrent collection and blocks until operations can succeed.

Many people have complained that Microsoft chose to provide an entirely new set of classes, rather than adding synchronized versions of the existing collections. 

In truth, however, creating this new set of classes is the correct – and, in fact, only – choice.  Ordinary synchronized are rarely useful and will not make a program thread-safe.  In general, slapping locks everywhere does not make a program thread-safe!  Rather, that will either not help (if there aren’t enough locks) or, if there are enough locks, result in deadlocks or a program that never runs more than one thread at a time.  (depending on how many different objects get locked)

Collections in particular have two fundamental issues when used on multiple threads.

The first, and simpler, issue is that the collection classes themselves are not thread-safe.  If two threads add to a List<T> at the same exact time, one thread is likely to overwrite the other thread’s value.  A synchronized version of List<T> with a lock around every method would solve this problem.

The bigger issue is that any code that uses a List<T> is unlikely to be thread-safe, even if the list itself is thread-safe.  For example. you can never enumerate over a multi-threaded list, because another thread may change the list at any time, invalidating the enumerator.  This issue could be solved by taking a read lock for the lifetime of the enumerator.  However, that is also a bad idea, since if any client code forgets to dispose the enumerator, the collection will deadlock when written to.

You also cannot use indices.  It is never safe to get, set, or remove an item at an index, because another thread might remove that item or clear the entire collection between your index check and the operation.

To solve all of these problems, you need thread-safe collections that provide atomic operations.  This is why all of the concurrent collections have such strange methods, including TryPop, AddOrUpdate, and TryTake.  These methods perform their operations atomically, and return false if the collection was empty (as appropriate; consult the documentation for actual detail).  Thus, the new concurrent collections can be used reliably in actual multi-threaded code without a separate layer of locks.

Beware of Response.RedirectToRoute in MVC 3.0

ASP.Net MVC uses the new (to ASP.Net 3.5) Http*Base wrapper classes (HttpContextBase, HttpRequestBase, HttpResponseBase, etc) instead of the original Http* classes.  This allows you to create mock implementations that inherit the Http*Base classes without an actual HTTP request.  This is useful for unit testing, and for overriding standard behaviors (such as route checking).

In ordinary MVC code, the HttpContext, Request, and Response properties will return Http*Wrapper instances that directly wrap the original Http* classes (eg, HttpContextWrapper, which wraps HttpContext).  Most MVC developers use the HttpContext and related properties without being aware of any of this redirection.

Until you call Response.RedirectToRoute.  This method, which is new to .Net 4.0, redirects the browser to a URL for a route in the new ASP.Net routing engine.  Like other HttpResponse methods, HttpResponseBase has its own version of this method for derived classes to override.

However, in .Net 4.0, Microsoft forgot to override this method in the standard HttpResponseWrapper.  Therefore, if you call Response.RedirectToRoute in an MVC application (where Response is actually an HttpResponseWrapper), you’ll get a NotImplementedException.

You can see this oversight in the methods list for HttpResponseWrapper.  Every method except for RedirectToRoute and RedirectToRoutePermanent are list as (Overrides HttpResponseBase.MethodName().); these methods are listed as (Inherited from HttpResponseBase.MethodName().)

To work around this issue, you can either use the original HttpResponse by writing HttpContext.Current.Response.RedirectToRoute(…) or by calling Response.Redirect instead.

Note that most MVC applications should not call Response.Redirect or Response.RedirectToRoute at all; instead, they should return ActionResults by calling helper methods like return Redirect(…); or return RedirectToAction(…);

In the upcoming ASP.Net 4.5 release, these methods have been properly overridden.

Caller Info Attributes vs. Stack Walking

People sometimes wonder why C# 5 needs to add caller info attributes, when this information is already available by using the StackTrace class.  In reality, caller info attributes behave rather differently from the StackTrace class, for a number of reasons.

Advantages to Caller Info Attributes

The primary reason to use caller info attributes is that they’re much faster.  Stack walking is one of the slowest internal (as opposed to network IO) things you can do in .Net (disclaimer: I haven’t measured).  By contrast, caller info attributes have exactly 0 performance penalty.  Caller info is resolved at compile-time; the callsite is compiled to pass a string or int literal that was determined by the compiler.  Incidentally, this is why C# 5 doesn’t have [CallerType] or [CallerMemberInfo] attributes; the compiler team wasn’t happy with the performance of the result IL and didn’t have time to implement proper caching to make it faster.

Caller info attributes are also more reliable than stack walking.  Stack walking will give incorrect results if the JITter decides to inline either your method or its caller.  Since caller info attributes are resolved at compile-time, they are immune to inlining.  Also, stack walking can only give line number and filename info if the calling assembly’s PDB file is present at runtime, whereas [CallerFileName] and [CallerLineNumber] will always work. 

Caller info attributes are also unaffected by compiler transformations.  If you do a stack walk from inside a lambda expression, iterator, or async method, you’ll see the compiler-generated method; there is no good way to retrieve the name of the original method.  Since caller info attributes are processed by the compiler, [CallerMemberName] should correctly pass the name of the containing method (although I cannot verify this).  Note that stack walking will retrieve the correct line number even inside lambda expressions (assuming there is a PDB file)

Advantages To Stack Walking

There are still a couple of reasons to use the StackTrace class.  Obviously, if you want to see more than one frame (if you want your caller’s caller), you need to use StackTrace.  Also, if you want to support clients in languages that don’t feature caller info attributes, such as C# 4 or F#, you should walk the stack instead.

Stack walking is the only way to find the caller’s type or assembly (or a MemberInfo object), since caller info attributes do not provide this information.

Stack walking is also the only choice for security-related scenarios.  It should be obvious that caller info attributes can trivially be spoofed; nothing prevents the caller from passing arbitrary values as the optional parameter.  Stack walking, by contrast, happens within the CLR and cannot be spoofed. 

Note that there are some other concerns with stack walking for security purposes.

  • Filename and line number information can be spoofed by providing a modified PDB file
  • Class and function names are meaningless; your enemy can name his function whatever he wants to.  All you should rely on is the assembly name.
  • There are various ways for an attacker to cause your code to be called indirectly (eg, DynamicInvoking a delegate) so that his assembly isn’t the direct caller.  In fact, if the attacker invokes your delegate on a UI thread (by calling BeginInvoke), his assembly won’t show up on the callstack at all.  The attacker can also compile a dynamic assembly that calls your function and call into that assembly.
  • As always, beware of inlining

Subtleties of C# 5’s new [CallerMemberName]

UPDATE: Now that the Visual Studio 11 beta has shipped with this feature implemented, I wrote a separate blog post exploring how it actually behaves in these corner cases.

Last time, I explored various pathological code samples in which the [CallerLineNumber] attribute does not have obvious behavior.  This time, I’ll cover the last of these new caller info attributes: [CallerMemberName].

The [CallerMemberName] attribute tells the compiler to insert the name of the containing member instead of a parameter’s default value.  Unlike [CallerLineNumber] and [CallerFileName], this has no equivalent in C++; since the C / C++ versions of these features are in the preprocessor, they cannot be aware of member names.

Most calls to methods with optional parameters take place within a named method, so the behavior of this attribute is usually obvious.  However, there are a couple of places where the exact method name is not so obvious.

If you call a [CallerMemberName] method inside a property or event accessor, what name should the compiler pass?  Common sense indicates that it should pass the name of the property or event, since that’s the name you actually see in source code.  That would also allow this attribute to be used for raising PropertyChanged events.  However, this option doesn’t pass enough information, since it would not be possible to determine whether it was called from the getter or the setter.  To expose the maximal amount of information, the compiler should pass the name of the actual method for the accessor – get_Name or set_Name

I would assume that the compiler only passes the property name, since that is what most people would probably expect.

A less-trivial question arises when such a method is called from a constructor or static constructor.  Should the compiler just pass the name of the class, since that’s what the member is named in source code? If so, there would be no way to distinguish between an instance constructor and a static constructor.  Should the compiler pass the actual names of the CLR methods (.ctor and .cctor)?  If so, there would be no way to tell the class name, which is worse.  Should it pass both (ClassName.ctor)? That would expose the maximal amount of information, but wouldn’t match the behavior in other members, which does not include the class name.

On a related note, what about calls to base or this constructors that take [CallerMemberName] arguments? Is that considered part of the class’ constructor, even though the call is lexically scoped outside the constructor?  If not, what should it pass?

A further related concern is field initializers. Since field initializers aren’t explicitly in any member, what should the compiler pass if you call a [CallerMemberName] method in a field initializer? I would assume that they’re treated like contructors (or static constructors for static field initializers)

I would assume that a call to a [CallerMemberName] method from within an anonymous method or LINQ query would use the name of the parent method.

The most interesting question concerns attributes.  What should happen if you declare your own custom attribute that takes a [CallerMemberName] parameter, then apply the attribute somewhere without specifying the parameter?

If you place the attribute on a parameter, return value, or method, it would make sense for the compiler to pass the name of the method that the attribute was applied to.  If you apply the attribute to a type, it might make sense to pass the name of that type.  However, there is no obvious choice for attributes applied to a module or assembly.

I suspect that they instead chose to not pass these caller info in default parameters for attribute declarations, and to instead pass the parameters’ declared default values.  If so, it would make sense to disallow caller info attributes in attribute constructor parameters.  However, this would also prevent them from being used for attributes that are explicitly instantiated in normal code (eg, for global filters in MVC).

Next Time: Caller Info Attributes vs. Stack Walking

Subtleties of C# 5’s new [CallerLineNumber]

UPDATE: Now that the Visual Studio 11 beta has shipped with this feature implemented, I wrote a separate blog post exploring how it actually behaves in these corner cases.

This is part 2 in a series about C# 5’s new caller info attributes; see the introduction.

The [CallerLineNumber] attribute tells the compiler to use the line number of the call site instead of the parameter’s default value.  This attribute has more corner cases than [CallerFileName].  In particular, unlike the C preprocessor’s __LINE__ macro, the C# compiler inserts the line number of a parsed method call.  Therefore, it is not always clear which line a method call expression maps to.

What should this call print:

static class Utils {
    int GetLine([CallerLineNumber] int line = 0) {
        return line;
    }
}

Console.WriteLine(
    Utils
        .
        GetLine
        (
        )
    );

Should it print the line number that the statement started? The line in which the call to GetLine started? The line containing the parentheses for GetLine? What if it’s in a multi-line lambda expression?

There are also a few cases in which methods are called implicitly by the compiler without appearing in source code.  What should this code print?

class Funny {
    public Funny Select(Func<object, object> d,
                        [CallerLineNumber]int line = 0) {
        Console.WriteLine(line + ": " + d(d));
        return this;
    }
}

var r = (from x in new Funny() 
         let y = 1 
         select 2);

This code contains two implicit calls to the Select method that don’t have a clear source line (it gets worse for more complicated LINQ queries)

In fact, it is possible to have an implicit method call with no corresponding source code at all.

Consider this code:

class Loggable {
    public Loggable([CallerLineNumber] int line = 0) { }
}
class SomeClass : Loggable { }

The compiler will implicitly generate a constructor for SomeClass that calls the base Loggable constructor with its default parameter value.  What line should it pass? In fact, if SomeClass is a partial class that is defined in multiple files, it isn’t even clear what [CallerFileName] should pass.

Also, what should happen in the unlikely case that a [CallerLineNumber] method is called on line 3 billion (which would overflow an int)? (This would be easier to test on Roslyn with a fake stream source) Should it give an integer overflow compile-time error?  If [CallerLineNumber] also supports byte and short parameters, this scenario will be more likely to happen in practice.

Next time: [CallerMemberName]

Subtleties of the new Caller Info Attributes in C# 5

UPDATE: Now that the Visual Studio 11 beta has shipped with this feature implemented, I wrote a separate blog post exploring how it actually behaves in these corner cases.

C# 5 is all about asynchronous programming.  However, in additional to the new async features, the C# team managed to slip in a much simpler feature: Caller Info Attributes.

Since C#’s inception, developers have asked for __LINE__ and __FILE__ macros like those in C and C++.  Since C# intentionally does not support macros, these requests have not been answered.  Until now.

C# 5 adds these features using attributes and optional parameters.  As Anders Hejlsberg presented at //Build/, you can write

public static void Log(string message,
    [CallerFilePath] string file = "",
    [CallerLineNumber] int line = 0,
    [CallerMemberName] string member = "")
{
    Console.WriteLine("{0}:{1} – {2}: {3}", 
                      file, line, member, message);

}

If this method is called in C# 5 without specifying the optional parameters, the compiler will insert the file name, line number, and containing member name instead of the default values.  If it’s called in older languages that don’t support optional parameters, those languages will pass the default values, like any other optional method.

These features look trivial at first glance.  However, like most features, they are actually more complicated to design in a solid and robust fashion.  Here are some of the less obvious issues that the C# team needed to deal with when creating this feature:

Disclaimer: I am basing these posts entirely on  logical deduction.  I do not have access to a specification or implementation of this feature; all I know is what Anders announced in his //Build/ presentation.  However, the C# team would have needed to somehow deal with each of thee issues.  Since these attributes are not yet supported by any public CTP, I can’t test my assumptions

To start with, they needed to create new compiler errors if the attribute is applied to an incorrectly-typed parameter, or a non-optional parameter.  Creating compiler errors is expensive; they need to be documented, tested, and localized into every language supported by C#.

The attributes should perhaps support nullable types, or parameter types with custom implicit conversions to int or string.  (especially long or short)

If a method with these attributes is called in an expression tree literal (a lambda expression converted to an Expression<TDelegate>), the compiler would need to insert ConstantExpressions with the actual line number or other info to pass as the parameter.

In addition to being supported in methods, the attributes should also be supported on parameters  for delegate types, allowing you to write

delegate void LinePrinter([CallerLineNumber] int line = 0);

LinePrinter printer = Console.WriteLine;
printer();

Each of the individual attributes also has subtle issues.

[CallerFilePath] seems fairly simple.  Any function call must happen in source code, in a source file that has a path.  However, it needs to take into account #line directives, so that, for example, it will work as expected in Razor views.  I don’t know what it does inside #line hidden, in which there isn’t a source file.

Next time: [CallerLineNumber]

Using a default controller in ASP.Net MVC

One common question about ASP.Net MVC is how to make “default” controller.

Most websites will have a Home controller with actions like About, FAQ, Privacy, or similar pages.  Ordinarily, these actions can only be accessed through URLs like ~/Home/About.  Most people would prefer to put these URLs directly off the root: ~/About, etc.

Unfortunately, there is no obvious way to do that in ASP.Net MVC without making a separate route or controller for each action.

You cannot simply create a route matching "/{action}" and map it to the Home controller, since such a route would match any URL with exactly one term, including URLs meant for other controllers.  Since the routing engine is not aware of MVC actions, it doesn’t know that this route should only match actions that actually exist on the controller.

To make it work, we can add a custom route constraint that forces this route to only match URLs that correspond to actual methods on the controller.

To this end, I wrote an extension method that scans a controller for all action methods and adds a route that matches actions in that controller. The code is available at gist.github.com/1225676.  It can be used like this:

routes.MapDefaultController<Controllers.HomeController>();

This maps the route "/{action}/{id}" (with id optional) to all actions defined in HomeController.   Note that this code ignores custom ActionNameSelectorAttributes (The built-in [ActionName(…)] is supported).

For additional flexibility, you can also create custom routes that will only match actions in a specific controller.  This is useful if you have a single controller with a number of actions that has special route requirements that differ from the rest of your site.

For example:

routes.MapControllerActions<UsersController>(
    name: "User routes",
    url:  "{userName}/{action}"
    defaults: new { action = "Index" }
);

(Note that this example will also match URLs intended for other controllers with the same actions; plan your routes carefully)

XRegExp breaks jQuery Animations

XRegExp is an open source JavaScript library that provides an augmented, extensible, cross-browser implementation of regular expressions, including support for additional syntax, flags, and methods.

It’s used by the popular SyntaxHighlighter script, which is in turn used by many websites (including this blog) to display syntax-highlighted source code on the client.  Thus, XRegExp has a rather wide usage base.

However, XRegExp conflicts with jQuery.  In IE, any page that includes XRegExp and runs a numeric jQuery animation will result in “TypeError: Object doesn't support this property or method”.

Demo (only fails in IE)

This bug is caused by an XRegExp fix for an IE bug in which the exec method doesn't consistently return undefined for nonparticipating capturing groups.  The fix, on line 271, assumes that the parameter passed to exec is a string.  This behavior violates the ECMAScript standard (section 15.10.6.2), which states that the parameter to exec should be converted to a string before proceeding.

jQuery relies on this behavior in its animate method, which parses a number using a regex to get the decimal portion.  (source)

Thus, calling animate with a number after loading XRegExp will fail in IE when XRegExp tries to call slice on a number.

Fortunately, it is very simple to fix XRegExp to convert its argument to a string first:

if (XRegExp) {
    var xExec = RegExp.prototype.exec;
    RegExp.prototype.exec = function(str) {
        if (!str.slice) 
            str = String(str);
        return xExec.call(this, str);
    };
}
Here is an updated demo that uses this fix and works even in IE.

Clarifying Boolean Parameters, part 2

Part 1 is here

Some languages have better ways to pass boolean parameters.  C# 4.0, and all versions of VB, allow parameters to be passed by name.  This allows us to write much clearer code:

//C# 4.0:
UpdateLayout(doFullLayout: false) 
'VB.Net:
UpdateLayout(doFullLayout:=False) 

Without requiring any changes to the function definition, this makes the meaning of the true / false abundantly clear at the call-site.

Javascript offers another interesting alternative.  In Javascript, booleans conditions actually check for “truthyness”.  The statement if(x) will trigger  not just if x is true, but also if x is any “truthy” value, including any object, non-empty string, or non-zero number. Similarly, the expression !x will return false if x is “truthy” and true if x “falsy”.

This means that we can actually use any non-empty string instead of true in Javascript.  Note that this will only work if the function checks the value for “truthyness”; it won’t work for code like if (x === true).

Thus, instead of passing true as a boolean, you can pass a string that describes what you’re actually indicating.

For example:

function updatePosition(animate) {
    //Calculate position
    if (animate)
        //...
    else
        //...
}

$(window).resize(function() {
    updatePosition();
});

updatePosition("With animation");

Although this results in much more readable code, it can be difficult to understand for people who aren’t familiar with this trick.  If the meaning of the parameter changes, you’ll need to hunt down every place that the function is called and change the string to reflect the new meaning.

Finally, unlike an enum, this does not scale to multiple options.  If you need to have more than two options, you should use global variables or objects to simulate an enum, not strings.

Clarifying Boolean Parameters, part 1

Have you ever written code like this:

public void UpdateLayout(bool doFullLayout) {
    //Code
    if (doFullLayout) {
        //Expensive code
    }
    //More code
}

This pattern is commonly used when some operation has a “cheap” mode and an “expensive” mode.  Other code will have calls like UpdateLayout(false) and UpdateLayout(true) scattered throughout.

The problem is that this isn’t very obvious for people who aren’t familiar with the codebase.  If you take a look at a file you’ve never seen before and see calls like UpdateLayout(false) and UpdateLayout(true) scattered, you’ll have no idea what the true / false means.

The simplest solution is to break it out into two methods: UpdateComplexLayout() and UpdateBasicLayout().  However,  if the two different layout modes have intertwined code paths (eg, the code before and after the if above), this either won’t be possible or will lead to ugly duplication of code.

One alternative is to use enums:

public enum LayoutUpdateType {
    Basic,
    Full
}

public void UpdateLayout(LayoutUpdateType type) {
    //Code
    if (type == LayoutUpdateType.Full) {
        //Expensive code
    }
    //More code
}

This way, the callsites are much more descriptive: UpdateLayout(LayoutUpdateType.Full).  This also makes it easy to add more update modes in the future should the need arise.  However, it makes the callsites much more verbose.  When used frequently, this pattern can lead a vast proliferation of enum types that are each only used by one method, polluting the namespace and making more important enums harder to notice.

Next time: Cleverer alternatives

C# is not type-safe

C# is usually touted as a type-safe language.  However, it is not actually fully type-safe!

To examine this claim, we must first provide a strict definition of type-safety  Wikipedia says:

In computer science, type safety is the extent to which a programming language discourages or prevents type errors. A type error is erroneous or undesirable program behavior caused by a discrepancy between differing data types.

To translate this to C#, full type-safety means that any expression that compiles is guaranteed to work at runtime, without causing any invalid cast errors.

Obviously, the cast (and as) operator is an escape hatch from type safety.  It tells the compiler that “I expect this value to actually be of this type, even though you can’t prove it.  If I’m wrong, I’ll live with that”.  Therefore, to be fully type-safe, it must be impossible to get an InvalidCastException at runtime in C# code that does not contain an explicit cast.

Note that parsing or conversion errors (such as any exception from the Convert class) don’t count.  Parsing errors aren’t actually invalid cast errors (instead, they come from unexpected strings), and conversion errors from from cast operations inside the Convert class.  Also, null reference exceptions aren’t cast errors. 

So, why isn’t C# type-safe?

MSDN says that InvalidCastException is thrown in two conditions:

  • For a conversion from a Single or a Double to a Decimal, the source value is infinity, Not-a-Number (NaN), or too large to be represented as the destination type.

  • A failure occurs during an explicit reference conversion.

Both of these conditions can only occur from a cast operation, so it looks like C# is in fact type safe.

Or is it?

IEnumerable numbers = new int[] { 1, 2, 3 };

foreach(string x in numbers) 
    ;

This code compiles (!). Running it results in

InvalidCastException: Unable to cast object of type 'System.Int32' to type 'System.String'.

On the foreach line.

Since we don’t have any explicit cast operations (The implicit conversion from int[] to IEnumerable is an implicit conversion, which is guaranteed to succeed) , this proves that C# is not type-safe.

What happened?

The foreach construct comes from C# 1.0, before generics existed.  It worked with untyped collections such as ArrayList or IEnumerable.  Therefore, the IEnumerator.Current property that gets assigned to the loop variable would usually be of type object.   (In fact, the foreach statement is duck-typed to allow the enumerator to provide a typed Current property, particularly to avoid boxing). 

Therefore, you would expect that almost all (non-generic) foreach loops would need to have the loop variable declared as object, since that’s the compile-time type of the items in the collection.  Since that would be extremely annoying, the compiler allows you to use any type you want, and will implicitly cast the Current values to the type you declared.  Thus, mis-declaring the type results in an InvalidCastException.

Note that if the foreach type isn’t compatible at all with the type of the Current property, you will get a compile-time error (just like (string)42 doesn’t compile).  Therefore, if you stick with generic collections, you’re won’t get these runtime errors (unless you declare the foreach as a subtype of the item type).

C# also isn’t type-safe because of array covariance.

string[] strings = new string[1];
object[] arr = strings;
arr[0] = 7;

This code compiles, but throws “ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array.” at run-time.

As Eric Lippert explains, this feature was added in order to be more compatible with Java.

Delegates vs. Function Pointers, Addendum: Multicast Delegates

Until now, I've been focusing on only one of the differences between delegates and function pointers; namely, associated state.
Delegates have one other capability that function pointers do not.  A single function pointer can only point to one function.  .Net, on the other hand, supports multicast delegates – delegates that point to multiple functions.  You can combine two existing delegates using the + operator (or by calling Delegate.Combine) to create a single new delegate instance that points two all of the methods in the original two delegates.  This new delegate stores all of the methods from the original two delegates in a private array of delegates called InvocationList (the delegates in this array are ordinary non-multicast delegates that each only point to a single method). 

Note that delegates, like strings, are immutable.  Adding two delegates together creates a third delegate containing the methods from the first two; the original delegate instances are not affected.  For example, writing delegateField += SomeMethod creates a new delegate instance containing the methods originally in delegateField as well as SomeMethod, then stores this new instance in delegateField.

Similarly, the - operator (or Delegate.Remove) will remove the second operand from the first one (again, returning a new delegate instance).  If the second operand has multiple methods, all of them will be removed from the final delegate.  If some of the methods in the second operand appear multiple times in the original delegate, only the last occurrence of each one will be removed (the one most recently added).  The RemoveAll method will remove all occurrences.  If all of the methods were removed, it will return null; there is no such thing as an empty delegate instance.

Multicast delegates are not intended to be used with delegates that return values.  If you call a non-void delegate that contains multiple methods, it will return the return value of the last method in the delegate.  If you want to see the return values of all of the methods, you’ll need to loop over GetInvocationList() and call each delegate individually.

Multicast delegates also don’t play well with the new covariant and contravariant generic delegates in .Net 4.0.  You cannot combine two delegates unless their types match exactly, including variant generic parameters.

Function pointers cannot easily be combined the way multicast delegates can.  The only way to combine function pointers without cooperation from the code that calls the pointer is to make a function that uses a closure to call all of the function pointers you want to call.

In Javascript, that would look like this:

function combine() {
    var methods = arguments;

    return function() { 
        var retVal;
        for(var i = 0; i < methods.length; i++) 
            retVal = methods[i].apply(this, arguments);
        return retVal;
    };
}

Tracking Event Handler Registrations

When working with large .Net applications, it can be useful to find out where event handlers are being registered, especially in an unfamiliar codebase.

In simple cases, you can do this by right-clicking the event definition and clicking Find All References (Shift+F12).  This will show you every line of code that adds or removes a handler from the event by name.  For field-like (ordinary) events, this will also show you every line of code that raises the event.

However, this isn’t always good enough.  Sometimes, event handlers are not added by name.  The .Net data-binding infrastructure, as well as the CompositeUI Event Broker service, will add and remove event handlers using reflection, so they won’t be found by Find All References.  Similarly, if an event handler is added by an external DLL, Find All References won’t find it.

For these scenarios, you can use a less-obvious trick.  As I described last time, adding or removing an event handler actually executes code inside of an accessor method. Like any other code, we can set a breakpoint to see where the code is executed.

For custom events, this is easy.  Just add a breakpoint in the add and/or remove accessors and run your program.  Whenever a handler is added or removed, the debugger will break into the accessor, and you can look at the callstack to determine where it’s coming from.

However, most events are field-like, and don’t have actual source code in their accessor methods.  To set a breakpoint in a field-like event, you need to use a lesser-known feature: function breakpoints (Unfortunately, this feature is not available in Visual Studio Express).  You can click Debug, New Breakpoint, Break at Function (Ctrl+D, N) to tell the debugger to pause whenever a specific managed function is executed.

To add a breakpoint at an event accessor, type Namespace.ClassName.add_EventName.  To ensure that you entered it correctly, open the Debug, Breakpoints window (Ctrl+D, B) and check that the new breakpoint says break always (currently 0) in the Hit Count column.  If it doesn’t say (currently 0), then either the assembly has not been loaded yet or you made a typo in the location (right-click the breakpoint and click Location).

About .Net Events

A .Net event actually consists of a pair of accessor methods named add_EventName and remove_EventName.  These functions each take a handler delegate, and are expected to add or remove that delegate from the list of event handlers. 

In C#, writing public event EventHandler EventName; creates a field-like event.  The compiler will automatically generate a private backing field (also a delegate), along with thread-safe accessor methods that add and remove handlers from the backing field (like an auto-implemented property).  Within the class that declared the event, EventName refers to this private backing field.  Thus, writing EventName(...) in the class calls this field and raises the event (if no handlers have been added, the field will be null).

You can also write custom event accessors to gain full control over how handlers are added to your events.   For example, this event will store and trigger handlers in reverse order:

void Main()
{
    ReversedEvent += delegate { Console.WriteLine(1); };
    ReversedEvent += delegate { Console.WriteLine(2); };
    ReversedEvent += delegate { Console.WriteLine(3); };

    OnReversedEvent();
}

protected void OnReversedEvent() {
    if (reversedEvent != null)
        reversedEvent(this, EventArgs.Empty);
}

private EventHandler reversedEvent;
public event EventHandler ReversedEvent {
    add {
        reversedEvent = value + reversedEvent;
    }
    remove {
        reversedEvent -= value;
    }
}

This add accessor uses the non-commutative delegate addition operator to prepend each new handler to the delegate field containing the existing handlers.  The raiser method simply calls the combined delegate in the private field. (which is null if there aren’t any handlers)

Note that this code is not thread-safe.  If two threads add a handler at the same time, both of them will read the original storage field, add their respective handlers to create a new delegate instance, then write this new delegate back to the field.  The thread that writes back to the field last will overwrite the changes made by the other thread, since it never saw the other thread’s handler (this is the same reason that x += y is not thread-safe).  The accessors generated by the compiler are threadsafe, either by using lock(this) (C# 3 or earlier) or a lock-free threadsafe implementation (C# 4).  For more details, see this series of blog posts.

This example is rather useless.  However, there are better reasons to create custom event accessors. WinForms controls store their events in a special EventHandlerList class to save memory.  WPF controls create events using the Routed Event system, and store handlers in special storage in DependencyObject.  Custom event accessors can also be used to perform validation or logging.