Delegates vs. Function Pointers, part 4: C# 2.0+

This is part 4 in a series about state and function pointers; part 1 is here.

Last time, we saw that it is possible to pass local state with a delegate in C#.  However, it involves lots of repetitive single-use classes, leading to ugly code.

To alleviate this tedious task, C# 2 supports anonymous methods, which allow you to embed a function inside another function.  This makes my standard example much simpler:

//C# 2.0
int x = 2;
int[] numbers = { 1, 2, 3, 4 };

int[] hugeNumbers = Array.FindAll(
    numbers, 
    delegate(int n) { return n > x; }
);



//C# 3.0
int x = 2;
int[] numbers = { 1, 2, 3, 4 };

IEnumerable<int> hugeNumbers = numbers.Where(n => n > x);

Clearly, this is much simpler than the C# 1.0 version from last time.  However, anonymous methods and lambda expressions are compile-time features; the CLR itself is not aware of them.  How does this code work? How can an anonymous method use a local variable from its parent scope?

This is an example of a closure – a function bundled together with external variables that the function uses.  The C# compiler handles this the same way that I did manually last time in C# 1: it generates a class to hold the function and the variables that it uses, then creates a delegate from the member function in the class.  Thus, the local state is passed as the delegate’s this parameter.

To see how the C# compiler implements closures, I’ll use ILSpy to decompile the more-familiar C# 3 version: (I simplified the compiler-generated names for readability)

[CompilerGenerated]
private sealed class ClosureClass {
    public int x;
    public bool Lambda(int n) {
        return n > this.x;
    }
}
private static void Main() {
    ClosureClass closure = new ClosureClass();
    closure.x = 2;
    int[] numbers = { 1, 2, 3, 4 };
    IEnumerable<int> hugeNumbers = numbers.Where(closure.Lambda);
}

The ClosureClass (which was actually named <>c__DisplayClass1) is equivalent to the GreaterThan class from my previous example.  It holds the local variables used in the lambda expression.  Note that this class replaces the variables – in the original method, instead a local variable named x, the compiler uses the public x field from the ClosureClass.  This means that any changes to the variable affect the lambda expression as well.

The lambda expression is compiled into the Lambda method (which was originally named <Main>b__0).  It uses the same field to access the local variable, sharing state between the original outer function and its lambda expression.

Next time: Javascript

Open Delegates vs. Closed Delegates

.Net supports two kinds of delegates: Open delegates and closed delegates.

When you create a delegate that points to an instance method, the instance that you created it from is stored in the delegate’s Target property.  This property is passed as the first parameter to the method that the delegate points to.  For instance methods, this is the implicit this parameter; for static methods, it's the method's first parameter.  These are called closed delegates, because they close over the first parameter and bring it into the delegate instance.

It is also possible to create open delegates which do not pass a first parameter.  Open delegates do not use the Target property; instead, all of the target method’s parameters are passed from the delegate’s formal parameter list, including the first parameter.  Therefore, an open delegate pointing to a given method must have one parameter more than a closed delegate pointing to the same method.  Open delegates are usually used to point to static methods.  When you make a delegate pointing to a static method, you (generally) don't want the delegate to hold a first parameter for the method. 

In addition to these two normal cases, it is also possible (in .Net 2.0 and later) to create open delegates for instance methods and to create closed delegates for static methods.  With one exception, C# doesn’t have any syntactical support for these unusual delegates, so they can only be created by calling the CreateDelegate method.

Open delegates by calling the CreateDelegate overload that doesn’t take a target parameter.  Before .Net 2.0, this function could only be called with a static method.   In .Net 2.0, you can call this function with an instance method to create an open delegate.  Such a delegate will call use its first parameter as this instead of its Target field. 

As a concrete example, consider the String.ToUpperInvariant() method.  Ordinarily, this method takes no parameters, and operates on the string it’s called on.  An open delegate pointing to this instance method would take a single string parameter, and call the method on that parameter.

For example:

Func<string> closed = new Func<string>("a".ToUpperInvariant);
Func<string, string> open = (Func<string, string>)
    Delegate.CreateDelegate(
        typeof(Func<string, string>),
        typeof(string).GetMethod("ToUpperInvariant")
    );

closed();     //Returns "A"
open("abc");  //Returns "ABC"

Closed delegates are created by calling the CreateDelegate overload that takes a target parameter.  In .Net 2.0, this can be called with a static method and an instance of that method’s first argument type to create a closed delegate that calls the method with the given target as its first parameter.  Closed delegates curry the first parameter from the target method.  For example:

Func<object, bool> deepThought = (Func<object, bool>)
    Delegate.CreateDelegate(
        typeof(Func<object, bool>),
        2,
        typeof(object).GetMethod("Equals", BindingFlags.Static | BindingFlags.Public)
    );

This code curries the static Object.Equals method to create a delegate that calls Equals with 2 and the delegate’s single parameter).  It’s equivalent to x => Object.Equals(2, x).  Note that since the method is (generally) not a member of the target object’s type, we need to pass an actual MethodInfo instance; a name alone isn’t good enough.

Note that you cannot create a closed delegate from a static method whose first parameter is a value type, because, unlike all instance methods, static delegates that take value types receive their parameters by value, not as a reference.   For more details, see here and here.

C# 3 added limited syntactical support for creating closed delegates.  You can create a delegate from an extension method as if it were an instance method on the type it extends.  For example:

var allNumbers = Enumerable.Range(1, Int32.MaxValue);
Func<int, IEnumerable<int>> countTo = allNumbers.Take;

This code creates an IEnumerable<int> containing all positive integers, then creates a closed delegate that curries this sequence into the static Enumerable.Take<T>(IEnumerable<T>) method.

Except for extension methods, open instance delegates and closed static delegates are rarely used in actual code.  However, it is important to understand how ordinary open and closed delegates work, and where the target object ends up for instance delegates.

Delegates vs. Function Pointers, part 3: C# 1.0

This is part 3 in a series about state and function pointers; part 1 is here.

Last time, we saw that it is impossible to bundle context along with a function pointer in C.

In C#, it is possible to fully achieve my standard example.  In order to explain how this works behind the scenes, I will limit this post to C# 1.0 and not use a lambda expression.  This also means no LINQ, generics, or extension methods, so I will, once again, need to write the filter method myself.

delegate bool IntFilter(int num);

static ArrayList Filter(IEnumerable source, IntFilter filter) {
    ArrayList retVal = new ArrayList();

    foreach(int i in source)
        if (filter(i))
            retVal.Add(i);
            
    return retVal;
}

class GreaterThan {
    public GreaterThan(int b) {
        this.bound = b;
    }
    int bound;
    public bool Passes(int num) {
        return num > bound;
    }
}

int x = 2;
int[] numbers = { 1, 2, 3, 4 };
Filter(numbers, new IntFilter(new GreaterThan(x).Passes));

Please excuse the ugly code; in order to be true to C# 1.0, I can’t just write an iterator, and I can’t implicitly create the delegate.

This code creates a class called GreaterThan that holds the Passes method and the state used by the method.  I create a delegate to pass to Filter out of the Passes method for an instance of the GreaterThan class from the local variable.

To understand how this works, we need to delve further into delegates.  .Net delegates are more than just type-safe function pointers. Unlike the function pointers we've looked at, delegates include state – the Target property. This property stores the object to pass as the hidden this parameter to the method (It's actually somewhat more complicated than that; I will describe open and closed delegates at a later date).  My example creates a delegate instance in which the Target property points to the GreaterThan instance.  When I call the delegate, the Passes method is called on the correct instance from the Target property, so it can read the bound field.

Next time, we'll see how the C# 2.0+ compilers generate all this boilerplate for you using anonymous methods and lambda expressions.

Delegates vs. Function Pointers, part 2: C

This is part 2 in a series about state and function pointers; part 1 is here.

Unlike most other languages, it is not possible to include any form of state in a function pointer in C.  Therefore, it is impossible to fully implement closures in C without the cooperation of the call-site and/or the compiler.

To illustrate what this means in practice, I will refer to my standard example, using a filter function to find all elements in an array that are greater than x.  Since C doesn’t have any such function, I’ll write one myself, inspired by qsort:

void* filter(void* base, 
             size_t count, 
             size_t elemSize, 
             int (*predicate)(const void *)) {
    //Magic...
}

With this function, one can easily find all elements that are greater than some constant:

int predicate(const void* item) {
        return *(int*)item > 2;
}

int arr[] = { 1, 2, 3, 4 };
filter(arr, 4, sizeof(int), &predicate);

However, if we want to replace the hard-coded 2 with a local variable, it becomes more complicated.  Since function pointers, unlike delegates, do not have state, there is no simple way to pass the local variable to the function.

The most obvious answer is to use a global variable:

static int minimum;
int predicate(const void* item) {
        return *(int*)item > minimum;
}

int arr[] = { 1, 2, 3, 4 };
minimum = x;
filter(arr, 4, sizeof(int), &predicate);

However, this code will fail horribly if it’s ever run on multiple threads.  In simple cases, you can work around that by storing the value in a global thread-local variable (or in fiber-local storage for code that uses fibers).   Because each thread will have its own global, the threads won’t conflict with each-other.  Obviously, this won’t work if the callback might be run on a different thread.

However, even if everything is running on one thread, this still isn’t enough if the code can be re-entrant, or if the callback might be called after the original function call finishes.  This technique assumes that exactly one callback (from a single call to the function that accepted the callback) will be used at any given time.  If the callback might be invoked later (eg, a timer, or a UI handler), this method will break down, since all of the callbacks will share the same global.

In these more complicated cases, one can only pass any state with the cooperation of the original function.  In this (overly simple) example, the filter method can be modified to take a context parameter and pass it to the predicate:

void* filter(void* base, 
             size_t count,
             size_t elemSize, 
             int (*predicate)(const void *, const void *), 
             const void* context) {
        //More magic...
}
int predicate(const void* item, const void* context) {
        return *(int*)item > *(int*)context;
}

int arr[] = { 1, 2, 3, 4 };
filter(arr, 4, sizeof(int), &predicate, &x);

To pass around more complicated state, one could pass a pointer to a struct as the context.

Next Time: C# 1.0

Delegates vs. Function Pointers, part 1

Most languages – with the unfortunate exception of Java – allow functions to be passed around as variables.  C has function pointers, .Net has delegates, and Javascript and most functional programming languages treat functions as first class objects.

There is a fundamental difference between C-style function pointers vs. delegates or function objects.  Pure function pointers cannot hold any state other than the function itself.  In contrast, delegates and function objects do store additional state that the function can use.

To illustrate this difference,  I will use a simple example.  Most programming environments have a filter function that takes a collection and a predicate (a function that takes an item from the collection and returns a boolean), and returns a new collection with those items from the original collection that match the predicate.  In C#, this is the LINQ Where method; in Javascript, it’s the Array.filter method (introduced by Javascript 1.6).

Given a list of numbers, you can use such a method to find all numbers in the list that are greater than a local variable x.  However, in order to do that, the predicate have access to x.  In subsequent posts, I’ll explore how this can be done in different languages.