Showing posts with label closures. Show all posts
Showing posts with label closures. Show all posts

Delegates vs. Function Pointers, part 5: Javascript

This is part 5 in a series about state and function pointers; part 1 is here.

Last time, we saw how C# 2 supports closures by compiling anonymous functions into member functions of a special class that holds local state from the outer function. 

Unlike the languages we’ve looked at before, Javascript has had closures baked in to the languages since its inception.  My standard example can be achieved very simply in Javascript:

var x = 2;
var numbers = [ 1, 2, 3, 4 ];
var hugeNumbers = numbers.filter(function(n) { return n > x; });

This code uses the Array.filter method, new to Javascript 1.6, to create a new array with those elements from the first array that pass a callback.  The function expression passed to filter captures the x variable for use inside the callback.

This looks extremely similar to the C# 2.0 version from last time.  However. under the covers, it’s rather different.

Like .Net managed instance methods, all Javascript functions take a hidden this parameter.  However, unlike .Net, Javascript does not have delegates.  There is no (intrinsic) way to bind an object to the this parameter the way a .Net closed delegate does.  Instead, the this parameter comes from the callsite, depending on how the function was called.  Therefore, we cannot pass state in the this parameter the way we did in C#.

Instead, all Javascript function expressions capture the variable environment of the scope that they are declared in as a hidden property of the function.  Therefore, a function can reference local variables from its declaring scope.  Unlike C#, which binds functions to their parent scopes using a field in a separate delegate object that points to the function, Javascript functions have their parent scopes baked in to the functions themselves. 

Javascript doesn’t have separate delegate objects that can hold a function and a this parameter.  Instead, the value of the this parameter is determined at the call-site, depending on how the function was called.  This is a common source of confusion to inexperienced Javascript developers.

To simulate closed delegates, we can make a method that takes a function as well as a target object to call it on, and returns a new function which calls the original function with this equal to the target parameter.  That sounds overwhelmingly complicated, but it’s actually not that hard:

function createDelegate(func, target) {
    return function() { 
        return func.apply(target, arguments);
    };
}

var myObject = { name: "Target!"};
function myMethod() {
    return this.name;
}

var delegate = createDelegate(myMethod, myObject);
alert(delegate());

This createDelegate method returns a function expression that captures the func and target parameters, and calls func in the context of target.  Instead of storing the target in a property of a Delegate object (like .Net does), this code stores it in the inner function expression’s closure.

Javascript 1.8.5 provides the Function.bind method, which is equivalent to this createDelegate method, with additional capabilities as well.  In Chrome, Firefox 4, and IE9, you can write

var myObject = { name: "Target!"};
function myMethod() {
    return this.name;
}

var delegate = myMethod.bind(myObject);
alert(delegate());
For more information, see the MDN documentation.

Delegates vs. Function Pointers, part 4: C# 2.0+

This is part 4 in a series about state and function pointers; part 1 is here.

Last time, we saw that it is possible to pass local state with a delegate in C#.  However, it involves lots of repetitive single-use classes, leading to ugly code.

To alleviate this tedious task, C# 2 supports anonymous methods, which allow you to embed a function inside another function.  This makes my standard example much simpler:

//C# 2.0
int x = 2;
int[] numbers = { 1, 2, 3, 4 };

int[] hugeNumbers = Array.FindAll(
    numbers, 
    delegate(int n) { return n > x; }
);



//C# 3.0
int x = 2;
int[] numbers = { 1, 2, 3, 4 };

IEnumerable<int> hugeNumbers = numbers.Where(n => n > x);

Clearly, this is much simpler than the C# 1.0 version from last time.  However, anonymous methods and lambda expressions are compile-time features; the CLR itself is not aware of them.  How does this code work? How can an anonymous method use a local variable from its parent scope?

This is an example of a closure – a function bundled together with external variables that the function uses.  The C# compiler handles this the same way that I did manually last time in C# 1: it generates a class to hold the function and the variables that it uses, then creates a delegate from the member function in the class.  Thus, the local state is passed as the delegate’s this parameter.

To see how the C# compiler implements closures, I’ll use ILSpy to decompile the more-familiar C# 3 version: (I simplified the compiler-generated names for readability)

[CompilerGenerated]
private sealed class ClosureClass {
    public int x;
    public bool Lambda(int n) {
        return n > this.x;
    }
}
private static void Main() {
    ClosureClass closure = new ClosureClass();
    closure.x = 2;
    int[] numbers = { 1, 2, 3, 4 };
    IEnumerable<int> hugeNumbers = numbers.Where(closure.Lambda);
}

The ClosureClass (which was actually named <>c__DisplayClass1) is equivalent to the GreaterThan class from my previous example.  It holds the local variables used in the lambda expression.  Note that this class replaces the variables – in the original method, instead a local variable named x, the compiler uses the public x field from the ClosureClass.  This means that any changes to the variable affect the lambda expression as well.

The lambda expression is compiled into the Lambda method (which was originally named <Main>b__0).  It uses the same field to access the local variable, sharing state between the original outer function and its lambda expression.

Next time: Javascript

Delegates vs. Function Pointers, part 2: C

This is part 2 in a series about state and function pointers; part 1 is here.

Unlike most other languages, it is not possible to include any form of state in a function pointer in C.  Therefore, it is impossible to fully implement closures in C without the cooperation of the call-site and/or the compiler.

To illustrate what this means in practice, I will refer to my standard example, using a filter function to find all elements in an array that are greater than x.  Since C doesn’t have any such function, I’ll write one myself, inspired by qsort:

void* filter(void* base, 
             size_t count, 
             size_t elemSize, 
             int (*predicate)(const void *)) {
    //Magic...
}

With this function, one can easily find all elements that are greater than some constant:

int predicate(const void* item) {
        return *(int*)item > 2;
}

int arr[] = { 1, 2, 3, 4 };
filter(arr, 4, sizeof(int), &predicate);

However, if we want to replace the hard-coded 2 with a local variable, it becomes more complicated.  Since function pointers, unlike delegates, do not have state, there is no simple way to pass the local variable to the function.

The most obvious answer is to use a global variable:

static int minimum;
int predicate(const void* item) {
        return *(int*)item > minimum;
}

int arr[] = { 1, 2, 3, 4 };
minimum = x;
filter(arr, 4, sizeof(int), &predicate);

However, this code will fail horribly if it’s ever run on multiple threads.  In simple cases, you can work around that by storing the value in a global thread-local variable (or in fiber-local storage for code that uses fibers).   Because each thread will have its own global, the threads won’t conflict with each-other.  Obviously, this won’t work if the callback might be run on a different thread.

However, even if everything is running on one thread, this still isn’t enough if the code can be re-entrant, or if the callback might be called after the original function call finishes.  This technique assumes that exactly one callback (from a single call to the function that accepted the callback) will be used at any given time.  If the callback might be invoked later (eg, a timer, or a UI handler), this method will break down, since all of the callbacks will share the same global.

In these more complicated cases, one can only pass any state with the cooperation of the original function.  In this (overly simple) example, the filter method can be modified to take a context parameter and pass it to the predicate:

void* filter(void* base, 
             size_t count,
             size_t elemSize, 
             int (*predicate)(const void *, const void *), 
             const void* context) {
        //More magic...
}
int predicate(const void* item, const void* context) {
        return *(int*)item > *(int*)context;
}

int arr[] = { 1, 2, 3, 4 };
filter(arr, 4, sizeof(int), &predicate, &x);

To pass around more complicated state, one could pass a pointer to a struct as the context.

Next Time: C# 1.0

Delegates vs. Function Pointers, part 1

Most languages – with the unfortunate exception of Java – allow functions to be passed around as variables.  C has function pointers, .Net has delegates, and Javascript and most functional programming languages treat functions as first class objects.

There is a fundamental difference between C-style function pointers vs. delegates or function objects.  Pure function pointers cannot hold any state other than the function itself.  In contrast, delegates and function objects do store additional state that the function can use.

To illustrate this difference,  I will use a simple example.  Most programming environments have a filter function that takes a collection and a predicate (a function that takes an item from the collection and returns a boolean), and returns a new collection with those items from the original collection that match the predicate.  In C#, this is the LINQ Where method; in Javascript, it’s the Array.filter method (introduced by Javascript 1.6).

Given a list of numbers, you can use such a method to find all numbers in the list that are greater than a local variable x.  However, in order to do that, the predicate have access to x.  In subsequent posts, I’ll explore how this can be done in different languages.