Thursday, May 11, 2006

OK, back to some geek stuff.

You all know I'm into Lexical Closures, right?

So I've been messing with this kind of stuff for a while in Python.


def f(x) :
def g(y) :
return x * y
return g

h = f(2)
h2 = f(3)

print h(5)
print h2(5)


In the above example, the variables h and h2 are set to two closures. Essentially copies of the function g + a context which binds the value of the variable x. In the first case, x is bound to 2, in the second 3. And so calling h(5) and h2(5) give the results 10 and 15 respectively.

This is the sort of thing that makes Python so cool of course. Although it's been around since Scheme.

But then I saw a reference to lexical closures in C using the Gnu GCC compiler. That didn't sound right, I thought. But the article, although not offering a version of the above, did suggest functions could be defined inside others and inherit the values of variables in the outer function's scope.

Here's the (more or less) direct translation of the above program into C.


typedef int (*FPTR)(int x);

FPTR f(int x) {
int g(int y) {
return x * y;
}
return &g;
}

int main(int argc, char *argv[]) {
FPTR h, h2;
h = f(2);
printf("%d\n",h(5));
h2 = f(3);
printf("%d\n",h2(5));
system("PAUSE");
return 0;
}


To my astonishment, this compiles. And seems willing to do what I hoped it would ... except the answer comes out wrong. Both h(5) and h2(5) produce 25. What's up?

Well, 25 looks suspiciously like 5 * 5, and after testing with a couple of other arguments for h it becomes clear that when the inner-function g is run, both y and x are bound to the argument which is passed as y.

Hmm ... so presumably something rather simple is going on in C. Such as names are really being bound with "the first argument off the stack" or something. A slight modification to f shows that I'm on the right track.


FPTR f(int x) {
int t;
t = x;
int g(int y) {
return t * y;
}
return &g;
}


Now I copy the argument for f into a local variable called t. And it works as one would hope : h(5) -> 10 and h2(5) -> 15.

At this point I'm very excited. I know this isn't Ansi C but it's still very cool to have this kind of thing in C.

Unfortunately, the next experiment dashes my hopes.

After having called f for the second time (with the argument 3) does the closure assigned to h (where x was assigned 2) still exist, or will h(5) now also produce 15?

I try it, and the program crashes. Straight away. In fact, whatever I try, I don't seem to be able to call the same closure twice. Which makes the whole thing interesting but next to useless in practice. Ah well.

What does work, as expected, is making the local variable t static :


FPTR f(int x) {
static int t;
t = x;
int g(int y) {
return t * y;
}
return &g;
}


Now this :


h = f(2);
printf("%d\n",h(5));
h2 = f(3);
printf("%d\n",h(5));


produces 10, 15 as you'd expect, even though we're now calling h a second time instead of h2. Remember in h, t is meant to be 2. But the call to f(3) has changed its value.

Fun stuff, anyway.

Update : this suggests that I can't really have the closures like I want :


If you try to call the nested function through its address after the containing function has exited, all hell will break loose. If you try to call it after a containing scope level has exited, and if it refers to some of the variables that are no longer in scope, you may be lucky, but it's not wise to take the risk. If, however, the nested function does not refer to anything that has gone out of scope, you should be safe.

No comments: