My Pages

Wednesday, November 24, 2010

Dynamic versus Static Scoping

Dynamic and static scoping both have specific places in programming. We're going to discuss what the difference is and why the differences can make a difference in your program.

What is scoping
When we are talking about scoping, we are mainly concerned about the bindings of variables. In other words, where they have been defined. In some languages such as Perl there is no actual need to define a variable but that is a discussion for a different day. Regardless, bindings of variables show that they are variables and of what type they are. Below you can see the bindings of the variable x and variable y as being an integer and float respectively.

int x;
float y;

x = 10;
y = 3.14;


When we begin talking about scoping, we are specifically talking about scoping within higher order functions, closures, and Lambda Functions. Why? Because this is where scoping really matters within the variables. For example, in a language such as C, there is no purpose to talk about static versus dynamic scoping because everything is static. It will make more sense once we begin talking about it. Essentially variable bindings decide what variable a specific variable is talking about. For example, we KNOW that x is referring to int x.

Static Scoping
This is the scoping style that most people are used to and is the standard. So let's look at an explanation; essentially, the variable that is free (so during a closure or higher order function) is bound to the closest variable when the function is defined. An example is shown below.

var x=20;
var newfunc = f();
newfunc();

function f() {
  var x = 10;
  return function s() { print x; };
}


So, what prints out? You may or may not be surprised to find out that "10" will be printed out. Why? Because the x that is bound is the x from within f(); this is because the variable is bound statically and usually at compile time.

Dynamic Scoping
This is the scoping style that most people are not used to and can sometimes provide really strange results during programming. So let's look at an explanation; essentially, the variable that is free (so during a closure or higher order function) is bound to the closest variable in the activation records during run time. So what does this mean? The variable is bound to whatever the closest variable name is when the function is executed.

function d() {
  var x=20;
  var newfunc = f();
  newfunc();
}

var x=30;
d();

function f() {
  var x = 10;
  return function s() { print x; };
}


In the example shown above, one may be surprised to see that the output would be "20"; but why? the reason is quite surprising. The x is a free variable that must be bound at runtime; so when we actually execute newfunc() we look back through the activation records until we find a variable x that can be used. So we look at the AR for newfunc but find no x. We then check the AR for d and find that there is a local variable x defined as 20. This becomes the binding for our x. Now below you can see one more example of this.

function d() {
  var newfunc = f();
  newfunc();
}

function n() {
  var x=30;
  d();
}

n();

function f() {
  var x = 10;
  return function s() { print x; };
}


In this instance, the output changes again; now we see "30" as the output. Why? Let's step through it. We execute newfunc() and we see the free variable x that we need to bind. We then check the AR of newfunc but find no definition of x. We then look in the AR of d. Again, we see no definition of x so we continue up the AR's to n's definition. Here we now find an x which we then use which was defined as "30."

Let's get an example of both in Perl; static scoping is done via the my operator and the local operator provides dynamic scoping. Please note that by using the local operator, "use strict" will cause an error with using the variable and will say that the variable is not declared. Let's look at both examples.

Static
The output is "10", again, notice that the $x is bound to 10 from when the Lambda function is created.

sub f() {
  my $x = 10;
  return sub { print $x . "\n"; };
}

my $func = f();
my $x = 20;
&{$func}();


Dynamic
The output is "20", again, notice that the $x is bound to 20 from when the Lambda function is actually executed.

sub f() {
  local $x = 10;
  return sub { print $x . "\n"; };
}

my $func = f();
local $x = 20;
&{$func}();


Why is this useful?
Well... Let's be honest, essentially it's like passing a variable to a function. However, you could set it up so that you could dynamically change a function's internal variable if all variables were free dynamically scoped variables. This would allow for other functions to change the meanings of the functions without having to pass all the variables in. In my experience this is really not that useful. But knowing what the difference is and knowing how to tell which mode a variable is assigned to (in languages like Lisp and Perl that support both) is important. Or if you end up having to maintain a piece of code that uses dynamic scoping, you will know what exactly it means and the side effects of them.

Higher Order Functions (in Perl)

What is a higher order function and why exactly would I use one? Well let's look first at what a higher order function is.

What is a higher order function?


There are many languages that support higher order functions (2nd order, 3rd order etc...) and many of them are pretty straight forward. Languages such as Ruby, Lisp, Scala, and Javascript fully support higher order functions. In languages such as Lisp they are normally referred to as Lambda functions. Now that we've been talking tons about different names and who supports higher order functions; let's look at the original question. What is a higher order function?

A higher order function is a function that does processing on another function. To be exact, depending on the order of the function will depend on whether IT is calling or returning a function. We'll look at both of these examples later, but for right now just know that there are many different orders of functions. Let's look at an example of a higher order function (NOTE: This is not code from an actual language but resembles Javascript)

Returning a function
function f() {
  return function (y) {
    return y * y;
  }
}
var newfunc = f();
var x = newfunc(10);




Passing a function to a function
function f(x) {
  return x(10);
}

function t(n) { return n * n; }

f(t);
f(function (n) { return n * n;});


Let's take apart both of these examples, first we see another function defined within f() without a name (this is known as a Lambda or anonymous function). The function itself does a very simple y * y calculation where y is defined as a parameter that the function takes when being executed. As you can see by the next portion, we execute this by running f() which returns the newly created function which we can then execute at any other time.

Looking at the second example, we can see that f(x) is defined as x(10). This means that we are assuming that x will be a function which takes at least one argument. Continuing in the code where can see where we created another function t(n) which takes a single argument and return n * n. As we can see, we execute f() by passing t as the function to execute. The last portion shows that we can actually pass a Lambda function to f(x) denoting that we do not actually have to have a full function to execute it.

Now, in both of these examples we've looked at 2nd order functions; this means that the functions we are returning and the functions we are accepting are 1st order functions themselves. 1st order functions do not accept or return any other functions, these are just your normal everyday functions. So what would it mean to have a 3rd order function? This would be a function that either accepts a function that accepts/returns a function or a function that returns a function that accepts/returns a function.

Examples in Perl
The first question you might be asking is why am I explaining this in Perl? Well mainly because this is what I program in on a day to day basis right now and as such I felt that it would be best in a specific project to create higher order functions which forced me to figure out how to do them. I couldn't find a whole lot of information online about creating them which is why I am writing this now.

So the example that we have below is a simple function that returns a function and is called immediately afterwards. The key is the dereferencing of the CODE() object which is what the return value is of function f().

sub f() {
  return sub { my $x = shift; return $x * $x; }
}
my $duplicate = f();
print &{$duplicate}(10) . "\n";


As shown above, we can see that the Lambda takes a single argument and multiplies itself by itself. We store that code piece in the variable $duplicate then dereferences and executes the code "&{...}" We're going to look at another example where we have a hash and for one element of the hash we store the new function.

my %tvar = {};
$tvar{'newfunc'} = f();
print $tvar{'newfunc'}(10) . "\n";


In the example shown below, we have a second order function accepting a function.

f2($) {
  print $_[0](10) . "\n";
}
f2(f());
my $newfunc = f();
f2($newfunc);


Notice that f2 takes a single argument. This argument is then treated as a function and is executed. We don't need to dereference when we call the execution of the parameter. The important thing to note here is that we can take the function that is passed to f2 and call it at some other point in the future.

Quick overview of Closures
Since Lambda functions are so closely intertwined with Closures; I'll give a quick overview of what a Closure is. This should give you a much better view of what you can really do with higher order functions. Closures are really simple in theory, a function that is considered a Lambda function can reference variables contained in the containing function. (Again, we are using a fake language resembling Javascript).

function f() {
  var x = 10;
  return function (y) { x *= y; return x; };
}
var newfun = f();
print newfun(10) . "\n";
print newfun(10) . "\n";
print newfun(10) . "\n";


Notice that x is defined within f() and is used within the returned Lambda function. This works because although x will be out of scope once f() completes; this is called a closure where the Lamdba function "closes over" the variable x within its scope with the variable x that is out of scope. Now I am not going to go into dynamic versus static binding but that might be something to look at on another post. Essentially, x lives on within the Lambda function and the only thing that will be able to access it will be the Lambda call.

So what happens when this runs? We end up with the output "100", "1000", "10000"; why? Because we can continue to operate on x as if it were a real variable that we had originally declared. How is this useful? Imagine if f() took a parameter that set x. Now we can set the starting position of x for the returned function. This allows us to limit the number of times that we would need to create a function explained next.

Why is all of this useful?
In the previous example, if we had added that parameter to f(); we can now create tons of functions by just changing the parameter being sent to f(). Let's look for example at the example below where we define f().

function f(x) {
  return function (y) { x *= y; return x; };
}
var start10 = f(10);
var start20 = f(20);
var start30 = f(30);

print start10(10) . "\n";
print start20(20) . "\n";
print start10(10) . "\n";
print start30(30) . "\n";
print start10(10) . "\n";


Notice that we just created 3 different functions where x {10,20,30}. Also note that the results of running the code above is "100","400","1000","900","10000". See how the x is its own in each new function, this gives us the ability to create many different functions by creating a higher order function which can generate our other functions.