My Pages

Wednesday, November 24, 2010

Higher Order Functions (in Perl)

What is a higher order function and why exactly would I use one? Well let's look first at what a higher order function is.

What is a higher order function?


There are many languages that support higher order functions (2nd order, 3rd order etc...) and many of them are pretty straight forward. Languages such as Ruby, Lisp, Scala, and Javascript fully support higher order functions. In languages such as Lisp they are normally referred to as Lambda functions. Now that we've been talking tons about different names and who supports higher order functions; let's look at the original question. What is a higher order function?

A higher order function is a function that does processing on another function. To be exact, depending on the order of the function will depend on whether IT is calling or returning a function. We'll look at both of these examples later, but for right now just know that there are many different orders of functions. Let's look at an example of a higher order function (NOTE: This is not code from an actual language but resembles Javascript)

Returning a function
function f() {
  return function (y) {
    return y * y;
  }
}
var newfunc = f();
var x = newfunc(10);




Passing a function to a function
function f(x) {
  return x(10);
}

function t(n) { return n * n; }

f(t);
f(function (n) { return n * n;});


Let's take apart both of these examples, first we see another function defined within f() without a name (this is known as a Lambda or anonymous function). The function itself does a very simple y * y calculation where y is defined as a parameter that the function takes when being executed. As you can see by the next portion, we execute this by running f() which returns the newly created function which we can then execute at any other time.

Looking at the second example, we can see that f(x) is defined as x(10). This means that we are assuming that x will be a function which takes at least one argument. Continuing in the code where can see where we created another function t(n) which takes a single argument and return n * n. As we can see, we execute f() by passing t as the function to execute. The last portion shows that we can actually pass a Lambda function to f(x) denoting that we do not actually have to have a full function to execute it.

Now, in both of these examples we've looked at 2nd order functions; this means that the functions we are returning and the functions we are accepting are 1st order functions themselves. 1st order functions do not accept or return any other functions, these are just your normal everyday functions. So what would it mean to have a 3rd order function? This would be a function that either accepts a function that accepts/returns a function or a function that returns a function that accepts/returns a function.

Examples in Perl
The first question you might be asking is why am I explaining this in Perl? Well mainly because this is what I program in on a day to day basis right now and as such I felt that it would be best in a specific project to create higher order functions which forced me to figure out how to do them. I couldn't find a whole lot of information online about creating them which is why I am writing this now.

So the example that we have below is a simple function that returns a function and is called immediately afterwards. The key is the dereferencing of the CODE() object which is what the return value is of function f().

sub f() {
  return sub { my $x = shift; return $x * $x; }
}
my $duplicate = f();
print &{$duplicate}(10) . "\n";


As shown above, we can see that the Lambda takes a single argument and multiplies itself by itself. We store that code piece in the variable $duplicate then dereferences and executes the code "&{...}" We're going to look at another example where we have a hash and for one element of the hash we store the new function.

my %tvar = {};
$tvar{'newfunc'} = f();
print $tvar{'newfunc'}(10) . "\n";


In the example shown below, we have a second order function accepting a function.

f2($) {
  print $_[0](10) . "\n";
}
f2(f());
my $newfunc = f();
f2($newfunc);


Notice that f2 takes a single argument. This argument is then treated as a function and is executed. We don't need to dereference when we call the execution of the parameter. The important thing to note here is that we can take the function that is passed to f2 and call it at some other point in the future.

Quick overview of Closures
Since Lambda functions are so closely intertwined with Closures; I'll give a quick overview of what a Closure is. This should give you a much better view of what you can really do with higher order functions. Closures are really simple in theory, a function that is considered a Lambda function can reference variables contained in the containing function. (Again, we are using a fake language resembling Javascript).

function f() {
  var x = 10;
  return function (y) { x *= y; return x; };
}
var newfun = f();
print newfun(10) . "\n";
print newfun(10) . "\n";
print newfun(10) . "\n";


Notice that x is defined within f() and is used within the returned Lambda function. This works because although x will be out of scope once f() completes; this is called a closure where the Lamdba function "closes over" the variable x within its scope with the variable x that is out of scope. Now I am not going to go into dynamic versus static binding but that might be something to look at on another post. Essentially, x lives on within the Lambda function and the only thing that will be able to access it will be the Lambda call.

So what happens when this runs? We end up with the output "100", "1000", "10000"; why? Because we can continue to operate on x as if it were a real variable that we had originally declared. How is this useful? Imagine if f() took a parameter that set x. Now we can set the starting position of x for the returned function. This allows us to limit the number of times that we would need to create a function explained next.

Why is all of this useful?
In the previous example, if we had added that parameter to f(); we can now create tons of functions by just changing the parameter being sent to f(). Let's look for example at the example below where we define f().

function f(x) {
  return function (y) { x *= y; return x; };
}
var start10 = f(10);
var start20 = f(20);
var start30 = f(30);

print start10(10) . "\n";
print start20(20) . "\n";
print start10(10) . "\n";
print start30(30) . "\n";
print start10(10) . "\n";


Notice that we just created 3 different functions where x {10,20,30}. Also note that the results of running the code above is "100","400","1000","900","10000". See how the x is its own in each new function, this gives us the ability to create many different functions by creating a higher order function which can generate our other functions.

No comments: