Code Monkey: 07/01/2011

Monday, July 25, 2011

(*function) pointers

Ok, now that we've covered some really cool things like pointers and double pointers, it's time to step things up a notch and look at function pointers. So let's start off with the two main questions that come along with learning function pointers: what is a function pointer, and why do I care about a function pointer? So what IS a function pointer? Well, let's think back to what exactly a pointer IS. A pointer is a variable that stores a memory location (remember from the previous post Pointers with *, &, ->, . ).

Now that we're all acclimated back into the world of pointers (or some are more confused now), let's proceed! So the thing we need to keep in mind is how applications work. I'm really not going to get into all the theory that goes into program counters and heaps/stacks because it's really boring and actually causes more confusion than is necessary (yes, when working in pointer~land a certain level of confusion is ALWAYS necessary). I'm also not going to get into debugging; as such, I'm going to lay down some B.S. that looks cool but still makes sense.

So, functions exist as a point in memory that marks the start of an execution scope; I know what you're thinking, "what in the hell does that mean?" Well it's pretty straightforward: every function is a separate scope, which means that the variables that exist in function A are not available in function B (unless of course we pass them between each other). As such, there is always an entry point of the function; if you are familiar with gdb then you can see this happen at the "call" instruction in the disassembly. If you didn't understand what that means don't worry, it's not a huge deal for you; just know that there is a single point where the function starts, and THIS is what the function pointer holds. So let's assume this:

0x01 start function
0x02 do some work
0x03 do some more work
0x04 do even more work
0x05 return

So when we define a function pointer, it will take the memory address (remember pointers contain memory addresses) of the start of the function. From our example above, it will contain 0x01. So the question is, how does this actually work? Well, technically speaking it doesn't; OK you're a piece of shit, why are you telling me things that don't work!? Hold on, calm down; it's not that it doesn't work, it's that the compiler MAKES it work. See, the compiler knows when you are using a function pointer; as such, the compiler knows to emit a call to whatever address the pointer holds. It's not really imperative to know how this works; it's more important that you know that calling through the function pointer is essentially the same as calling a normal function.

Ok, so tons of talking about what function pointers are, but how are function pointers defined? Well, it's pretty straightforward; the main thing you need to remember is that the parenthesized (*name) is what marks a function pointer; the rest looks like a normal function declaration. Huh? That makes no sense; sure it does! Let's look at a function pointer for a function that is declared as:

void function(int i, void *ptr);

So what does the function pointer look like? The name of our variable is 'var'

void (*var)(int, void *);

That's it? Yea, not that bad at all right? So how do I accept these types of pointers in another function? I mean, what happens if I want to accept a function and call it, so that what gets executed is determined at run time? Here the variable name of the function pointer is 'func'.

void my_function(void (*func)(int , void *));

Alright, now we're cooking with gas! So, here's the next question; how do I execute this damned thing!? Well there are two ways; first there is the long-hand way, with an explicit dereference. Let's assume that we have an integer i, and of course our great friend NULL (although this could be any pointer). We'll assume we are executing the function pointer 'var' from our first example.

(*var)(i, NULL);

So that was the long-hand way to execute the function pointer, with the explicit dereference; is there a different way? Yes, actually there is, and it's every bit as valid C! You can execute function pointers just like you would a normal function. WHAT? Why the hell would you teach me this crazy ass syntax!? Because if I didn't, you would see it somewhere and shit a brick trying to understand what in the hell it actually meant. We assume we have i as an integer again and now we're executing 'func'.

func(i, NULL);
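To see both call styles side by side, here is a minimal sketch. The body of 'function' and the helper call_both_ways are made up for illustration (the post only gives the declarations); 'function' just records its argument so we can tell that it really ran.

```c
#include <stddef.h>

/* Hypothetical body for the post's 'function': remember the last argument. */
static int last_seen = 0;

void function(int i, void *ptr) {
  (void)ptr;              /* unused in this sketch */
  last_seen = i;
}

/* Accepts a function pointer, like my_function in the post, and calls it. */
void my_function(void (*func)(int, void *)) {
  func(3, NULL);
}

/* Hypothetical helper: returns 1 if every call below reached 'function'. */
int call_both_ways(void) {
  void (*var)(int, void *) = function;   /* same as = &function */
  int ok = 1;

  (*var)(1, NULL);                       /* long hand: explicit dereference */
  ok = ok && (last_seen == 1);

  var(2, NULL);                          /* short hand: plain call syntax */
  ok = ok && (last_seen == 2);

  my_function(var);                      /* hand the pointer to another function */
  ok = ok && (last_seen == 3);

  return ok;
}
```

Calling call_both_ways() from your own main should give 1, since all three calls land in the same function.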

So much easier to understand! Yes, it is. So now that we understand the syntax, let's cover an example of why anyone would want a function pointer. Think about arrays, or linked lists (if you don't know about linked lists, just relate back to the array idea and stick with it). So, what happens if you want to write a sorting routine (yes, I do realize that everyone and their brother has done this but hey; it assists in understanding the concept so calm down and listen up) but you want it to be flexible enough for every possible sort order you might want? What does this mean? Well let's say that I have an array of structs, and I have a specific way I want to sort this array. Well, what happens if I want to sort it differently later; do I write a whole new sorting function that works the same way? NO! We use function pointers!

By using function pointers, we can pass a comparison function to the sorting routine, which will call it on pairs of elements to decide their order. What does this mean? Let's say we have a function definition:

void sort_my_shit(MyStruct *arr, int len, 
  int (*compare)(MyStruct *s1, MyStruct *s2));
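Here is one way the body of such a routine might look. This is only a sketch under assumptions: the fields of MyStruct, the name sort_structs, and both comparators are invented for the example, and the insertion sort is picked for brevity, not speed.

```c
/* Hypothetical element type; the post never shows MyStruct's fields. */
typedef struct MyStruct {
  int id;
  int score;
} MyStruct;

/* Comparators return <0, 0 or >0, the same convention qsort uses. */
int by_id(MyStruct *s1, MyStruct *s2) {
  return (s1->id > s2->id) - (s1->id < s2->id);
}

int by_score_desc(MyStruct *s1, MyStruct *s2) {
  return (s2->score > s1->score) - (s2->score < s1->score);
}

/* Insertion sort; the ordering is decided entirely by 'compare'. */
void sort_structs(MyStruct *arr, int len,
  int (*compare)(MyStruct *s1, MyStruct *s2)) {
  int i, j;
  for (i = 1; i < len; i++) {
    MyStruct key = arr[i];
    j = i - 1;
    while (j >= 0 && compare(&arr[j], &key) > 0) {
      arr[j + 1] = arr[j];   /* shift larger elements right */
      j--;
    }
    arr[j + 1] = key;
  }
}

/* Hypothetical demo: one array, one routine, two orderings. */
int sort_demo(void) {
  MyStruct arr[3] = { {3, 20}, {1, 10}, {2, 30} };
  int ok;
  sort_structs(arr, 3, by_id);
  ok = arr[0].id == 1 && arr[1].id == 2 && arr[2].id == 3;
  sort_structs(arr, 3, by_score_desc);
  ok = ok && arr[0].score == 30 && arr[1].score == 20 && arr[2].score == 10;
  return ok;
}
```

The sorting loop never changes; only the pointer you pass in does.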

That's freakin' AWESOME! But, I hate this syntax; is there anything easier to write? Of course; if you remember, in C there is this thing called typedef. So let's look at the example that we've been checking out, but with a typedef.

typedef void (*MyFunctionPointer)(int , void *);

What the hell, you just repeated that thing from the first example with the 'var' function pointer! You are correct, but I did add the typedef to the beginning (and the name now names a type, not a variable). Really, this is it? How would I use this to define a function pointer? It's pretty simple; at this point you just use 'MyFunctionPointer' like any other type name.

MyFunctionPointer our_function;

WHAT? Are you freakin' serious!? You've been making me type in all this junk when I could've just typed in one line and made life so much easier!? Yes, again; just trying to get you used to the syntax, because assholes like me like writing out the function pointer definitions long hand rather than using typedefs, since I hate seeing a million different type names residing everywhere. Get used to it... How do you accept the function pointer?

void newFunction(MyFunctionPointer our_function);
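As a rough sketch of the typedef in use: the two handlers and the helper apply below are hypothetical names made up for this example, and only MyFunctionPointer comes from the post.

```c
#include <stddef.h>

typedef void (*MyFunctionPointer)(int, void *);

/* Hypothetical handlers; each writes its result into the int that ptr points at. */
void store_double(int i, void *ptr)   { *(int *)ptr = i * 2; }
void store_negative(int i, void *ptr) { *(int *)ptr = -i; }

/* Hypothetical helper: runs whichever handler it is given and returns the result. */
int apply(MyFunctionPointer our_function, int i) {
  int result = 0;
  our_function(i, &result);
  return result;
}

/* With the typedef, even an array of function pointers stays readable. */
int apply_nth(size_t n, int i) {
  MyFunctionPointer table[2] = { store_double, store_negative };
  return apply(table[n % 2], i);
}
```

Without the typedef, that table would have to be declared as void (*table[2])(int, void *), which is exactly the kind of thing people shit bricks over.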

OH MY GOD YOU ARE AN ASS!! Yes, I am; I've been showing you tons of difficult syntax but guess what; when you stumble across the long hand syntax you'll be glad that I brought you into it because now it makes a little more sense to you. If you think about the command/strategy/state patterns of OOP, you can actually implement these types of patterns THROUGH function pointers. Maybe I'll cover that in my next post; but for the time being check it out, you can start to make partial classes through utilization of function pointers.

Wait, I can make C more object-oriented? Yes! Isn't that awesome; the only problem is that you have to do some work to get it there. Check this struct out.

typedef struct MyClass {
  void (*func)(struct MyClass *this, int var);
} MyClass;

Now you call the method (assuming that our variable for MyClass is cl and an integer i).

cl->func(cl, i);

AHHHHHHHHHH, HEAD 'SPLODE!!!!!! Calm down; this is one of those things that people freak out about. Just think about this: if you actually run gdb on a C++ application and break inside a method of a class, you will see the method looking almost the SAME, with this showing up as an actual parameter (one that the compiler passes in automatically on every call). Pretty interesting right? I thought so; just remember, pointers are NOT that difficult to understand; they are just hard to master.
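A small sketch of the struct-with-a-method idea, under a few assumptions: the total field, the two implementations, and MyClass_init are invented for the example, and the parameter is named self instead of this so the snippet also survives a C++ compiler.

```c
typedef struct MyClass {
  int total;                                    /* hypothetical data member */
  void (*func)(struct MyClass *self, int var);  /* the "method" slot */
} MyClass;

/* Two hypothetical implementations that can be plugged into the slot. */
void add_impl(MyClass *self, int var) { self->total += var; }
void sub_impl(MyClass *self, int var) { self->total -= var; }

/* Hypothetical "constructor": picks the behaviour at run time. */
void MyClass_init(MyClass *cl, void (*func)(MyClass *, int)) {
  cl->total = 0;
  cl->func = func;
}

/* Same call expression, different behaviour depending on the object. */
int class_demo(void) {
  MyClass a, b;
  MyClass *cl;

  MyClass_init(&a, add_impl);
  MyClass_init(&b, sub_impl);

  cl = &a;
  cl->func(cl, 5);     /* a.total becomes 5 */
  cl = &b;
  cl->func(cl, 5);     /* b.total becomes -5 */

  return a.total == 5 && b.total == -5;
}
```

Swapping the function stored in the slot is the bare-bones version of the strategy pattern mentioned above.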

Monday, July 11, 2011

Holy sh**t, double pointers!

So, double pointers; if you haven't read my previous post Pointers with *, &, ->, . and you don't already have a good understanding of pointers; I would really suggest going to read it. If you're a little rusty, you should probably go back and check it out anyway as it'll give you a good boost. Regardless, holy shit, double pointers what in the hell is a double pointer and why would I ever want one?

Now I could do the normal explanation and just say that it is a pointer to a pointer. But wait, what in the hell does that mean!? This is the big question that I had when I first got started in pointers. So let's see how to define a double pointer.

int  **x;

Really? That's it? It cannot be that simple; oh but it is. Now the fun comes; what happens if I dereference x? We get an int pointer. HUH!? I thought when we dereferenced a pointer we got the type back! Remember that when we dereference a pointer we get the variable back that the pointer referenced. Oh my god, here he goes again making no sense! STOP and take a deep breath. Think about it; what is a pointer? It's a variable that holds a memory address.

 _______________
| 8 bytes, s**t |
 ---------------
^ 0x01     0x09 ^

So let's assume that we have a variable x that is a pointer. That pointer contains the address 0x01; so when we try to dereference that pointer *x; we end up with the variable that exists AT 0x01. So where does x live? it IS a variable right? So that means that it MUST have a memory location!

 ______________
| 4 bytes, ugh |
 --------------
^ 0x11    0x15 ^

So let's assume that our variable x lives at 0x11. And inside of those 4 bytes is the memory location 0x01. Now here is the cool part; what if I wanted a reference TO x? Then you need a pointer to THAT pointer; and hence is born the double pointer. BRAIN EXPLOSION! You'll be fine; let's not get into the WHY would I do this just yet, but rather the how does it work. So let's say we have the following.

int  i = 10;
int  *x;
int  **y;

As we saw from my previous post, x = &i; means that *x and i are the same variable. This means that if we do *x = 20, then we know that i = 20. Now here is the mind blowing part, y = &x means that *y = x and **y = i. OH MY GOD!? Holy crap, how can that be? Remember each time you dereference a variable you treat the contents of the variable as a variable itself. The only reason that we know it is an integer is because we define the pointer as an int *.
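If you want to poke at this yourself, here is a tiny sketch; the function name double_pointer_demo is made up, and the variables are the i, x and y from above.

```c
/* Returns 1 if the relationships described above all hold. */
int double_pointer_demo(void) {
  int  i = 10;
  int  *x = &i;    /* x holds the address of i */
  int  **y = &x;   /* y holds the address of x */

  *x = 20;         /* writes through x, so i is now 20 */
  **y = 30;        /* y -> x -> i, so i is now 30 */

  return (*y == x) && (**y == i) && (i == 30);
}
```

Every extra * just follows one more address, which is all a double pointer really is.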

Ok, so that makes sense; I can have a pointer that points to another pointer. But the big question is WHY? Why would I ever care to do this!? Well there are three main reasons; the first and biggest reason is multidimensional arrays, another is to create iterators, and the last is to create variables for the caller. Remember in the previous post we said what? Pointers are used for dynamic arrays. So a dynamic array of pointers is a dynamic array of dynamic arrays. Let's look at this.


        V 0x01  V 0x09
         _______
LAYER 1 | 0 | 1 |
         -------
        V-^   ^-----V
         _______     _______
LAYER 2 | 0 | 1 |   | 0 | 1 |
         -------     -------
        ^ 0x09      ^ 0x17

So I denoted these as layers, essentially you are dealing with a variable such as var[LAYER1][LAYER2]. This means that you have an array of 2 members; each of those members containing an array of 2 members.

So what does the memory layout look like? Well, the variable var is actually a pointer that contains the memory address 0x01. Here we can reference either member 1 var[0] or member 2 var[1]. Each of these is a pointer that contains a memory address. For example, if we reference var[0] we have a reference to member 1 var[0][0] or member 2 var[0][1]. OH SHIT!! This is freakin' awesome; being able to create dynamic multi-dimensional arrays is freakin' fantastic!
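The two layers can be sketched in code like this. The names make_grid and free_grid are hypothetical, and filling each cell with a running number is just there so we have something to check.

```c
#include <stdlib.h>

/* LAYER 1 is an array of int pointers; each one points at a LAYER 2 array of ints. */
int **make_grid(int rows, int cols) {
  int **var = malloc(rows * sizeof *var);        /* layer 1 */
  int r, c;
  if (var == NULL) return NULL;
  for (r = 0; r < rows; r++) {
    var[r] = malloc(cols * sizeof *var[r]);      /* one layer 2 array per row */
    if (var[r] == NULL) {
      while (r-- > 0) free(var[r]);              /* undo the rows we already got */
      free(var);
      return NULL;
    }
    for (c = 0; c < cols; c++) var[r][c] = r * cols + c;
  }
  return var;
}

/* Every layer 2 array must be freed before the layer 1 array that points at them. */
void free_grid(int **var, int rows) {
  int r;
  if (var == NULL) return;
  for (r = 0; r < rows; r++) free(var[r]);
  free(var);
}
```

Note that the rows are separate allocations, so unlike a real int[2][2] they are not guaranteed to sit next to each other in memory.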

But wait... There's more! So what about iterators? The easiest way to show this is through an iteration over a character array. Remember that char *'s are the C version of strings. So now, what happens if we want to hand a character pointer to a function and have it moved to the next space? We take a second pointer to the head of the char *, then we pass the memory location of that new variable to our function.

char  *c = "Oh damn, this sucks!";
char  *cp = c;

Remember that cp just holds the same address that c does, so both point at the first character of the string. So let's look at two things; first is the calling of the function.

next_space(&cp);

As well as the function definition.

void  next_space(char **p) {
  for( ; **p != ' ' && **p != '\0'; (*p)++ ) { }
}

HOLY SHIT! Wait a second though, why does (*p)++ work? Easy, we dereference from the double pointer p, and we increment the original variable cp from the caller. How does **p become a character? Oh right; because * dereferences a variable, then we end up with the original cp[0] variable since it is the same as **p!
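To watch the caller's pointer actually move, here is a short sketch built around next_space. The helpers first_space_offset and count_words are made up for illustration.

```c
void next_space(char **p) {
  for ( ; **p != ' ' && **p != '\0'; (*p)++ ) { }
}

/* How far did cp travel from c? Only cp moves; c still marks the start. */
int first_space_offset(char *c) {
  char *cp = c;
  next_space(&cp);
  return (int)(cp - c);
}

/* Uses next_space as an iterator step to count space-separated words. */
int count_words(char *text) {
  char *cp = text;
  int words = 0;
  while (*cp != '\0') {
    if (*cp == ' ') {      /* step over the space we stopped on */
      cp++;
      continue;
    }
    words++;
    next_space(&cp);       /* the callee advances OUR cp */
  }
  return words;
}
```

If next_space took a plain char * instead, it would only move its own copy and the caller's cp would never budge.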

So you had said creating variables; how does that work? Oh yes, this is pretty cool, terrible stuff! I say cool because you can create API's that utilize them quite well; terrible because if you don't do it right or if people don't understand what you are trying to do they can create HUGE memory leaks. Let's look at an example. We have a pointer char * because we want to store a string.

char  *c;

Now let's assume that we have a general string that we want to store into c, and a function store_string() that we call like below.

store_string(&c, "Hello World");

So now let's look at the store_string definition.

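/* needs <stdlib.h> for malloc and <string.h> for strlen */
void  store_string(char  **ptr, const char  *str) {
  size_t len = strlen(str) + 1;   /* +1 leaves room for the '\0' */
  size_t i;
  *ptr = malloc(len * sizeof(char));
  if (*ptr == NULL) return;       /* allocation failed; caller sees NULL */
  for(i = 0; i < len; i++) {
    (*ptr)[i] = str[i];           /* copies the '\0' too */
  }
}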

OH SHIT!? Yea that's right! Now here is the best part; we can clean up the variable and set it to NULL FOR THE CALLER!!

clear_string(&c);

So what could this awesomeness look like!?

void  clear_string(char  **ptr) {
  free(*ptr);
  *ptr = NULL;
}

OH MAN THAT IS THE GREATEST SH**T EVER! Yea, yea; the key to double pointers, and of course ANY pointers, is knowing when and where to use them. I've seen double pointers used in some of the worst places ever; as well as pointers. Remember, the less you deal with allocating the memory for the pointers, the less likely you are to create large memory leaks! And don't expect me to feel sorry for you if you try to create a "cool" API that takes in a double pointer and generates your multi-dimensional array and you end up losing tons of memory. Remember, YOU NEED TO FREE EACH OF THE POINTERS THAT YOUR POINTERS REFER TO!

My point of saying this isn't to scare you; but to ensure that you are thinking up a good use case for creating the double pointer!
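To tie the store/clear pair together, here is a minimal round trip. Treat it as a sketch: string_roundtrip is a made-up name, and this version of store_string sizes the buffer as strlen(str) + 1 so the terminating '\0' fits, and uses memcpy for the copy.

```c
#include <stdlib.h>
#include <string.h>

/* Allocates a copy of str and hands it back through the caller's pointer. */
void store_string(char **ptr, const char *str) {
  size_t len = strlen(str) + 1;          /* +1 for the '\0' */
  *ptr = malloc(len);
  if (*ptr != NULL) memcpy(*ptr, str, len);
}

/* Frees the string and NULLs the caller's pointer so it cannot dangle. */
void clear_string(char **ptr) {
  free(*ptr);
  *ptr = NULL;
}

/* Returns 1 if the string was stored correctly and c ends up NULL. */
int string_roundtrip(void) {
  char *c = NULL;
  int ok;

  store_string(&c, "Hello World");
  if (c == NULL) return 0;               /* allocation failed */
  ok = (strcmp(c, "Hello World") == 0);

  clear_string(&c);                      /* one store, one clear */
  return ok && (c == NULL);
}
```

Because free(NULL) is harmless, calling clear_string(&c) a second time by accident does no damage, which is a nice side effect of setting the pointer to NULL for the caller.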

DISCLAIMER
I am not going to cover single/double void pointers; if you want to use void pointers you need to learn how to do casting and you really need to figure out a way to determine what datatype you are dealing with in the void pointer. Why? Because you cannot dereference a void pointer! This means if you have void *ptr; with an array of int's you cannot say ptr[x]. Why? Because you cannot dereference the void *, because void has no size so there is no size for the offset of ptr. If you said ((int *)ptr)[x] then it will work because you've changed the data type from (void *) to (int *). We will cover pseudo-inheritance in C probably in the next post.
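Just so the cast mentioned above isn't left hanging, here is a very small sketch. The function sum_ints is hypothetical, and it simply trusts the caller that ptr really points at ints, which is exactly the bookkeeping problem described in the disclaimer.

```c
#include <stddef.h>

/* Sums 'count' ints reached through a void pointer. */
int sum_ints(void *ptr, size_t count) {
  int *values = (int *)ptr;   /* the cast gives the compiler an element size */
  int total = 0;
  size_t n;
  for (n = 0; n < count; n++) {
    total += values[n];       /* same as ((int *)ptr)[n] */
  }
  return total;
}
```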

I find that I am doing better posting when I am drunk; I find that I am much more outgoing and much more social when I am drunk. As such, I'll be blogging while I'm (at least partially) drunk so that I can make the information fun and exciting! If you find that I am boring while I'm drunk; please let me know and I'll be sure to try harder on the next blog!

Friday, July 8, 2011

Pointers with *, &, ->, .

This is one of those things that never gets old; discussing pointers. With my heavy C background and my insatiable desire to program close to the hardware, I would say that I love pointers. But most people, when confronted with pointers, tend to lose interest and also their minds trying to understand what they mean. So here, I'm going to try my best at explaining pointers in a way that makes sense; for this I'll be explaining pointers from the perspective of C.

First let's cover some background on variables; defining one is pretty straightforward. Below you can see how to create an int variable.

int  x;

So let's get into the thick of it, what is a pointer? A pointer is a variable that holds a memory location! Wait, that's it? Is it possible that I am misconstruing what a pointer is? Not at all; instead the confusion comes when you begin explaining how to USE them. So let's see how to define an int pointer.

int  *x;

Yes, that is it, I've defined a pointer; the variable is interpreted as such; x is a variable that contains a memory location to an int. But wait, that cannot be it; people tell me that pointers are really difficult to use. Yes, it is a little bit harder, see you have to dereference the variable x to get the integer that is pointed to by the memory location stored in x. Ok, so now your eyes have glazed over right? You've decided to continue on to some other blog and see if someone else can explain it better but STOP and take a deep breath.

Let's look at a better example from our pointer variable x. Let's assume that we have a section of memory that is the memory representation of an int (4 bytes long).

 ---------------
| 4 bytes, woot |
 ---------------

Let's also put some byte numbers to this.

 ---------------
| 4 bytes, woot |
 ---------------
^ 0x1       0x5 ^

Ok, ok so what about our int *x? What the hell does that even mean and what is contained in there? Well, are you ready? It contains the address of our int (the 4 bytes above). So wait, x = 0x1? That's right! But now you have to "dereference" x. So bear with me; you have x which contains the memory location of our int. We need to get THE int. Therefore we need to get the data AT the memory location right? HA, that is dereferencing. So how do we do it?

*x=10;

No, it cannot be that simple... Yes, actually it is! The * tells the compiler to treat the variable as a pointer and to use the current variable as a memory location. So how do we know that it is an int? Aha! Because we defined it as an "int *" which means that inherently when we dereference x, *x must be an int. So now that we understand that the * indicates a reference and is used to dereference a pointer; what the hell is that ampersand '&' used for?

Well, the ampersand is used to indicate "reference of." Wait, I thought that was what * was used for. No, I said * was to indicate a variable is a reference or is used to dereference a reference. The & is used to say the memory location (reference) of a variable. Let's see this from our original variable x (I shall refresh your memory).

int  x;

See how x is just a nice happy variable? Not a pointer! So let's look at some more memory locations. It is important to note that since x is not a pointer, the memory layout below IS the x.

 -------------------
| 4 bytes, oh noes! |
 -------------------
^ x IS here

So when you specify &x, what are you saying? You're getting the memory location OF x! WHAT!?!?! That doesn't make any sense! How can you GET the memory location? Yea, I know you want to switch blogs again; STOP and let's take a deep breath.

Remember that x IS a variable and therefore the variable MUST lie in a memory location. So when you do &x you are merely getting the memory location of that variable.

 -------------------
| 4 bytes, oh noes! |
 -------------------
^ 0x1           0x5 ^

So why would I want to do this? Because in order to create a pointer, it must reference a valid memory location. Check this out.

int  x;
int  *y;
y = &x;

OH MY GOD, MY HEAD EXPLODED! Nah, it's not that bad; until of course I tell you that *y and x are the same variable. If you don't understand that, then check it out, let's say that x is at location 0x1 (taking from our previous example). Then when we get the variable location 0x1 and store it into a variable that HOLDS the memory location (y), then when we dereference (there is that stupid ass word again!) *y we end up with an int at the location 0x1. OH SHIT, THAT IS THE LOCATION OF x!!!! GET SOME!

So what happens to x if I do *y = 22? Right, x = 22! Why is that? Because *y and x are THE SAME VARIABLE! I'll say it again if you want me to but I'm sure you're getting tired of me explaining it; so I'll say this: if basic pointers don't make sense, you should start reading this over again (if you don't understand iteration, you will by the end of this).
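Don't take my word for it; here is a tiny sketch you can run (the function name same_variable_demo is made up).

```c
/* Writes through y and reports what x holds afterwards. */
int same_variable_demo(void) {
  int  x = 0;
  int  *y;

  y = &x;      /* y now holds the memory location of x */
  *y = 22;     /* store 22 AT that location */

  return x;    /* x was never assigned directly, yet it changed */
}
```

The function hands back 22, because the only int that *y can possibly refer to is x.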

Fantastic, so now we understand the basics of pointers. But what about those stupid ->'s and when can I use .'s? Ok, we're getting ahead of ourselves. So if you know about structs (if you're doing anything in C I'm making the assumption you've heard of them) but if not let's look at an example.

typedef struct str{
  int x;
}str;

So here we have a basic struct; essentially it's like a class with only public members (and no methods) in Java. If you don't know what a class or a struct is; you shouldn't be messing around with pointers. So let's look at referencing the variable x inside of an example str.

str s;
s.x = 10;

Well that wasn't too bad; you use the . operator to indicate a member of the str struct. But here is the thing; let's say that you wanted s to be a pointer. Why? Well, we'll get to that; but right now just follow (I haven't steered you wrong yet!). Let's assume that our variable s is declared as below.

str *s;

Then we want to access x. Ok, simple right, s.x; no... s->x; oh holy crap you're confusing me! Nah, it's really not that bad, s was a pointer right? So what made you think that you could just access x as if s was the variable!? Oh I get it, s is a pointer, which means we had to dereference it to get x. So that means that -> dereferences s and gets the variable x from the struct! Right! If you wanted to write it (and seem cool) you could do (*s).x but no one would hire you because that would be horrible to read every day.
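A quick sketch showing that the three spellings reach the same member (arrow_demo is a hypothetical name; the struct is the str from above):

```c
typedef struct str{
  int x;
}str;

/* Touches the same member three different ways. */
int arrow_demo(void) {
  str value;
  str *s = &value;

  value.x = 10;            /* direct access with . */
  s->x = s->x + 1;         /* through the pointer with ->, now 11 */
  (*s).x = (*s).x + 1;     /* the long-hand equivalent of ->, now 12 */

  return value.x;
}
```

The parentheses in (*s).x are required, because . binds tighter than *.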

So wait, that's all the -> operator means? Yes! See, I told you pointers weren't that bad. But of course, there is the big question; how do I actually USE pointers? Dynamic array sizes. What does that have to do with anything? Easy, the function malloc() which allows you to allocate a section of memory. Let's say I want to get an array of 2 integers.

int * x;
x = malloc(sizeof(int) * 2);

If you don't know malloc, you'll need to go check out the documentation, but essentially we are just allocating the size of two integers (8 bytes with our 4-byte ints) in memory, getting the memory location back, and storing it into a pointer. What does this look like in memory?

 ------------------
| 8 bytes, badness |
 ------------------
^ 0x1          0x9 ^

And of course x holds 0x1. Now, since we allocated 2 ints it is important to note that THERE ARE TWO OF THEM. So how do we get them? x[0] references the first, x[1] the second. WAIT, huh? Ok, so x[0] takes the address in x, moves to an offset of 0, and dereferences what it finds there. x[1] does the same thing at an offset of 1. So where does this offset come from? Easy, the offset is scaled by the SIZE OF THE VARIABLE TYPE! Since our variable type is int, the size is 4 bytes (on the typical machine we've been assuming); so x[1] will go to 0x1 + 4 = 0x5.
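Here is a sketch of that offset math for the malloc'd x above. The function name index_demo is made up, and it uses sizeof(int) instead of a hard-coded 4 so it holds on machines where int is a different size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns 1 if indexing and pointer arithmetic agree with each other. */
int index_demo(void) {
  int *x = malloc(sizeof(int) * 2);
  int ok;
  if (x == NULL) return 0;            /* allocation failed */

  x[0] = 7;
  *(x + 1) = 9;                       /* exactly what x[1] = 9 means */

  ok = (*x == 7) && (x[1] == 9);
  /* The second int starts sizeof(int) bytes after the first. */
  ok = ok && ((char *)&x[1] - (char *)&x[0] == (ptrdiff_t)sizeof(int));

  free(x);
  return ok;
}
```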

So can I name another reason to use them? Sure! If you wanted to modify a variable without having to return it. What? That makes no sense. Sure it does, let's say we want to take in a str variable (again stealing the definition from above).

void modify_str(str * s) {
  s->x = 10;
}

OH SHIT! That makes sense, yes! Is that awesome or what!? So here is the cool part; returning pointers. Example, let's say that we want to create a new str, and set the internal x to 1983 (my birth year because I'm picking random numbers).

str *new_str(int  i) {
  str *s = malloc(sizeof(str));
  s->x = i;
  return s;
}

OH DAMN! This makes initializations work! All of this is starting to make sense; why we would need to use pointers, how pointers work, and also how to effectively use pointers. Yes, and it really isn't as bad as many people tend to think. So here is the thing, pointers are pretty straight forward; double pointers and higher become much more abstract.
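Putting the two functions together gives something like the sketch below. The helper lifecycle_demo and the NULL check inside new_str are additions made for the example; the important part is that whoever receives the pointer from new_str owns the memory and has to free it.

```c
#include <stdlib.h>

typedef struct str{
  int x;
}str;

void modify_str(str * s) {
  s->x = 10;
}

str *new_str(int  i) {
  str *s = malloc(sizeof(str));
  if (s == NULL) return NULL;   /* added check: malloc can fail */
  s->x = i;
  return s;
}

/* Create, modify through the pointer, then release. Returns 1 on success. */
int lifecycle_demo(void) {
  str *s = new_str(1983);
  int before, after;
  if (s == NULL) return 0;

  before = s->x;      /* 1983, set by new_str */
  modify_str(s);      /* changes OUR struct, nothing is returned */
  after = s->x;       /* 10 */

  free(s);            /* every malloc needs its free */
  return before == 1983 && after == 10;
}
```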

DISCLAIMER
I am not going to cover double pointers; but since you understand pointers now; just think of them as pointers to pointers (since that is what they actually are). And more importantly, you really won't use double pointers until you start needing to do pointer arithmetic in order to do iterations.

I'll say that I'll cover double pointers at a later time; but I have no guarantees that it will happen. If it does, hopefully I will be as fun and exciting as I have been so far! Just remember, pointers store memory locations of variables.
