Friday, July 8, 2011

Pointers with *, &, ->, .

This is one of those things that never gets old: discussing pointers. With my heavy C background and my insatiable desire to program close to the hardware, I would say that I love pointers. But most people, when confronted with pointers, tend to lose interest, and also their minds, trying to understand what they mean. So here, I'm going to try my best at explaining pointers in a way that makes sense; for this I'll be explaining pointers from the perspective of C.

First let's cover some background on variables. Defining one is pretty straightforward. Below you can see how to create an int variable.

int  x;


So let's get into the thick of it: what is a pointer? A pointer is a variable that holds a memory location! Wait, that's it? Is it possible that I am misconstruing what a pointer is? Not at all; the confusion comes when you begin explaining how to USE them. So let's see how to define an int pointer.

int  *x;


Yes, that is it, I've defined a pointer. The declaration is read like this: x is a variable that contains the memory location of an int. But wait, that cannot be it; people tell me that pointers are really difficult to use. Yes, it is a little bit harder: you have to dereference the variable x to get the integer that lives at the memory location stored in x. Ok, so now your eyes have glazed over, right? You've decided to continue on to some other blog and see if someone else can explain it better, but STOP and take a deep breath.

Let's look at a better example using our pointer variable x. Let's assume that we have a section of memory that is the memory representation of an int (4 bytes long on most platforms today).

 ---------------
| 4 bytes, woot |
 ---------------


Let's also put some byte numbers to this.

 ---------------
| 4 bytes, woot |
 ---------------
^ 0x1       0x5 ^


Ok, ok, so what about our int *x? What the hell does that even mean and what is contained in there? Well, are you ready? It contains the address of our int (the 4 bytes above). So wait, x = 0x1? That's right! But now you have to "dereference" x. So bear with me: you have x, which contains the memory location of our int. We need to get THE int. Therefore we need to get the data AT the memory location, right? HA, that is dereferencing. So how do we do it?

*x=10;


No, it cannot be that simple... Yes, actually it is! In an expression, the * tells the compiler to treat the value stored in x as a memory location and to work with whatever lives there. So how do we know that it is an int? Aha! Because we defined it as an "int *", which means that when we dereference x, *x must be an int. So now that we understand that the * is used both to declare a pointer and to dereference one, what the hell is that ampersand '&' used for?

Well, the ampersand means "address of." Wait, I thought that was what * was used for. No, I said * either declares that a variable is a pointer or dereferences a pointer. The & gives you the memory location (the reference) of a variable. Let's see this with our original variable x (I shall refresh your memory).

int  x;


See how x is just a nice happy variable? Not a pointer! So let's look at some more memory locations. It is important to note that since x is not a pointer, the memory layout below IS x.

 -------------------
| 4 bytes, oh noes! |
 -------------------
^ x IS here


So when you specify &x, what are you saying? You're getting the memory location OF x! WHAT!?!?! That doesn't make any sense! How can you GET the memory location? Yeah, I know you want to switch blogs again, but STOP and let's take a deep breath.

Remember that x IS a variable and therefore the variable MUST lie in a memory location. So when you do &x you are merely getting the memory location of that variable.

 -------------------
| 4 bytes, oh noes! |
 -------------------
^ 0x1           0x5 ^
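If you want to see a real address instead of my made-up 0x1, here is a minimal sketch. The function name print_address_of_x is just something I picked for illustration, and the address it prints will differ from machine to machine and from run to run.

```c
#include <stdio.h>

/* Print the value of a local int and the address where it lives. */
void print_address_of_x(void) {
    int x = 10;
    /* %p expects a void *, so the address is cast explicitly. */
    printf("x holds %d and lives at %p\n", x, (void *)&x);
}
```

Call it from main and you should see the value 10 followed by whatever address your system happened to give x.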


So why would I want to do this? Because for a pointer to be useful, it must hold a valid memory location. Check this out.

int  x;
int *y;
y = &x;


OH MY GOD, MY HEAD EXPLODED! Nah, it's not that bad; until of course I tell you that *y and x are the same variable. If you don't understand that, then check it out: let's say that x is at location 0x1 (taking from our previous example). When we take that location, 0x1, and store it in a variable that HOLDS a memory location (y), then dereferencing y (there is that stupid ass word again!) by writing *y gives us the int at location 0x1. OH SHIT, THAT IS THE LOCATION OF x!!!! GET SOME!

So what happens to x if I do *y = 22? Right, x = 22! Why is that? Because *y and x are THE SAME VARIABLE! I'll say it again if you want me to, but I'm sure you're getting tired of me explaining it. So I'll just say this: if basic pointers don't make sense, you should start reading this over again (if you don't understand iteration, you will by the end of this).
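Here is a small sketch of that claim in C. The function name write_through_pointer is my own invention for this example, and the asserts are only there to spell out what should hold at each step.

```c
#include <assert.h>

/* Write 22 through a pointer to x, then return x itself. */
int write_through_pointer(void) {
    int x = 0;
    int *y = &x;      /* y holds the memory location of x */

    *y = 22;          /* store 22 AT that location, which is x */

    assert(y == &x);  /* y still points at x */
    assert(*y == x);  /* reading through y reads x */
    return x;
}
```

The function returns 22 even though x was never assigned directly after its initialization; the write went through y.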

Fantastic, so now we understand the basics of pointers. But what about those stupid ->'s, and when can I use .'s? Ok, we're getting ahead of ourselves. First you need to know about structs (if you're doing anything in C, I'm assuming you've heard of them); if not, let's look at an example.

typedef struct str {
    int x;
} str;


So here we have a basic struct; essentially it's like a Java class with only public fields and no methods. If you don't know what a class or a struct is, you shouldn't be messing around with pointers yet. So let's look at accessing the member x of an example str.

str s;
s.x = 10;


Well, that wasn't too bad; you use the . operator to access a member of the str struct. But here is the thing: let's say that you wanted s to be a pointer. Why? Well, we'll get to that, but right now just follow (I haven't steered you wrong yet!). Let's assume that our variable s is declared as below.

str *s;


Then we want to access x. Ok, simple, right? s.x; no... s->x. Oh holy crap, you're confusing me! Nah, it's really not that bad. s is a pointer, right? So what made you think that you could just access x as if s were the struct itself!? Oh I get it: s is a pointer, which means we have to dereference it to get at x. So that means that -> dereferences s and gets the member x from the struct! Right! If you wanted to write it out by hand (and seem cool) you could do (*s).x, but no one would hire you because that would be horrible to read every day.
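To convince yourself that s->x and (*s).x really are the same thing, here is a minimal sketch. The function name arrow_vs_dot and the local variable value are made up for this example; the struct is the same str from above, repeated so the snippet stands on its own.

```c
#include <assert.h>

typedef struct str {
    int x;
} str;

/* Access the same member three ways and return its final value. */
int arrow_vs_dot(void) {
    str value;          /* an actual struct, not a pointer */
    str *s = &value;    /* s points at that struct */

    s->x = 10;                /* arrow: dereference s, then pick x */
    assert((*s).x == 10);     /* the long-hand spelling of s->x */
    assert(value.x == 10);    /* plain dot on the struct itself */

    (*s).x = 11;              /* writing works the same way */
    return s->x;
}
```

All three spellings touch the same int, so the function returns 11.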

So wait, that's all the -> operator means? Yes! See, I told you pointers weren't that bad. But of course, there is the big question: how do I actually USE pointers? Dynamic array sizes. What does that have to do with anything? Easy, the function malloc() allows you to allocate a section of memory. Let's say I want to get an array of 2 integers.

int * x;
x = malloc(sizeof(int) * 2);


If you don't know malloc, you'll need to go check out the documentation, but essentially we are just allocating enough memory for two integers (8 bytes, assuming 4-byte ints), getting back its memory location, and storing that in a pointer. What does this look like in memory?

 ------------------
| 8 bytes, badness |
 ------------------
^ 0x1          0x9 ^


And of course x holds 0x1. Now, since we allocated 2 ints, it is important to note that THERE ARE TWO OF THEM. So how do we get them? x[0] references the first and x[1] the second. WAIT, huh? Ok, so x[0] starts at the location in x, moves over by an offset of 0 elements, and dereferences what it finds there. x[1] does the same with an offset of 1 element. So how big is that offset in bytes? Easy, it is defined by the SIZE OF THE VARIABLE TYPE! Since our variable type is int, the size is 4 bytes, so x[1] lives at 0x1 + 4 = 0x5.
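The indexing rule can be checked with a short sketch. The function name sum_two_ints and the values 7 and 9 are arbitrary choices of mine; the asserts show that x[n] is just *(x + n) and that the two elements sit exactly sizeof(int) bytes apart, whatever that size is on your machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate two ints, fill them in, and return their sum (-1 if malloc fails). */
int sum_two_ints(void) {
    int *x = malloc(sizeof(int) * 2);
    if (x == NULL) {
        return -1;
    }

    x[0] = 7;
    x[1] = 9;

    /* x[n] is shorthand for *(x + n). */
    assert(*(x + 0) == 7);
    assert(*(x + 1) == 9);

    /* Measured in bytes, the second int starts sizeof(int) past the first. */
    ptrdiff_t gap = (char *)&x[1] - (char *)&x[0];
    assert(gap == (ptrdiff_t)sizeof(int));

    int sum = x[0] + x[1];
    free(x);  /* give the memory back once we are done with it */
    return sum;
}
```

Note that x + 1 moves by one element, not one byte; the compiler scales the offset by the size of the pointed-to type for you.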

So can I name another reason to use them? Sure! You can modify a variable inside a function without having to return it. What? That makes no sense. Sure it does; let's say we want a function that takes in a str variable (again stealing the definition from above).

void modify_str(str *s) {
    s->x = 10;
}
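To see why the pointer matters here, compare it with a version that takes the struct by value. This is only a sketch: modify_copy and caller_sees_change are hypothetical names I added for the comparison, while str and modify_str are the ones from above.

```c
#include <assert.h>

typedef struct str {
    int x;
} str;

/* Takes a pointer, so it changes the caller's struct. */
void modify_str(str *s) {
    s->x = 10;
}

/* Hypothetical by-value version: it only changes its own private copy. */
int modify_copy(str s) {
    s.x = 10;
    return s.x;
}

/* Returns the caller's x after trying both versions. */
int caller_sees_change(void) {
    str a = { 0 };

    modify_copy(a);
    assert(a.x == 0);   /* the copy changed, a did not */

    modify_str(&a);     /* pass the memory location of a */
    return a.x;
}
```

Only the call that passes &a changes a, so caller_sees_change returns 10.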


OH SHIT! That makes sense, yes! Is that awesome or what!? So here is the cool part: returning pointers. For example, let's say that we want to create a new str and set the internal x to some value like 1983 (my birth year, because I'm picking random numbers).

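/* Needs <stdlib.h> for malloc; the caller must free() the result. */
str *new_str(int i) {
    str *s = malloc(sizeof(str));
    if (s == NULL)
        return NULL;  /* allocation failed */
    s->x = i;
    return s;
}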
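Here is a sketch of how a caller might use new_str, including the part that is easy to forget: checking for NULL and freeing the memory afterwards. The function name make_use_and_free is hypothetical, and I have added a NULL check inside new_str as a precaution.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct str {
    int x;
} str;

/* Allocate a str on the heap and set its x; returns NULL if malloc fails. */
str *new_str(int i) {
    str *s = malloc(sizeof(str));
    if (s == NULL) {
        return NULL;
    }
    s->x = i;
    return s;
}

/* Create a str, read its x, release it, and return what was read (-1 on failure). */
int make_use_and_free(int i) {
    str *s = new_str(i);
    if (s == NULL) {
        return -1;
    }

    int value = s->x;  /* copy the value out before freeing */
    free(s);           /* whoever receives the pointer owns the memory */
    return value;
}
```

The returned pointer stays valid after new_str returns because malloc'd memory lives until you free it, unlike a local variable.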


OH DAMN! This makes initializations work! All of this is starting to make sense: why we would need to use pointers, how pointers work, and also how to effectively use pointers. Yes, and it really isn't as bad as many people tend to think. So here is the thing: pointers are pretty straightforward; double pointers and higher become much more abstract.

DISCLAIMER
I am not going to cover double pointers, but since you understand pointers now, just think of them as pointers to pointers (since that is what they actually are). And more importantly, you probably won't use double pointers until you start needing to do pointer arithmetic in order to do iterations.

I'll say that I'll cover double pointers at a later time; but I have no guarantees that it will happen. If it does, hopefully I will be as fun and exciting as I have been so far! Just remember, pointers store memory locations of variables.