Code Monkey: 2011

Thursday, October 20, 2011

Computer Science = Math

Today when I was at work, I realized that I needed to draw a string at the mid-point of a 2d line. So of course, I go and google mid-point of a line and get the equation below.

Once I had this formula I plugged it in; but of course I wanted it in a different place. So, what if I wanted to place it at the quarter mark? So I searched but found nothing. Now it was up to me to figure out the mathematics behind it. Well, it's pretty simple right? Since X2 is the X for the end point; if we plug-in the previous equation into X2 clearly we can get the midpoint of the midpoint. This is as shown below.

Of course, this is a nightmare to look at; we need to simplify the equation. We'll be looking at this for only the x parameter since I don't feel like showing both x and y since they are pretty much the same. The first thing to do is to split the equation a bit so that we can begin to simplify.

Now, we bring down the fraction on top of the second fraction and combine the two denominators.

Now we need to find a common denominator (easy to see is 4) so we need to look at the first fraction and get the denominator to 4. To do this we can multiple the denominator by 2 (in order to do this we must multiple by 1 so this means we have to multiple the whole thing by (2/2))

Now we combine the two fractions which leaves us with the single fraction below.

At this point we just combine the parameters to end up with the final equation for finding the mid-point of a mid-point.

Now here is the thing; what if I wanted to get the midpoint of the midpoint of the midpoint? Well we start off with the main function.

Again, we split the fraction in half, bring down the second fraction and find a common denominator.

Now that we have a common denominator, we can go ahead and combine terms to get the final fraction.

So here was the coolest part, do you see that there is a pattern? Below you see the final fraction; n is the number of divisions you want of the line. 1 is the point at 1/2 the line; 2 is the point at 1/4 the line; etc...

So what does all of this mean? Why am I bringing math into this blog? Because sometimes we as computer scientists need to be reminded that mathematics is important in our day to day lives. I've met way to many developers who consider themselves programmers; however, if there isn't already an equation to express something that they need they just kind of assume that because no one has done it then it's not a worthwhile thing to do. Either this, or I'll see them write some really terrible way to perform this.

So for our example above, the fastest way to figure out how to get to the point of the line is to figure out the m (rise/run) and just go from point A to what they think should be point B. Then call it good enough. But if you're just doing that, maybe you need to rethink what you are actually doing.

Another thing, mathematics supports recursion in every way. Algorithms much simpler to express recursively; however, most programmers will stray away from recursive algorithms. As a matter of fact, the main complaint that I hear is that they are just too difficult and "make my brain hurt" to think of how to implement them. Maybe I'll write a blog about recursive algorithms; just to show how easy they are to actually write. The main reason that they were never better than iteration is that they required "calls" into methods/functions. However, in many newer languages; recursion is supported and is actually much better and stabler than iteration. This is usually due to values vs. variables which I should cover again at a later date.

My point is that you should never actually be scared of algorithms or doing mathematical work. If you do, you are narrowing yourself down to a typist that is paid to implement other peoples ideas. As functional programming becomes a larger paradigm, it will become important to learn how to actually work around in that paradigm. If you don't believe me; go checkout erlang and try to write a very simple application. Not a "hello, world" but instead what about something with an accumulator? What about a general for loop? You'll quickly see that recursion is an important part of functional programming life.

Monday, July 25, 2011

(*function) pointers

Ok, now that we've covered some really cool things like pointers and double pointers; it's time to step things up a notch and look at function pointers. So let's start off with the two main things that come along with learning function pointers; what is a function pointer, and why do I care about a function pointer... So what IS a function pointer? Well let's think back to what exactly a pointer IS. A pointer is a storage of a memory location (remember from the previous post Pointers with *, &, ->, . ).

Now that we're all acclimated back into the world of pointers; or some are more confused now let's proceed! So the thing we need to keep in mind is how applications work. I'm really not going to get into all the theory that goes into program counters and heaps/stacks because it's really boring and actually causes more confusion than is necessary (yes, when working in pointer~land a certain level of confusion is ALWAYS necessary). I'm also not going to get debugging, as such I'm going to lay down some B.S. that looks cool but still makes sense.

So, functions exist as a point in memory as the start of an execution scope; I know what you're thinking, "what in the hell does that mean?" Well it's pretty straight forward, every function is a separate scope; this means that the variables that exist in function A are not available in function B (unless of course we pass them between each other). As such, there is always an entry point of the function, if you are familiar with gdb then you can see this happen with the "call" OP. If you didn't understand what that means don't worry it's not a huge deal for you; just know that there is a single point where the function starts, THIS is the pointer of the function. So let's assume this:

0x01 start function
0x02 do some work
0x03 do some more work
0x04 do even more work
0x05 return

So when we define a function pointer, it will take the memory address (remember points contain memory addresses) of the start of the function. From our example above, it will contain 0x01. So the question is, how does this actually work? Well, technically speaking it doesn't; OK you're a piece of shit, why are you telling me things that don't work!? Hold on, calm down; it's not that it doesn't work, it's that the compiler MAKES it work. See the compiler knows when you are using a function pointer; as such, the compiler knows to perform the function execution. It's not really imperative to know how this works, but more important that you know that the compiler understands that when you call the function pointer it is essentially the same as calling a normal function.

Ok, so tons of talking about what function pointers are, but how are function pointers defined? Well, it's pretty straight forward; the main thing you need to remember is that (*) is the definition of a function pointer; the rest looks like a normal function. Huh? That makes no sense; sure it does! Let's look at a function pointer, the function is defined as:

void function(int i, void *ptr);

So what does the function pointer look like? The name of our variable is 'var'

void (*var)(int, void *);

That's it? Yea, not that bad at all right? So how do I need to accept these types of pointers into another function? I mean, what happens if I wanted to accept a function that I wanted to call causing the execution to be determined at run time? The variable name of the function pointer is actually 'func'.

void my_function(void (*func)(int , void *));

Alright, now we're cooking with gas! So, here's the next question; how do I execute this damned thing!? Well there are two ways, there is the syntactically correct way to do this. Let's assume that we have an integer i, and of course our great friend NULL (although this could be any pointer). We'll assume we are executing the function 'ex' from our first example.

(*ex)(i, NULL);

So I mentioned that this was the syntactically correct way to execute the function pointer; is there a different way? Yes, actually there is! You can execute function pointers just like you would a normal function. WHAT? Why the hell would you teach me this crazy ass syntax!? Because, if I didn't you would see it somewhere and shit a brick trying to understand what in the hell it actually meant. We assume we have i as an integer again and now we're executing 'func'

func(i, NULL);

So much easier to understand! Yes, it is. So now that we understand the syntax; let's cover a few examples of why anyone would want to be able to do a function pointer. Well if you think about arrays, or linked lists (if you don't know about linked lists, just relate back to the array idea and stick with it). So, what happens if you want to perform a sorting routine (yes, I do realize that everyone and their brother has done this but hey; it assists in understanding the concept so calm down and listen up) but you want it to be flexible for every possible sorting style you wanted. What does this mean? Well let's say that I have a list of structs, I have a specific way I want to sort this array. Well, what happens if I want to sort it differently later; do I write a whole new function for sorting in this same method? NO! We use function pointers!

By using function pointers, we can pass a function pointer to the sorting routine that will be called which will be called on each element and will assist in sorting the elements of the array. What does this mean? Let's say we have a function definition:

void sort_my_shit(MyStruct *arr, int len, 
  int (*compare)(MyStruct *s1, MyStruct *s2));

That's freakin' AWESOME! But, I hate this syntax; is there anything easier to write? Of course, if you remember in C; there is this thing called typedef's. So let's look at our example that we've been checking out but with a typedef.

typedef void (*MyFunctionPointer)(int , void *);

What the hell, you just repeated that thing from the first example with the 'ex' function pointer! You are correct, but I did add the typedef to the beginning. Really this is it? How would I use this to define a function pointer? It's pretty simple, at this point you just use 'MyFunctionPointer' just like a normal pointer.

MyFunctionPointer our_function;

WHAT? Are you freakin' serious!? You've been making type in all this junk before when I could've just typed in one line and made life so much easier!? Yes, again; just trying to get you used to the syntax; because assholes like me like writing out long hand the function pointer definitions rather than the typedefs because I hate seeing a million different type names residing everywhere. Get used to it... How do you accept the function pointer?

void newFunction(MyFunctionPointer our_function);

OH MY GOD YOU ARE AN ASS!! Yes, I am; I've been showing you tons of difficult syntax but guess what; when you stumble across the long hand syntax you'll be glad that I brought you into it because now it makes a little more sense to you. If you think about the command/strategy/state patterns of OOP, you can actually implement these types of patterns THROUGH function pointers. Maybe I'll cover that in my next post; but for the time being check it out, you can start to make partial classes through utilization of function pointers.

Wait, I can make C more objective? Yes! Isn't that awesome; the only problem is that you have to do some work to get it to be more objective. Check this struct out.

typedef struct MyClass {
  void (*func)(struct MyClass *this, int var);
} MyClass;

Now you call the method (assuming that our variable for MyClass is cl and an integer i).

cl->func(cl, i);

AHHHHHHHHHH, HEAD 'SPLODE!!!!!! Calm down; this is one of those things that people freak out about; just think about this; if you actually do a gdb on a C++ application and break during the method of a class you will actually see the method being almost the SAME thing where this is actually a method parameter (that is automatically passed in by the compiler during execution). Pretty interesting right? I thought so; just remember, pointers are NOT that difficult to understand; that are hard to master.

Monday, July 11, 2011

Holy sh**t, double pointers!

So, double pointers; if you haven't read my previous post Pointers with *, &, ->, . and you don't already have a good understanding of pointers; I would really suggest going to read it. If you're a little rusty, you should probably go back and check it out anyway as it'll give you a good boost. Regardless, holy shit, double pointers what in the hell is a double pointer and why would I ever want one?

Now I could do the normal explanation and just say that it is a pointer to a pointer. But wait, what in the hell does that mean!? This is the big question that I had when I first got started in pointers. So let's see how to define a double pointer.

int  **x;

Really? That's it? It cannot be that simple; oh but it is. Now the fun comes; what happens if I dereference x? We get an int pointer. HUH!? I thought when we dereferenced a pointer we got the type back! Remember that when we dereference a pointer we get the variable back that the pointer referenced. Oh my god, here he goes again making no sense! STOP and take a deep breath. Think about it; what is a pointer? It's a variable that holds a memory address.

 _______________
| 8 bytes, s**t |
 ---------------
^ 0x01     0x09 ^

So let's assume that we have a variable x that is a pointer. That pointer contains the address 0x01; so when we try to dereference that pointer *x; we end up with the variable that exists AT 0x01. So where does x live? it IS a variable right? So that means that it MUST have a memory location!

 ______________
| 4 bytes, ugh |
 --------------
^ 0x11    0x15 ^

So let's assume that our variable x lives at 0x11. And inside of those 4 bytes is the memory location 0x01. Now here is the cool part; what if I wanted a reference TO x? Then you need a pointer to THAT pointer; and hence is born the double pointer. BRAIN EXPLOSION! You'll be fine, let's not get into the WHY would I do this; but rather the how does it work. So let's say we have a the following.

int  i = 10;
int  *x;
int  **y;

As we saw from my previous post, x = &i; means that *x and i are the same variable. This means that if we do *x = 20, then we know that i = 20. Now here is the mind blowing part, y = &x means that *y = x and **y = i. OH MY GOD!? Holy crap, how can that be? Remember each time you dereference a variable you treat the contents of the variable as a variable itself. The only reason that we know it is an integer is because we define the pointer as an int *.

Ok, so that makes sense; I can have a pointer that points to another pointer. But the big question is WHY? Why would I ever care to do this!? Well there are two main reasons; the first and biggest reason is multidimensional arrays, another is to create iterators, and also to create variables. Remember in the previous post we said what? Pointers are used for dynamic arrays. So having a dynamic array of pointers is a dynamic array of dynamic arrays. Let's look at this.


        V 0x01  V 0x09
         _______
LAYER 1 | 0 | 1 |
         -------
        V-^   ^-----V
         _______     _______
LAYER 2 | 0 | 1 |   | 0 | 1 |
         -------     -------
        ^ 0x09      ^ 0x17

So I denoted these as layers, essentially you are dealing with a variable such as var[LAYER1][LAYER2]. This means that you have an array of 2 members; each of those members containing an array of 2 members.

So what does the memory layout look like? Well for the variable var, we are actually a pointer that contains the memory address 0x01. Here we can reference either member 1 var[0] or member 2 var[1]. Each of these are pointers that contain memory addresses. For example, if we reference var[0] we have a reference to member 1 var[0][0] or member 2 var[0][1]. OH SHIT!! This is freakin' awesome; being able to create dynamic multi-dimensional arrays is freakin' fantastic!

But wait... There's more! So what about iterators. The biggest way to show this is through an iteration over a character array. Remember that char *'s are C versions of strings. So now, what happens if we wanted to send a character pointer and move to the next space. So we take a pointer to the head of the char *, then we pass the memory location of the new variable to our function.

char  *c = "Oh damn, this sucks!";
char  *cp = c;

Remember that cp will just hold a pointer to the variable c. So let's look at two things, first is the calling of the function.

next_space(&cp);

As well as the function definition.

void  next_space(char **p) {
  for( ; **p != ' ' && **p != '\0'; (*p)++ ) { }
}

HOLY SHIT! Wait a second though, why does (*p)++ work? Easy, we dereference from the double pointer p, and we increment the original variable cp from the caller. How does **p become a character? Oh right; because * dereferences a variable, then we end up with the original cp[0] variable since it is the same as **p!

So you had said creating variables; how does that work? Oh yes, this is pretty cool, terrible stuff! I say cool because you can create API's that utilize them quite well; terrible because if you don't do it right or if people don't understand what you are trying to do they can create HUGE memory leaks. Let's look at an example. We have a pointer char * because we want to store a string.

char  *c;

Now let's assume that have a general string that we want to store into c, we have a function store_string() that we call like below.

store_string(&c, "Hello World");

So now let's look at the store_string definition.

void  store_string(char  **ptr, char  *str) {
  *ptr = malloc(strlen(str) * sizeof(char));
  int i = 0;
  for(i = 0; i <= strlen(str); i++) {
    (*ptr)[i] = str[i];
  }
}

OH SHIT!? Yea that's right! Now here is the best part; we can clean up the variable and set it to NULL FOR THE CALLER!!

clear_string(&c);

So what could this awesomeness look like!?

void  clear_string(char  **ptr) {
  free(*ptr);
  *ptr = NULL;
}

OH MAN THAT IS THE GREATEST SH**T EVER! Yea, yea; here is the key to double pointers and of course ANY pointers is knowing when and where to use them. I've seen double pointers used in some of the worst places ever; as well as pointers. Remember the less you deal with allocating the memory for the pointers; the less likely you are to create large memory leaks! And don't expect me to feel sorry for you if you try to create a "cool" API that takes in a double pointer and generates your multi-dimensional array and you end up loosing tons of memory. Remember, YOU NEED TO FREE EACH OF YOUR POINTERS THAT YOUR POINTERS REFER TOO!

My point of saying this isn't to scare you; but to ensure that you are thinking up a good use case for creating the double pointer!

DISCLAIMER
I am not going to cover single/double void pointers; if you want to use void pointers you need to learn how to do casting and you really need to figure out a way to determine what datatype you are dealing with in the void pointer. Why? Because you cannot dereference a void pointer! This means if you have void *ptr; with an array of int's you cannot say ptr[x]. Why? Because you cannot dereference the void *, because void has no size so there is no size for the offset of ptr. If you said ((int *)ptr)[x] then it will work because you've changed the data type from (void *) to (int *). We will cover pseudo-inheritance in C probably in the next post.

I find that I am doing better posting when I am drunk; I find that I am much more outgoing and much more social when I am drunk. As such, I'll be blogging while I'm (at least partially) drunk so that I can make the information fun and exciting! If you find that I am boring while I'm drunk; please let me know and I'll be sure to try harder on the next blog!

Friday, July 8, 2011

Pointers with *, &, ->, .

This is one of those things that never gets old; discussing pointers. With my heavy C background and my insatiable desire to program close to the hardware; I would say that I love pointers. But most people, when confronted with pointers, tend to loose interest and also their minds trying to understand what they mean. So here, I'm going to try my best at explaining pointers in a way that makes sense; for this I'll be explaining pointers from the reference of C.

First let's cover some background on variables, definition is pretty straight forward. Below you can see how to create an int variable.

int  x;

So let's get into the thick of it, what is a pointer? A pointer is a variable to a memory location! Wait, that's it? Is it possible that I am misconstruing what a pointer is? Not at all, instead the confusion comes when you begin explaining how to USE them. So let's see how to define a int pointer.

int  *x;

Yes, that is it, I've defined a pointer; the variable is interpreted as such; x is a variable that contains a memory location to an int. But wait, that cannot be it; people tell me that pointers are really difficult to use. Yes, it is a little bit harder, see you have to dereference the variable x to get the integer that is pointed to by the memory location stored in x. Ok, so now your eyes have glazed over right? You've decided to continue on to some other blog and see if someone else can explain it better but STOP and take a deep breath.

Let's look at a better example from our pointer variable x. Let's assume that we have a section of memory that is the memory representation of an int (4 bytes long).

 ---------------
| 4 bytes, woot |
 ---------------

Let's also put some byte numbers to this.

 ---------------
| 4 bytes, woot |
 ---------------
^ 0x1       0x5 ^

Ok, ok so what about our int *x? What the hell does that even mean and what is contained in there? Well, are you ready? It contains the address of our int (the 4 bytes above). So wait, x = 0x1? That's right! But now you have to "dereference" x. So bare with me; you have x which contains the memory location of our int. We need to get THE int. Therefore we need to get the data AT the memory location right? HA, that is dereferencing. So how do we do it?

*x=10;

No, it cannot be that simple... Yes, actually it is! The * tells the compiler to treat the variable as a pointer and to use the current variable as a memory location. So how do we know that it is an int? Aha! Because we defined it as an "int *" which means that inherently when we dereference x, *x must be an int. So now that we understand that the * indicates a reference and is used to dereference a pointer; what the hell is that ampersand '&' used for?

Well, the ampersand is used to indicate "reference of." Wait, I thought that was what * was used for. No, I said * was to indicate a variable is a reference or is used to dereference a reference. The & is used to say the memory location (reference) of a variable. Let's see this from our original variable x (I shall refresh your memory).

int  x;

See how x is just a nice happy variable? Not a pointer! So let's look at some more memory locations. It is important to note that since x is not a pointer, the memory layout below IS the x.

 -------------------
| 4 bytes, oh noes! |
 -------------------
^ x IS here

So when you specify &x, what are you saying? You're getting the memory location OF x! WHAT!?!?! That doesn't make any sense! how can you GET the memory location; yea I know you want to switch blogs again STOP and let's take a deep breath.

Remember that x IS a variable and therefore the variable MUST lie in a memory location. So when you do &x you are merely getting the memory location of that variable.

 -------------------
| 4 bytes, oh noes! |
 -------------------
^ 0x1           0x5 ^

So why would I want to do this? Because in order to create a pointer, it must reference a valid memory location. Check this out.

int  x;
int  *y;
y = &x;

OH MY GOD, MY HEAD EXPLODED! Nah, it's not that bad; until of course I tell you that *y and x are the same variable. If you don't understand that, then check it out, let's say that x is at location 0x1 (taking from our previous example). Then when we get the variable location 0x1 and store it into a variable that HOLDS the memory location (y), then when we dereference (there is that stupid ass word again!) *y we end up with an int at the location 0x1. OH SHIT, THAT IS THE LOCATION OF x!!!! GET SOME!

So what happens to x if I do *y = 22? Right, x = 22! Why is that? Because *y and x are THE SAME VARIABLE! I'll say it again if you want me to but I'm sure you're getting tire of me explaining it; so I'll say this if basic pointers don't make sense, you should start reading this over again (if you don't understand iteration, you will by the end of this).

Fantastic, so now we understand the basics of pointers. But what about those stupid ->'s and when can I use .'s? Ok, we're getting ahead of ourselves. So if you know about structs (if you're doing anything in C I'm making the assumption you've heard of them) but if not let's look at an example.

typedef struct str{
  int x;
}str;

So here we have a basic struct; essentially it's like a class with only public members (and no methods) in Java. If you don't know what a class or a struct is; you shouldn't be messing around with pointers. So let's look at referencing the variable x inside of an example str.

str s;
s.x = 10;

Well that wasn't too bad; you use the . operator to indicate a member of the str struct. But here is the thing; let's say that you wanted s to be a pointer. Why? Well, we'll get to that; but right now just follow (I haven't steered you wrong yet!). Let's assume that our variable s is declared as below.

str *s;

Then we want to access x. Ok, simple right, s.x; no... s->x; oh holy crap you're confusing me! Nah, it's really not that bad, s was a pointer right? So what made you think that you could just access x as if s was the variable!? Oh I get it, s is a pointer, which means we had to dereference it to get x. So that means that -> dereferences s and gets the variable x from the struct! Right! if you wanted to write it (and seem cool) you could do (*s).x but no one would hire you because that would be horrible to read everyday.

So wait, that's all the -> operator means? Yes! See, I told you pointers weren't that bad. But of course, there is the big question; how do I actually USE pointers? Dynamic array sizes. What does that have to do with anything? Easy, the function malloc() which allows you allocate a section of memory. Let's say I want to get an array of 2 integers.

int * x;
x = malloc(sizeof(int) * 2);

If you don't know malloc, you'll need to go check out the documentation but essentially we are just allocating the size of two integers (8 bytes) in memory and getting the memory location. Storing it into a pointer. What does this look like in memory?

 ------------------
| 8 bytes, badness |
 ------------------
^ 0x1          0x9 ^

And of course x holds 0x1. Now, since we allocated 2 ints it is important to note that THERE ARE TWO OF THEM. So how do we get them? i[0] references the first i[1] indicates the second. WAIT, huh? Ok, so i[0] dereferences i and then goes to an offset of 0. i[1] dereferences i and then goes to an offset of 1. So where does this offset come from? Easy, the offset is defined by the SIZE OF THE VARIABLE TYPE! Since our variable type is int, the size is 4 bytes; so i[1] will go to 0x1 + 4 = 0x5.

So can I name another reason to use them? Sure! If you wanted to modify a variable without having to return it. What? That makes no sense. Sure it does, let's say we want to take in a str variable (again stealing the definition from above).

void modify_str(str * s) {
  s->x = 10;
}

OH SHIT! That makes sense, yes! Is that awesome or what!? So here is the cool part; returning pointers. Example, let's say that we want to create a new str, and set the internal x to 1983 (my birth year because I'm picking random numbers).

str *new_str(int  i) {
  str *s = malloc(sizeof(str));
  s->x = i;
  return s;
}

OH DAMN! This makes initializations work! All of this is starting to make sense; why we would need to use pointers, how pointers work, and also how to effectively use pointers. Yes, and it really isn't as bad as many people tend to think. So here is the thing, pointers are pretty straight forward; double pointers and higher become much more abstract.

DISCLAIMER
I am not going to cover double pointers; but since you understand pointers now; just think of them as pointers to pointers (since that is what they actually are). And more importantly, you really won't use double pointers until you start needing to do pointer arithmetic in order to do iterations.

I'll say that I'll cover double pointers at a later time; but I have no guarantees that it will happen. If it does, hopefully I will be as fun and exciting as I have been so far! Just remember, pointers store memory locations of variables.

Wednesday, May 18, 2011

How can you not use <insert favorite IDE here>?!

This is one of those subjects that really bugs me is when people promote an IDE to the point that they are insisting that you are a complete failure for not using that IDE. So I realize that people always have an IDE that they really like. Mine is Eclipse for Java development; however, for C based applications I prefer to write in a text editor (vim is my poison). I have my own reasons, which I will outline below, for why I like these two setups.

Eclipse I like for Java style development, it has some really great auto-complete. However, I also like the simplicity for adding external jars and linking projects together as dependencies. Also there are quite a few good plugins for things like Ivy and Subversion that give tons of extra functionality. Why not Netbeans? I've never been impressed with the auto-complete and the auto-populate for import statements. Also I hate the fact that most class names will be fully qualified which bugs me (this may have been changed in a more recent version but this is how it was in a previous version).

Vim I like for C based applications because I've never found that any IDE that gets out of the way and has a good built in debugger. Frankly when I'm writing in C I really don't need that much help since with Vim I get CTags. This gives me the ability to investigate a function on the fly as well as investigating structs.

I realize that I am putting forth these statements and it sounds like I am breaking the rule that I put forward. Well not quite, I'm not stating that anyone else should work the same way that I work. Instead I am stating the reason(s) why I am doing something. The thing that I cannot stand is when I say "I use Eclipse for Java" and someone else goes "you should really use NetBeans, it's the awesomest and you are an idiot for not using it." I've also heard the argument "well you don't use an IDE anyway (for C development) so why not use something?" Why not? Because I'm more effective in vim, I have very few problems within my type safety so I don't need an IDE checking my type safety everytime I finish a word.

So what exactly is my point? Well, I'm not saying that you shouldn't propose someone using an IDE that you are used to; instead I'm stating that you can propose an IDE, just don't push that idea to death. It annoys me when I am working in a specific environment and doing well with it; and someone comes along and goes "well boys, we're going to change things up and go with this IDE, everyone should now be using it." Get off my back, I do more work in 2 days than you do in a week; maybe you should adopt my environment because it seems to work. Oh wait, you have problems working in my environment? Then why would you push me into an environment that I might have problems working in?

Sunday, April 24, 2011

How much is too much dereferencing?

This is mainly focused in any language where you can have composite classes or structures. Just to explain this, where you have classes inside of classes or structs inside of struct. Consider the following.

C Example with single inside struct


struct inside {
  int i;
};
struct outside {
  struct inside i;
};
int main(int argc, char *argv[]) {
  struct outside o;
  o.i.i = 10;
  return 0;
}

Does this really make sense when we look at

o.i.i

? If it does, let's look if we changed inside like the following.

C Example with deeper inside struct


struct deeperinside {
  int i;
};
struct inside {
  struct deeperinside i;
};

Now, we call the internal i as o.i.i.i which becomes pretty unruly. Obviously using the same variable name each time would be pretty terrible. But even changing the name doesn't always help. Let's see what happens if each element is actually a pointer.

C Example with deeper inside struct pointers


struct deeperinside {
  int i;
};
struct inside {
  struct deeperinside *i;
};
struct outside {
  struct inside *i;
};
...
struct outside *o;
...

So now we reference the int i from deeperinside as o->i->i->i This is completely unreadable, so let's see what happens if we at least add some casting.

C Example Access to deepinside variable


((struct deeeperinside *((struct inside *)(o->i))->i))->i

Although at least you can understand this, it becomes completely unreadable; so what is the answer here? How about we create some temp variables that can be used to store our inner variables?

C Example Access to deepinside variable through temp variables


struct inside *tempinside = o->i;
struct deeperinside *tempdeeperinside = tempinside->i;
tempdeeperinside->i

Notice that this becomes much more readable and understandable. This allows the people who will be maintaining the code much happier and faster to find/fix issues as they may arise. Now I've been focusing mainly in C but this also applies to languages such as Java. Why Java? Because A) it's the language that many programmers are programming in today; and B) it's much easier to encapsulate classes inside of classes (and is usually encouraged). By bringing up the idea of how we deal with accessors to these inside classes we can ensure that people at least understand how they can better display them to make them more manageable.

Another option in an actual OO language would be to encapsulate those objects, this in and of itself means that those o.i.i.i would turn into o.getI().getI().getI() which again makes no sense; are those composite classes (classes that encapsulate a class of the same type)?

Remember, making code reusable is good; making code understandable is much more important!

Friday, April 15, 2011

Do you like programming or YOUR programming?

This is a very interesting question that I've seen come up again and again; although I've never know the best way to express it until now. Out of all the languages listed below which ones can you not stand? If presented with that specific language to keep up; would you do it or would you quit? The question is, do you like Programming or do you like YOUR programming?

Perl


sub Example($) {
  print $_[0] . "\n";
}

Lisp


(defun Example (var)
  (print var))

C/C++


void Example(char *var) {
  printf("%s\n", var);
}

Java


class Temp {
  public void Example(String s) {
    System.out.println(s + "\n");
  }
}

If you looked at any of these languages and you said to yourself "I'd never program in that language" then you've just answered the question. Many of us developers are presented with a code base and are asked to make changes or to fix it. We normally don't have the ability to tell the client "well I can change it to language X in about 5 weeks if you'd like" because the client will say "what do I gain from this?" The answer is; well you get nothing extra but it will be easier to maintain. This then presents you with a big no because the client isn't going to pay you to rewrite an already running application into a language that you want (mainly because someone else will come along with this same statement).

We all have languages that we like; we all have languages that we're good at. This is true, but being scared of a specific language to the point that you would leave a job instead of program in it is just ludicrous. My point is not that you should seek out work in languages that you are not familiar with; but instead be open to working in different languages. The first step to working in an unfamiliar language is to find out how to do things that you would normally do in your known languages. NOTE: I do not condone going out and buying a book on the language and trying to cram it all in! Instead pick up on new ways of doing things as you are programming.

One of the big pitfalls that I've seen is someone coming into a team with programming knowledge. They then try to apply their previous programming experiences to a language that you might not know. Because this is the case, always listen to your teammates on how to better work in the language. Also, if you believe that your way of programming is getting ridiculed and you feel it is better than the way being suggested you should stand up for your way. Now be open to change and listen to why someone is making the argument that your way is bad/inefficient etc... Make sure to get them to explain why their way is better (and yes understandability is a feasible reason for example a .foreach() method rather than for(int...) {}).

So why bring this up? I have a coworker who is a DBA in MySQL who is going to be leaving. They have always been a huge proponent of DB2; always trying to get the entire organization to DB2. Now, the company I work for is not a small 50 person group; I work for a medium 500 person who just got bought out by a 100k person corporation. The problem with pushing these types of changes is that once you grow so large the company is not going to do a massive huge change just because someone feels that they can better manage a DB2 instance. This would require every application we use to be changed from MySQL to DB2 connectors which would be quite a large problem since almost every application we have developed is in C. We already have MySQL libraries that are well tested and work well; so having to switch to DB2 would be a huge change that the company was not wanting to risk.

So now my coworker is leaving because they didn't get their way. So how does this relate to my original example? Well think about this; it's not just dealing in programming where you shouldn't be afraid of working with new languages. Instead, you should always be open to work in new applications and expand your horizons. You come from a completely Windows shop and get hired as a Linux administrator. Don't be afraid, pick up that Linux administration book and install a copy at home. Play with it, break it, fix it, learn it. The way I see it, we as technology professionals should always be willing and eager to learn about new features in the technology field and not just limit our learning to ONLY our specific field. That's right, if your a Linux admin go learn something about networking or Windows administration. If you're a developer, go learn something about networking or host administration. If you're a DBA who works with DB2, go pick up a copy of Oracle or MySQL and see how they compare; maybe you'll find something interesting from that other piece of software that you would like better.

Thursday, April 14, 2011

Unit testing the proper way

Many times I've seen Unit tests that really don't help out. You will often find Unit tests that will quite often be like

public void TestNullInput() {

  MyObject obj = new MyObject();

  obj.functionThatShouldNotFail(null);

}

This really doesn't help that much; now it does need to be done especially in languages such as C or C++. You must always check what happens if you pass a function a NULL pointer. However, many times we see these that people will put in Unit tests for the most obvious tests. But what happens what you end up with the more non-obvious tests? Look at this example below and try to figure out what we're testing for.

public void TestInput() {

  MyString str = new MyString();

  str.parse("Hello|world!"); 

  Assert.assertEquals(str.parsedCharCount(), 

    "Hello|world!".length());

}

Well, it's really not that obvious at first; we just think oh why would we care about checking for the parsed amount of letters. Well here's the thing; let's assume that we're doing parsing and we need to do something special when we hit the '|' character. Let's assume that we are skipping it or something to that effect; why would we be looking for 12? We if we think about C/C++ (not so much managed languages but we'll assume that it might happen in their case as well) then it is very easy to run off the end of the array. So what we do is to count the number of characters that we are parsing; this allows us to check and ensure that we ONLY parsed the number of characters that was possible in the string (the max length of the input string).

The idea isn't just to throw out tests like this in every possible case before declaring that your code is ready. The point of bringing this up is that when you come across an error or a segmentation fault etc... there is always a way to write a Unit test to check for an error. Unit tests provide a way to detect bugs that may have regressed; and because we can detect them when they regress it can provide us a way to fix problems that we may have resent out to the consumer.

Thursday, February 24, 2011

Developing scared software

This is something that I've begin to notice and have a real problem with. Now, it may not be you but you know SOMEONE that has done this. A problem/issue arises or there is a deprecated library call that is being used. It is something that has been working for quite sometime but might be either holding back new features because it is no longer supported; or it's not broken enough to force someone to fix it. This is where the people come in and say "well this needs to be fixed" but they never make any changes to it.

Sometimes this is accompanied by a "it's in a library" or is followed up with "we'll fix it at some point in the future when we have to." So why is this an issue with me? Easy, it's a lame excuse; if you see that there is a problem YOU, as the developer, should be man enough to step up and fix the issue that is starting at you. Yes, I do understand that there are processes and bugs that need to be prioritized; but if you see that it is an issue then you should make the case to find time to fix it.

I've seen this from a few different developers and it kills me when I hear it. "Oh yea, I'd love to replace that." Really? If you would love to do it, then do it. Especially if you see the problem and you know what the acceptable fix is; this means that you've already thought out a plan to fix the issue and as such you should put your fingers to work fixing the issue. Your users may not thank you, but your peers will once they see that you're working to make the product easier (hopefully with your fix) to maintain.

The reason that I call this scared software is that the software is coded, almost on eggshells, in a way that no one really wants to make big changes to it. Thus, to get a new feature (or to perform a bug fix) many people will end up writing tons more code to work around the issues that no one wants to fix. The solution is simple; if you see an issue or see an improvement, do it.

Tuesday, February 22, 2011

Bugs are a part of life; and users will hate you regardless

Something that I consistently have issues with is the support (or lack there of) for development groups by the end users. It is consistent that I hear complaints about products, and of course I'm not just speaking directly towards MY product, that it's "unstable" or "unusable" or that it is a "worthless piece of garbage." Many of these I've heard for software such as Microsoft Windows; or some other piece of software that has a tendency to crash. The product that I am currently working on is a fairly stable product although it has its own issues. Now I can understand and even empathize with crashes that can occur from day-to-day usage of software. People bring up applications such as Apache, to which I pleasantly remind them that Apache does a fork() and creates new processes constantly to handle requests. If a fork crashes, it continues on its merry way; with a GUI based application there really isn't a whole lot to fall back on which is why you see "death" if something bad occurs.

So what do I mean when I say that I empathize? It means, I'm sorry you found an issue and I realize that it is a problem. If you can put in a bug within our bug tracking system, I will be more than happy to track it down and fix it in a future build. Now this ends up with 3 scenarios; the first is the happy scenario where the user creates the bug and you fix it and they are happy. The 2nd scenario is the more likely; something else more pressing comes up (maybe they never crashed in the same manner again, but some other fire is burning and you need to throw some water on it). At which point the user comes back 3 months later and goes, "what's up with this bug? You asked me to put it in and you haven't fixed it."

Now you could have the best come back ever with "well, an issue was found that was deleting everything from our data store and I had to work to get that fixed which cropped up 10 other issues and in each case again stuff was being deleted from random data stores." Your user will still complain that you have not fixed the issue that THEY put in. I've tried this in multiple ways where instead I would create the bug; but that just seems to exacerbate the issue because the user will come back and complain that you didn't enter the information correctly and that the issue THEY saw was not the issue that you KNOW is there.

Now there is the 3rd scenario which is also more likely than the 1st where you fix the issue. Let's say that there is another 2 week ~ 2 months before a release and the user comes back and asks you if the issue is fixed; you will then explain that it is scheduled to be released. The user then walks away feeling cheated because it has been "fixed" but yet they see NO output; only that their bug has been "fixed." The user, from that point on, begins to get weary about submitting bugs since they don't actually see these fixes and by the time there is a release they have already forgotten about it. This is the one that I see the most; and as time goes on, the users usually become disappointed with the software since they consistently see bugs but don't really get instant gratification that the bug was fixed.

It doesn't matter how many bugs you fix; or how difficult the bugs are that you have fixed. You will always be known as the person who fixes the stuff that they broke to begin with. The only real way to help you to save the relationships with the clients is to make sure you are talking with your end users as often as you can. Finding out how the end users are working with the software, determining whether they are having issues with the software and getting feedback. The best way to determine how usable an application is, is to actually use the software. This is extremely difficult in many circumstances if the software that you are writing requires a high skill level to effectively use. The consumers of our product are GCIA certificate Security Analysts; it would be difficult for every developer (who did not have a security background) to actually go and analyze traffic logs.

The other side is to hide bug tracking as much as possible; there are always complaints about the way the software works; but things like crashes should have automated reporting. This shields the consumers from actually having to create the bug and become invested into a bug (whereby they will care about the state of that specific bug) and thus have no expectations of when that bug might be fixed. You can see this during crash reporting in Windows or KDE where we see the crash reporter popup asking if we want to submit the crash report to Microsoft or KDE respectively. This allows the developers to look into the issues from a completely object standpoint rather than having a bug which was written by an angry user who had been writing a document for a client which then crashed right before they saved which says

Hey, your crap software crashed when I was writing my document; thanks for wasting 2 hours of my time! Fix it!

Saturday, February 5, 2011

Development Within a VM

I know what you're going to say, this post is so lame because we all know that virtualization is the only way to develop. But actually this is not what I am going to cover. Instead, I'm going to cover actually doing development within a VM environment. Doing the coding and building within that VM environment.

So here is the question; why would I want to do the actual development within a VM? The obvious draw to VM's is to allow a developer to simulate multiple machines at the same time which can then be reverted to a previously working state. So why do the development in a VM? Well pretty much for the same reason that one would do testing and QA in a VM environment.

For example, let us assume that we have a 1GB VM in Linux (mainly because this is what I program in). Let's also assume that I keep my E-Mail and IM in my underlying OS (let's assume that this is Windows) because I love the clients. So the question is why would I do this? The answer is pretty subtle; if you think about what your IDE and compiler can do for you and also knowing that there are tons of issues that you can experience with both of these. For example, if you do an upgrade you aren't going to be rolling back (mainly because this isn't an option) assuming that possibly an upgrade broke your library.

Let's take another example though; what would happen if your IDE went nuts and began hanging (let's assume that it's trying to download your entire SVN code repository or that it's trying to download every bug you've ever worked on). Or maybe your bug tracking software runs extremely slowly; this causes your IDE to hang everytime it goes to update it's local bug store. Well, if you have your development environment contained within a VM, there is no need to worry that all is lost. In my instance, I would still have access to my E-Mail/IM and would not be completely useless until my IDE responded.

Currently I am running the new VirtualBox, and it has fantastic implementations within Linux and Windows and works pretty well. I'm going to be doing development within VM's for a while and will be doing it at work as well. Now, if you have a separate box that you connect to for your development needs; then this really doesn't apply to you. But it is pretty close to what using a VM for development is.

Code Monkey

My Pages