My Pages

Sunday, April 24, 2011

How much is too much dereferencing?

This applies to any language where you can have composite classes or structures; that is, classes inside of classes or structs inside of structs. Consider the following.

C Example with single inside struct



struct inside {
    int i;
};
struct outside {
    struct inside i;
};

int main(int argc, char *argv[]) {
    struct outside o;
    o.i.i = 10;
    return 0;
}



Does this really make sense when we look at o.i.i? If it does, let's see what happens if we change inside as follows.

C Example with deeper inside struct



struct deeperinside {
    int i;
};
struct inside {
    struct deeperinside i;
};


Now we access the innermost i as o.i.i.i, which becomes pretty unruly. Obviously using the same variable name at each level is pretty terrible, but even changing the names doesn't always help. Let's see what happens if each element is actually a pointer.

C Example with deeper inside struct pointers



struct deeperinside {
    int i;
};
struct inside {
    struct deeperinside *i;
};
struct outside {
    struct inside *i;
};
...
struct outside *o;
...


So now we reference the int i from deeperinside as o->i->i->i. This is already hard to read, so let's see what happens if we at least add some casting to make the types explicit.

C Example Access to deepinside variable



((struct deeperinside *)(((struct inside *)(o->i))->i))->i


Although the casts at least make the types explicit, the expression is now completely unreadable; so what is the answer here? How about we create some temporary variables to store our inner pointers?

C Example Access to deepinside variable through temp variables



struct inside *tempinside = o->i;
struct deeperinside *tempdeeperinside = tempinside->i;
int value = tempdeeperinside->i;


Notice that this is much more readable and understandable. It makes the people who will be maintaining the code much happier, and lets them find and fix issues faster as they arise. Now, I've been focusing mainly on C, but this also applies to languages such as Java. Why Java? Because A) it's a language that many programmers are programming in today; and B) it's much easier to encapsulate classes inside of classes (and is usually encouraged). By bringing up the question of how we deal with accessors to these inner classes, we can ensure that people at least understand how to lay them out in a more manageable way.

Another option in an actual OO language would be to encapsulate those objects behind accessors, but by itself that just turns o.i.i.i into o.getI().getI().getI(), which again makes no sense; are those composite classes (classes that encapsulate a class of the same type)? The same temporary-variable approach works here too, as shown below.
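As a quick sketch (the class and getter names here are hypothetical, purely for illustration), the Java equivalent of the temp-variable approach looks like this:

Java Example Access through temp variables

class DeeperInside {
    private int i = 10;
    public int getI() { return i; }
}

class Inside {
    private DeeperInside deeper = new DeeperInside();
    public DeeperInside getDeeper() { return deeper; }
}

class Outside {
    private Inside inside = new Inside();
    public Inside getInside() { return inside; }
}

class Demo {
    public static void main(String[] args) {
        Outside o = new Outside();
        // Instead of o.getInside().getDeeper().getI(), name each step:
        Inside inside = o.getInside();
        DeeperInside deeper = inside.getDeeper();
        int value = deeper.getI();
        System.out.println(value);
    }
}

Giving each level its own name documents what each object is, exactly as the C temp variables did.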

Remember, making code reusable is good; making code understandable is much more important!

Friday, April 15, 2011

Do you like programming or YOUR programming?

This is a very interesting question that I've seen come up again and again, although I've never known the best way to express it until now. Out of all the languages listed below, which ones can you not stand? If you were handed that specific language to keep up, would you do it or would you quit? The question is: do you like programming, or do you like YOUR programming?

Perl



sub Example($) {
    print $_[0] . "\n";
}


Lisp



(defun Example (var)
  (print var))


C/C++



void Example(char *var) {
    printf("%s\n", var);
}


Java



class Temp {
    public void Example(String s) {
        System.out.println(s); // println already appends the newline
    }
}


If you looked at any of these languages and said to yourself "I'd never program in that language," then you've just answered the question. Many of us developers are handed a code base and asked to make changes or fix it. We normally don't have the ability to tell the client "well, I can convert it to language X in about 5 weeks if you'd like," because the client will ask "what do I gain from this?" The answer is: nothing extra, it will just be easier to maintain. That gets you a big no, because the client isn't going to pay you to rewrite an already running application into a language that you happen to prefer (mainly because someone else will come along later with this same request).

We all have languages that we like, and we all have languages that we're good at. That's fine, but being so scared of a specific language that you would leave a job instead of programming in it is just ludicrous. My point is not that you should seek out work in languages you are not familiar with, but that you should be open to working in different languages. The first step to working in an unfamiliar language is to find out how to do the things you would normally do in the languages you know. NOTE: I do not condone going out, buying a book on the language, and trying to cram it all in! Instead, pick up new ways of doing things as you program.

One of the big pitfalls I've seen is someone coming onto a team with prior programming knowledge and trying to apply their previous experience to a language they don't really know. Because this happens, always listen to your teammates on how to work better in the language. At the same time, if your way of programming is being ridiculed and you genuinely believe it is better than the suggested way, stand up for it. Be open to change, and listen to why someone is arguing that your way is bad or inefficient; make sure they explain why their way is better (and yes, understandability is a perfectly good reason, for example a .foreach() method rather than for(int...) {}; see the sketch below).
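To make that last example concrete (a minimal sketch using Java's for-each loop as the .foreach() analogue; the list contents are made up), here are both loop styles printing the same list:

Java Example foreach versus indexed for

import java.util.Arrays;
import java.util.List;

class LoopDemo {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Carol");

        // Index-based loop: correct, but the index is pure bookkeeping.
        for (int i = 0; i < names.size(); i++) {
            System.out.println(names.get(i));
        }

        // For-each loop: states the intent directly.
        for (String name : names) {
            System.out.println(name);
        }
    }
}

Both produce identical output; the argument for the second form is purely about how quickly a reader understands it.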

So why bring this up? I have a coworker, a MySQL DBA, who is going to be leaving. They have always been a huge proponent of DB2, always trying to move the entire organization to DB2. Now, the company I work for is not a small 50-person group; I work for a medium 500-person company that was just bought by a 100k-person corporation. The problem with pushing these kinds of changes is that once you grow that large, the company is not going to make a massive change just because one person feels they could better manage a DB2 instance. It would require every application we use to be switched from MySQL to DB2 connectors, which would be a huge undertaking since almost every application we have developed is in C. We already have MySQL libraries that are well tested and work well, so switching to DB2 was a risk the company was not willing to take.

So now my coworker is leaving because they didn't get their way. How does this relate to my original example? Think about it this way: it's not just in programming that you shouldn't be afraid of new languages. You should always be open to working with new applications and expanding your horizons. You come from a completely Windows shop and get hired as a Linux administrator? Don't be afraid; pick up a Linux administration book and install a copy at home. Play with it, break it, fix it, learn it. The way I see it, we as technology professionals should always be willing and eager to learn about new developments in the technology field, and not limit our learning to ONLY our specific specialty. That's right: if you're a Linux admin, go learn something about networking or Windows administration. If you're a developer, go learn something about networking or host administration. If you're a DBA who works with DB2, go pick up a copy of Oracle or MySQL and see how they compare; maybe you'll find something in that other piece of software that you like better.

Thursday, April 14, 2011

Unit testing the proper way

Many times I've seen unit tests that really don't help out. You will quite often find unit tests like this:

public void TestNullInput() {
    MyObject obj = new MyObject();
    obj.functionThatShouldNotFail(null);
}


This really doesn't help that much. Now, it does need to be done, especially in languages such as C or C++; you must always check what happens if you pass a function a NULL pointer. But many times people will only write unit tests for the most obvious cases. What happens when you need the more non-obvious tests? Look at the example below and try to figure out what we're testing for.

public void TestInput() {
    MyString str = new MyString();
    str.parse("Hello|world!");
    Assert.assertEquals(str.parsedCharCount(), "Hello|world!".length());
}


It's really not that obvious at first; we just think, why would we care about checking the number of parsed letters? Here's the thing: let's assume we're parsing and need to do something special when we hit the '|' character, say skipping it or something to that effect. Why would we be looking for 12, the length of "Hello|world!"? Well, if we think about C/C++ (not so much managed languages, but we'll assume it might happen there as well), it is very easy to run off the end of the array. So we count the number of characters that we parse; this lets us check that we ONLY parsed the number of characters that were actually in the string (the maximum length of the input). A sketch of such a parser follows.
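As a hedged sketch (MyString, parse, and parsedCharCount are hypothetical stand-ins matching the test above), the counting parser might look like this:

Java Example counting parser

// Hypothetical implementation of the MyString used in the test above.
class MyString {
    private int parsedCharCount = 0;

    public void parse(String input) {
        for (int pos = 0; pos < input.length(); pos++) {
            char c = input.charAt(pos);
            if (c == '|') {
                // Special handling for the delimiter: skip it,
                // but still count it as a consumed character.
                parsedCharCount++;
                continue;
            }
            // Normal character handling would go here.
            parsedCharCount++;
        }
    }

    public int parsedCharCount() {
        return parsedCharCount;
    }
}

If a bug ever made the parser consume past the end of the input (or double-count around the delimiter), the count would no longer equal the input length and the test would catch it.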

The idea isn't to throw tests like this at every possible case before declaring your code ready. The point is that when you come across an error, a segmentation fault, etc., there is always a way to write a unit test that checks for it. Unit tests give us a way to detect bugs that have regressed, and because we can detect them when they regress, they let us fix problems before we send them back out to the consumer.

Thursday, February 24, 2011

Developing scared software

This is something that I've begun to notice and have a real problem with. Now, it may not be you, but you know SOMEONE who has done this. A problem or issue arises, or a deprecated library call is being used. It is something that has been working for quite some time, but it might be holding back new features because it is no longer supported, or it's not broken enough to force someone to fix it. This is where people come in and say "well, this needs to be fixed," but they never make any changes.

Sometimes this is accompanied by "it's in a library," or followed up with "we'll fix it at some point in the future when we have to." So why is this an issue for me? Easy: it's a lame excuse. If you see that there is a problem, YOU, as the developer, should step up and fix the issue that is staring at you. Yes, I understand that there are processes, and that bugs need to be prioritized; but if you see that something is an issue, you should make the case to find time to fix it.

I've seen this from a few different developers, and it kills me when I hear it. "Oh yeah, I'd love to replace that." Really? If you would love to do it, then do it. Especially if you see the problem and know what the acceptable fix is; that means you've already thought out a plan to fix the issue, so you should put your fingers to work fixing it. Your users may not thank you, but your peers will once they see that you're working to make the product easier to maintain (hopefully with your fix).

The reason I call this scared software is that everyone treats the code as if walking on eggshells; no one really wants to make big changes to it. Thus, to add a new feature (or fix a bug), many people end up writing tons more code to work around the issues that no one wants to fix. The solution is simple: if you see an issue or see an improvement, make it.

Tuesday, February 22, 2011

Bugs are a part of life; and users will hate you regardless

Something that I consistently have issues with is the support (or lack thereof) that development groups get from end users. I consistently hear complaints about products (and of course I'm not just speaking about MY product) being "unstable," "unusable," or "a worthless piece of garbage." I've heard many of these about software such as Microsoft Windows, or some other piece of software that has a tendency to crash. The product I am currently working on is fairly stable, although it has its own issues. Now, I can understand and even empathize with crashes that occur from day-to-day usage of software. People bring up applications such as Apache, to which I pleasantly remind them that Apache calls fork() and creates new processes constantly to handle requests; if a child process crashes, the server continues on its merry way. With a GUI-based application there really isn't a whole lot to fall back on, which is why you see "death" when something bad occurs.

So what do I mean when I say that I empathize? I mean: I'm sorry you found an issue, and I realize that it is a problem. If you file a bug in our tracking system, I will be more than happy to track it down and fix it in a future build. Now, this ends in one of three scenarios. The first is the happy one: the user creates the bug, you fix it, and they are happy. The second is more likely: something more pressing comes up (maybe they never crashed in the same way again, but some other fire is burning and you need to throw water on it). Then the user comes back 3 months later and goes, "What's up with this bug? You asked me to put it in and you haven't fixed it."

Now, you could have the best comeback ever: "well, an issue was found that was deleting everything from our data store, and I had to get that fixed, which turned up 10 other issues, and in each case stuff was again being deleted from random data stores." Your user will still complain that you have not fixed the issue that THEY put in. I've tried approaching this in multiple ways, such as creating the bug myself; but that just seems to exacerbate the issue, because the user will come back and complain that you didn't enter the information correctly and that the issue THEY saw was not the issue that you KNOW is there.

Then there is the third scenario, also more likely than the first, where you do fix the issue. Say there are another 2 weeks to 2 months before a release, and the user comes back and asks whether the issue is fixed; you explain that the fix is scheduled for the next release. The user walks away feeling cheated: the bug has been "fixed," yet they see NO result, only that their bug has been marked "fixed." From that point on, the user becomes wary of submitting bugs, since they never actually see the fixes, and by the time there is a release they have already forgotten about it. This is the scenario I see the most; and as time goes on, users usually become disappointed with the software, since they consistently see bugs but never get instant gratification that a bug was fixed.

It doesn't matter how many bugs you fix, or how difficult those bugs were. You will always be known as the person who fixes the stuff that you broke to begin with. The only real way to preserve the relationship with clients is to talk with your end users as often as you can: find out how they are working with the software, determine whether they are having issues, and get feedback. The best way to judge how usable an application is, is to actually use it. This is extremely difficult when the software you are writing requires a high skill level to use effectively. The consumers of our product are GCIA-certified Security Analysts; it would be difficult for every developer (without a security background) to actually go and analyze traffic logs.

The other side is to hide bug tracking as much as possible. There will always be complaints about how the software works, but things like crashes should have automated reporting. This shields consumers from having to create a bug and become invested in it (whereby they will care about the state of that specific bug), so they have no expectations about when it might be fixed. You can see this in crash reporting in Windows or KDE, where a crash-reporter popup asks if we want to submit the report to Microsoft or KDE respectively. This lets developers look into issues from a completely objective standpoint, rather than working from a bug written by an angry user whose document, written for a client, crashed right before they saved, and which says:

Hey, your crap software crashed when I was writing my document; thanks for wasting 2 hours of my time! Fix it!

Saturday, February 5, 2011

Development Within a VM

I know what you're going to say: this post is so lame, because we all know that virtualization is the only way to develop. But that is actually not what I am going to cover. Instead, I'm going to cover doing development within a VM environment: doing the coding and the building inside that VM.

So here is the question: why would I want to do the actual development within a VM? The obvious draw of VMs is to let a developer simulate multiple machines at the same time, which can then be reverted to a previously working state. So why do development in a VM? Pretty much for the same reason one would do testing and QA in a VM environment.

For example, let us assume that we have a 1GB Linux VM (mainly because Linux is what I program in). Let's also assume that I keep my E-Mail and IM in my underlying OS (let's say Windows) because I love those clients. So why would I do this? The answer is pretty subtle: think about everything your IDE and compiler do for you, and about the tons of issues you can experience with both. For example, if an upgrade breaks a library, you aren't going to be rolling back on a physical machine (mainly because that isn't an option); with a VM, you can revert to a working snapshot.

Let's take another example. What would happen if your IDE went nuts and began hanging (say it's trying to download your entire SVN code repository, or every bug you've ever worked on)? Or maybe your bug-tracking software runs extremely slowly, causing your IDE to hang every time it goes to update its local bug store. Well, if your development environment is contained within a VM, there is no need to worry that all is lost. In my case, I would still have access to my E-Mail/IM and would not be completely useless while waiting for my IDE to respond.

Currently I am running the new VirtualBox, which works well on both Linux and Windows. I'm going to be doing development within VMs for a while, and will be doing it at work as well. Now, if you already connect to a separate box for your development needs, then this really doesn't apply to you; but that setup is pretty close to using a VM for development anyway.

Wednesday, November 24, 2010

Dynamic versus Static Scoping

Dynamic and static scoping both have specific places in programming. We're going to discuss what the difference is and why it can make a difference in your program.

What is scoping?
When we talk about scoping, we are mainly concerned with the bindings of variables; in other words, where they have been defined. In some languages, such as Perl, there is no actual need to declare a variable, but that is a discussion for a different day. Regardless, a variable's binding establishes that it is a variable and what type it is. Below you can see the bindings of the variables x and y as an integer and a float respectively.

int x;
float y;

x = 10;
y = 3.14;


When we talk about scoping here, we are specifically talking about scoping within higher-order functions, closures, and lambda functions. Why? Because this is where scoping really matters. In a language such as C, for example, there is no point in talking about static versus dynamic scoping, because everything is static. It will make more sense once we dig in. Essentially, a variable's binding decides which declaration a given variable name refers to; for example, above we KNOW that x refers to int x.

Static Scoping
This is the scoping style that most people are used to; it is the standard. So let's look at an explanation: essentially, a free variable (in a closure or higher-order function) is bound to the closest binding at the point where the function is defined. An example is shown below.

var x=20;
var newfunc = f();
newfunc();

function f() {
  var x = 10;
  return function s() { print x; };
}


So, what prints out? You may or may not be surprised to find out that "10" will be printed. Why? Because the x that is bound is the x from within f(); the variable is bound statically, usually at compile time, at the point where the function is defined.

Dynamic Scoping
This is the scoping style that most people are not used to, and it can sometimes produce really strange results. So let's look at an explanation: essentially, a free variable (in a closure or higher-order function) is bound to the closest binding in the activation records at run time. So what does this mean? The variable is bound to whatever the closest binding of that name is when the function is executed.

function d() {
  var x=20;
  var newfunc = f();
  newfunc();
}

var x=30;
d();

function f() {
  var x = 10;
  return function s() { print x; };
}


In the example shown above, one may be surprised to see that the output is "20"; but why? The x is a free variable that must be bound at runtime, so when we actually execute newfunc() we look back through the activation records until we find an x that can be used. We look at the AR for newfunc but find no x. We then check the AR for d and find a local variable x defined as 20. This becomes the binding for our x. Below you can see one more example.

function d() {
  var newfunc = f();
  newfunc();
}

function n() {
  var x=30;
  d();
}

n();

function f() {
  var x = 10;
  return function s() { print x; };
}


In this instance the output changes again; now we see "30." Why? Let's step through it. We execute newfunc() and see the free variable x that we need to bind. We check the AR of newfunc but find no definition of x. We then look in the AR of d; again, no definition of x, so we continue up the ARs to n. Here we finally find an x, defined as 30, which we then use.

Let's look at an example of both in Perl: static scoping is done via the my operator, and the local operator provides dynamic scoping. Please note that under "use strict", using the local operator on an undeclared variable will cause an error saying the variable is not declared. Let's look at both examples.

Static
The output is "10", again, notice that the $x is bound to 10 from when the Lambda function is created.

sub f() {
  my $x = 10;
  return sub { print $x . "\n"; };
}

my $func = f();
my $x = 20;
&{$func}();


Dynamic
The output is "20", again, notice that the $x is bound to 20 from when the Lambda function is actually executed.

sub f() {
  local $x = 10;
  return sub { print $x . "\n"; };
}

my $func = f();
local $x = 20;
&{$func}();


Why is this useful?
Well... Let's be honest, essentially it's like passing a variable to a function. However, you could set it up so that you could dynamically change a function's internal variable if all variables were free dynamically scoped variables. This would allow for other functions to change the meanings of the functions without having to pass all the variables in. In my experience this is really not that useful. But knowing what the difference is and knowing how to tell which mode a variable is assigned to (in languages like Lisp and Perl that support both) is important. Or if you end up having to maintain a piece of code that uses dynamic scoping, you will know what exactly it means and the side effects of them.