My Pages

Thursday, May 22, 2014

The Next Java

One question that I've been asking myself over the past few years is "What comes after Java?" The notion that Java, or any JVM language for that matter, will be the go-forward language forever is incorrect. Instead, I believe that Java is the new C: it is the most widely adopted language among developers across the world. Although many people continue to write new software in C, Java has taken a steadily larger chunk of the market share. But to see where we are going, we should look backwards.

Decades ago, the big language was COBOL; up until the mid-90's I still knew of large corporations and governmental facilities that were using those old green-screen systems. It was easy for the leaders to say "it still works, so why change it?" And they were right: hiring C developers meant getting rid of all your old COBOL programmers or retraining them. The former is a very poor political stance and the latter a very expensive one. So why did they eventually make the change? Because the COBOL programmers retired, or learned new things and stopped being available to program in COBOL. So the rates for the talented and knowledgeable COBOL developers went extremely high ($400/hr in the mid-90's was a pretty good rate).

The same thing is beginning to happen with C developers: the vast majority of developers do not want to program in C. It is a very difficult language to debug and is extremely verbose. Memory leaks accumulate from years of poor code, and tracking them down is tedious and time consuming. Newer developers want to develop in Java, primarily because of its garbage collection, built-in object orientation, and numerous other niceties over C.

Did you notice how I left off C++? That was intentional. I don't really consider C++ to be the language that all programmers converted to from C, primarily because most developers didn't actually use C++ how it was intended to be used; they kept writing their C code and just compiled it as C++. They did this because "if someone wants to take advantage of the new OO features they can." But the vast majority of developers didn't; they wrote their C style of code and went on about their business. In my opinion, the reason they didn't is that the language didn't enforce these concepts on the developer. There was no "everything must be OO," and because of that, there was no reason to go and learn these concepts.

So now let's take this concept and apply it to Java. Java 8 is out now, and people are writing and releasing books on it. But let's not be hasty; I've been interested in how this will play out, and I think that it will be the C++ of this generation. My feeling is based on how I've seen other developers react and on the fact that Java 8 does not enforce any of the new concepts. Given this, let's look at some comparisons.
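To make that concrete (a hypothetical illustration, not from the original post): just as C compilers happily accepted plain C, nothing in Java 8 forces a developer to touch lambdas or streams. Both styles below compile and produce the same result.

```java
import java.util.Arrays;
import java.util.List;

public class Adoption {
    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3);

        // Pre-Java 8 style: an explicit loop still compiles and runs fine
        int oldSum = 0;
        for (Integer n : nums) {
            oldSum += n;
        }

        // Java 8 style: streams and lambdas, entirely optional
        int newSum = nums.stream().mapToInt(Integer::intValue).sum();

        System.out.println(oldSum + " " + newSum);
    }
}
```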

Native vs. JVM Languages

Native | JVM
C | Pre-Java 8
C++ | Java 8
Ruby | Groovy
LISP | Clojure
Erlang/Haskell | Scala

Languages Released

So let's look at the C programming languages (C and C++) and when they were released.

C | 1972
C++ | 1983

Now let's look at the Java programming languages (let's say Java 5, where major adoption started, and Java 8) and when they were released.

Pre-Java 8 | 2004
Java 8 | 2014

Maybe this correlation is a little premature, but notice that C++ came out 11 years after C, and Java 8 came out 10 years after Java 5. Java 1.0 was released in 1996, which puts it 13 years after C++. If we assume that history repeats, then we can expect the new programming language to arrive around 2026. So at this point we should all start looking out for the new programming language that will eventually overtake Java.

The Next Java

  • Similar syntax to Java and C
  • Object Oriented
  • Garbage Collection as a first class concept (rather than a threaded component)
  • Generics as a first class concept (no type erasure like Java has now)
  • Uses the Actor paradigm for threading; the language might be completely based on it
  • Functions as first class objects
  • Return to native code

I think the point many people are going to feel is wrong is the return to native code. I'm not saying that code will have to be compiled separately for different machines again. Instead, plugins to the kernel will allow for running the code directly. The kernel will understand how to execute the binary (much how the Linux kernel executes ELF binaries), and this plugin would be the thing that handles memory management for the user. At this point, there would be no "VM" to tune; instead it would be a return to applications running natively on the system. No more tuning JVM memory sizes, no more extra threads for handling GC.

Who knows, maybe the JVM will be the first to take advantage of the kernel plugin, and the kernel plugin concept will be the thing that introduces "The Next Java."

Tuesday, December 10, 2013

New Paradigm of Issue Workflow

The big rage right now is the Agile development process, which I do like. What I don't like is the inadequacy of issue states. This isn't just a problem within Agile, but one that plagues every workflow. Is an issue done? Is it done done? Is it verified? How about deployed? What does it mean to reopen an issue? I face many of these questions on a daily basis, not just from others, but from myself as well.

Somewhere, in a dungeon at our office, is a printed-out spreadsheet explaining each of our 50 different states an issue can be in. I've also been lucky enough to try out the GreenHopper Simplified Workflow, which reduces the number of states dramatically. But it still doesn't capture enough states of an issue.

Basic States

State | Description
Open | Nobody has started working on this issue
Reopened | The issue was done, but was not completed properly
In Progress | Someone is currently working on this issue
Resolved | Coding has been completed and the issue is done
Closed | This issue was not actually an issue and has been closed

And this is fine as a starting point, but I don't believe it actually encapsulates an entire development group's effort. Let's start with a basic workflow: a developer picks up an issue, works it, then passes it to QA. QA tests it and sets it up for a UAT, after which the UAT is approved and the issue is completed. Once the issue is completed, it is pushed to staging/pilot/production. Let's think about the possible states here.

Workflow separating QA and Development

State | Description
Developer Todo | A developer needs to pick this issue up to work on it
Developer In Progress | A developer is currently working on this issue
QA Todo | A QA needs to pick this issue up to work on it
QA In Progress | A QA is currently working on this issue
Needs UAT | Code and testing has been completed, this issue now needs a UAT
Completed | The UAT was successful and has been approved by the product owner; at this point this issue is done

Let's think about another workflow that might happen. Assume that a developer picks up an issue and works it, but gets blocked on some other issue. He eventually gets unblocked and completes the issue, sending it over to QA. Now QA gets blocked on another problem. After getting unblocked, they push it over for a UAT, which gets approved, and the issue is pushed.

Separated Workflow with blocked state

State | Description
Developer Todo | A developer needs to pick this issue up to work on it
Developer In Progress | A developer is currently working on this issue
Developer Blocked | This issue is waiting on resolution of another issue
QA Todo | A QA needs to pick this issue up to work on it
QA In Progress | A QA is currently working on this issue
QA Blocked | This issue is waiting on resolution of another issue
Needs UAT | Code and testing has been completed, this issue now needs a UAT
Completed | The UAT was successful and has been approved by the product owner; at this point this issue is done

Adding Closed States

But wait, there are a few different end states and outcomes for a specific issue; let's list a few different outcomes.

  • Issue was deployed to production
  • Issue was transitive (fixed by a different issue)
  • We understand that this is an issue, but there are no plans to fix it
  • This is actually how the product works, so it is not an issue

Let's address these issues one at a time.

Issue was deployed into production

This is obviously the state we want all of our issues to reach. This, to me, is the definition of Completed.

Issue was transitive (fixed by a different issue)

This is a special bucket where no code change of its own fixed the issue; it was resolved as a side effect of another issue. So we should have another state of Closed where we can indicate why we did not actually complete it.

We understand that this is an issue, but there are no plans to fix it

This is one of those states that I hate to admit exists: the "Won't Fix" syndrome. But here is the thing: we shouldn't have a separate state for that; instead we should reuse our Closed state, indicating why we didn't complete it. Communication is key, and this forces us to explain why we aren't fixing it rather than just saying "no, we're not fixing it."

This is actually how the product works, so it is not an issue

This is a state in which we have to say "this is how this feature was designed" and I think that this falls into the previous section as well. Let's reuse our Closed state and indicate why we didn't complete it.

State | Description
Developer Todo | A developer needs to pick this issue up to work on it
Developer In Progress | A developer is currently working on this issue
Developer Blocked | This issue is waiting on resolution of another issue
QA Todo | A QA needs to pick this issue up to work on it
QA In Progress | A QA is currently working on this issue
QA Blocked | This issue is waiting on resolution of another issue
Needs UAT | Code and testing has been completed, this issue now needs a UAT
Completed | The UAT was successful and has been approved by the product owner; at this point this issue is done
Closed | No code was changed in this issue, and no further action will be taken on it.

Transitions

The idea is that an issue can move between any two states at any time, except from a closed state. An issue cannot be reopened. But why not? Why shouldn't we be able to reopen an issue that was closed?
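That transition rule is simple enough to encode directly. Here is a minimal sketch in Java; the class and method names are hypothetical, and the states come from the tables in this post.

```java
public class Workflow {
    // Hypothetical model of the nine states from the tables in this post
    enum State {
        DEVELOPER_TODO, DEVELOPER_IN_PROGRESS, DEVELOPER_BLOCKED,
        QA_TODO, QA_IN_PROGRESS, QA_BLOCKED,
        NEEDS_UAT, COMPLETED, CLOSED
    }

    // An issue may move between any two states, except out of Closed;
    // "reopening" a Completed issue is handled by creating a new issue instead
    static boolean canTransition(State from, State to) {
        return from != State.CLOSED;
    }

    public static void main(String[] args) {
        System.out.println(canTransition(State.QA_BLOCKED, State.DEVELOPER_TODO));
        System.out.println(canTransition(State.CLOSED, State.DEVELOPER_TODO));
    }
}
```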

Reopening Issues (the bane of our existence)

Let's think about what it means when an issue is reopened. If an issue is being reopened, a user feels that the issue was not actually completed correctly. However, if we look at the stages of the issue and what it means to be Completed, then what the user is really saying is "I don't think QA tested this right, and our product owner didn't understand the requirements."

If we accept that this is what reopening an issue means, then we can see that the problem is not that the issue was incorrectly implemented, but that the people behind it did not do their jobs appropriately. In this case, we should not say that the issue was incorrectly implemented and should be redone; instead we should address the issue with the people and create a new issue to rectify the incorrect implementation.

But what does it mean to reopen a Closed issue? It means that the user feels that we've ignored their request and that we really don't care about their issue. But, for the most part, we're saying "I'm sorry, but that's not in the cards." As such, we should do our best to communicate why it's not going to happen. We shouldn't just put a single sentence saying "Not going to happen" but instead, link to another issue or some other piece of documentation which explains why we're saying no.

A New Paradigm

Although I'm not saying that Agile doesn't work, I think that we, especially as organizations, become too involved with how many states there are in order to hold different people accountable. And in the end, only a very few people understand what all of the individual states actually mean. That being said, we should also be able to give more information to our users in order to make them feel better about the actions we take on an issue.

My paradigm is to take more time updating issues and fleshing out the reasoning behind the outcome of an issue, so that states like Reopened can go away. That, coupled with the nine states listed in the table below, will provide developers and QA a simple process for moving issues between states, with more freedom and fewer questions like "is this a Won't Fix or Invalid?"

State | Description
Developer Todo | A developer needs to pick this issue up to work on it
Developer In Progress | A developer is currently working on this issue
Developer Blocked | This issue is waiting on resolution of another issue
QA Todo | A QA needs to pick this issue up to work on it
QA In Progress | A QA is currently working on this issue
QA Blocked | This issue is waiting on resolution of another issue
Needs UAT | Code and testing has been completed, this issue now needs a UAT
Completed | The UAT was successful and has been approved by the product owner; at this point this issue is done
Closed | No code was changed in this issue, and no further action will be taken on it.

Wednesday, January 30, 2013

Previous success is not tomorrow's

One question continuously comes up from people I meet. It comes up every time I attend No Fluff Just Stuff (as a shameless plug for Jay Zimmerman's symposium: it's an amazing experience with some amazing developers in attendance): "what is going to be the next big language?" NFJS usually revolves around the Java programming language but almost always includes at least one seminar on an alternative language that lives on the JVM. It's an interesting question; I'm sure that people who were originally working in C had the same questions. Will C++ be the future? Or maybe this thing called Smalltalk, or what about this thing HTML? Or how about this new thing called the JVM with the Java language?

Today has been a very eye-opening experience for me; I've learned quite a bit about what makes "thought leaders." I'm really not one of those employees who drinks the Kool-Aid of "employees are the most important thing" and "employees drive the business," mainly because this simply isn't true. Yes, employees have an impact by performing their jobs and doing their best to move the company in the direction it is already moving. But let's be realistic: customers drive the business. Employees do, however, provide another level of "driving the business," which is being "thought leaders."

No, this is not going to be an anti-enterprise rant; it's a self-reflection on how far I've come as an employee. I've begun to understand that I am a "thought leader" in our organization. Before you tune this out and say "oh, how you've changed," let me explain how I came to this conclusion. When we hear about "thought leaders" we normally think of someone who has done something outstanding in a specific business area; at least I did. But I've come to a more realistic understanding that these are actually "innovators." Instead, "thought leaders" are people who attempt to move an entire business area forward with the concepts that "innovators" created. For us, this is tantamount to using new technologies to solve our problems or to design new solutions to them. This, of course, applies to almost everyone in the software development world; so what sets a regular software developer apart from a "thought leader?"

Myself, I'm beginning to stretch out and write a book on functional programming. I've been teaching people at my organization how to do functional programming for the better part of a year or more. People look to me not just for help on learning and implementing functional concepts, but also for general design advice. And instead of just giving them the answer that I might see, I try to help them arrive at the decisions themselves, so that they can understand how to solve similar problems in the future. Being a central hub for people to seek advice from is a very important part of being a thought leader. When you can take the knowledge that you have and push it to hundreds, maybe even thousands, of other people, you become a larger thought leader, not just within your organization but within the community.

I talked briefly before about force multipliers. This is exactly the kind of thing that assists you in becoming a larger "thought leader" as well. I think about all of the decisions that I make on a daily basis. Then I think about how many people I influence based on my decisions, and how many of them might say "you know what, that guy was right, we should be doing it that way!" If I, or you, influence many people over the course of a career, think about how many of them went to other companies or jobs and said "hey, we did XXX at my previous position and it worked out amazingly." This is what being a thought leader is all about: influencing people in such a way that they will take decisions that you made with them and implement them later, knowing it is a better decision. The important thing is, if they arrive at the same decision themselves, then they will inherently believe in the decision beyond just your convictions.

So what does all of this have to do with programming languages and which ones are the next greatest? Easy: think about all those people programming in C. Someone had to walk in and say "we're going to try and use Java." Most likely that happened at a couple of larger companies; as such, they would then get others interested in the Java language over things like C, C++, or Erlang, and begin using it. Those others would then take that knowledge to the next company they worked for. Remember, larger companies usually churn more people through, so the more people who get exposure, the more likely that knowledge spreads to their future positions. What we can take from this is that the thing that made previous success is NOT going to be the thing that makes tomorrow's success. Most likely at those larger companies someone made the statement, probably quite a few times, that "no, we're not using Java; C has worked perfectly fine for years and we should stick with it." Eventually the lapse of time causes previous success to stagnate. Instead, we need thought leaders. We need people to push the boundaries and say "yeah, that previous thing got us here; but it won't get us to where we want to go." It's important to understand that listening too much to thought leaders can cause a very large distraction; but when used in an appropriate manner, such as replacing an older existing technology, listen to your thought leaders to determine what the best new idea is, not just something based on a previous success.

For those paying attention, you'll notice that about halfway through this blog I stopped putting quotes around thought leaders. The reason is that, much as I started out with this mentality of dealing with the enterprise organizational structure, I thought it was ridiculous that we had things called "thought leaders." But as I've grown both as an employee and as a software development professional, I've come to the realization that thought leaders are a vital part of the ecosystem making up the software development community. They are not the innovators, but the influencers who can take that innovation and get people to understand and use it, not just within the business they are at but at the businesses they will eventually move on to.

Monday, November 26, 2012

Functional programming by example

Functional programming is becoming a really big concept; languages like Ruby and Python have gained quite a bit of popularity and introduce some functional concepts. Other languages like Erlang, Haskell, and Lisp, while having decent developer bases, still fall into the "it's just really hard to write in" bucket. Java has been a powerful force in the software community over the past decade, and the JVM has become a great, powerful, portable way of distributing software. On the JVM, even more languages have been introduced which contain functional concepts. Languages like JRuby, Jython, Groovy, and Scala have opened the door of functional programming to many more people. Let's start by looking at a list of concepts which make up a functional language.
  • First-class and higher-order functions
  • Pure functions
  • Recursion
  • Strict versus non-strict evaluation
  • Statements
  • Pattern Matching
  • Immutable Variables
We'll look at some of these concepts and how we can use them to our advantage in real-world software development, with code examples along the way. The list above is not exhaustive; in fact, the last four are concepts that I wholeheartedly believe are key parts of functional programming. Some of these concepts can also be achieved in imperative languages, so we'll talk about some of those too. Without further ado, let's get started!

First-class and higher-order functions

This sounds like two different concepts, but really they are intertwined. First-class functions means that functions are themselves objects. This means that you can send them to other functions or return them from functions, which incidentally is what higher-order functions do. You can see this with closures in Groovy, such as below.
def incBy1 = { x -> x + 1 }
So now we have incBy1, and this is a function. The cool thing here is that we can now send it to another function to execute. But why would we do this? Well, let's assume that we have a function whose job is to perform some operation on an integer and print out the value. Not very useful, right? Don't worry, it'll become more understandable in a minute. So let's see this function.
def printResultingNumberOperation(val, fn) {
        println(fn(val))
}
Now we just need to put it all together and see it running.
printResultingNumberOperation(1, incBy1)
Which results in 2, which really isn't that awesome. But in essence this is a perfect example of first-class functions. Notice that the function incBy1 is its own variable, so technically we could just execute it; and by "it" I mean execute the variable.
println incBy1(1)
Which has a resulting output of 2; this is probably less awesome than our previous example. Oh noes! But never fear, let's take it up a notch. printResultingNumberOperation is actually a higher-order function. Why? Because it takes a function as a parameter. Remember, we said that higher-order functions are those that can take or return a function. Remember incBy1? Let's do some abstraction there; what if we just wanted an incBy2 or incBy3? Oh noes, we're going to be doing copy and paste! Do not do this; copy and paste is evil. Why is it evil? Because if you have a bug in incBy1, inherently you have a bug in incBy2 and incBy3! Well, you'll just remember to update them, right? Wrong, that never happens; I promise you, incBy2 and incBy3 will deviate from the updates to incBy1. And sadness will ensue. So let's abstract incBy to take a number. Check this out.
def incBy(num) {
        { x -> x + num }
}
Many of you might be asking "what are you doing?" Well, let's break this down. We are defining a function incBy that takes a single argument "num." And what does it return? Well, it returns a function (making incBy a higher-order function), and that function takes a parameter "x." It will then add "x" to "num" and return the result. It's important to understand that the "num" variable will be "closed over," resulting in a "closure." Essentially, the returned function keeps "num" in scope even after the function incBy has returned. If that makes very little sense, let's see how to use this, and it should make a bit more sense.
println incBy(2)
But wait, this prints out some garbage like temp$_incBy_closure1@5815338! Right, remember that I said incBy will return a function; you're just seeing its toString. So what can I do with that? Well, you can execute that function; remember from above, it's a function that takes a single argument. So let's see what happens when we call it with 2 (hint, it should return 4).
println incBy(2)(2)
And what do we get? 4. Surprise! Our call incBy(2) gave us a function that does x + 2; and since we executed that with 2, we got 2 + 2, ending up with 4 as our return. So how is this useful? Well, do you remember printResultingNumberOperation, which takes a number and a function to apply to it? How about we use it there!
printResultingNumberOperation(2, incBy(2))
And what do we get? 4; that is just amazing and life-altering! I can write code which is reusable, since I can abstract large chunks of my functions. This is one of the main things that a functional language must implement; if it does not have this, it is not a functional language.
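For comparison, the same first-class and higher-order pattern can be sketched in Java 8; this is a hypothetical translation of the Groovy examples above, using `java.util.function` types.

```java
import java.util.function.Function;
import java.util.function.IntUnaryOperator;

public class FirstClass {
    // incBy returns a function, making incBy itself higher-order;
    // the returned lambda closes over "num"
    static Function<Integer, IntUnaryOperator> incBy = num -> (x -> x + num);

    // A higher-order function: it takes a function as a parameter
    static void printResultingNumberOperation(int val, IntUnaryOperator fn) {
        System.out.println(fn.applyAsInt(val));
    }

    public static void main(String[] args) {
        IntUnaryOperator incBy1 = incBy.apply(1);
        printResultingNumberOperation(1, incBy1);         // 1 + 1
        printResultingNumberOperation(2, incBy.apply(2)); // 2 + 2
    }
}
```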

Pure functions

Pure functions are functions that have no side effects; they execute purely on their parameters. This allows the system to do more optimizations, either by inlining code or even caching the results (since what comes out is always the same for what goes in). To understand what pure functions are, we must first understand what side effects are. Let's look at some examples of side effects.
  • Output
  • Class Mutators
  • Parameter Mutators (really bad!)

Output

This is pretty straightforward; if you have a method that writes output, such as a logger or a database write, it's always non-pure. Why? Because writing data to the console or to the database changes something outside the function every time it's called. Performing a log or database write is a side effect of calling the method.

Class Mutators

This ties closely into immutable variables, so I'm not going to get into immutability until later. But the idea is still there; if you have a .setX(x) method which does exactly that, "this.x = x", then this is a method with a side effect. The side effect is that we are mutating the current object. What is another option to make the function pure? Well, .setX(x) could create a new object, setting the X during creation. Again, this ties into immutable variables, so I'm not going to get too far into it.
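As a sketch of that idea in Java (the class ImmutablePoint and the withX method are hypothetical names), a pure "setter" returns a new object instead of mutating this:

```java
public class ImmutablePoint {
    // All fields final: instances can never change after construction
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Instead of setX mutating, withX returns a copy with the new value
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, this.y);
    }

    public static void main(String[] args) {
        ImmutablePoint p1 = new ImmutablePoint(1, 2);
        ImmutablePoint p2 = p1.withX(5);
        System.out.println(p1.x + " " + p2.x); // p1 was never mutated
    }
}
```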

Parameter Mutators (really bad!)

This one is really bad and also ties into immutable variables. Frankly, this is one of those concepts that I choke on when I see someone violate it. If you pass in a variable and you try to change that parameter, you are doing it all wrong! Let's look at an example of this atrocity, again in Groovy.
class Me {
        String fname

        def setFname(str) {
                this.fname = str
        }
}

new Me().setFname("Test")
So why is this so bad? Well this is actually a simple example which reflects a larger problem. What if I didn't actually want to change Me; why am I changing an input variable itself? Imagine if I had the following example.
class Me {
        String fname
}
def checkName(m, f) {
        if(m.fname != f) {
                m.fname = f
        }
}
def m = new Me()
checkName(m, "Test")
println(m.fname)
Well, why is this bad? Because checkName is not clear about what it's doing. It's actually changing the value of the object that I'm sending in. This inherently means that I cannot trust the object I send in. What should we have done instead?
class Me {
        String fname
}
def compareName(m, f) {
        if(m.fname != f) {
                new Me(fname: f)
        } else {
                m
        }
}
def m = compareName(new Me(), "Test")
println(m.fname)
So why is this better? Because I know that the Me I'm sending in cannot change. This means that I can send in either a brand new Me or another Me that has already been initialized, and I'm assured that I still have the original. Again, this goes to immutable variables; but it's important to understand that we want pure functions, functions that have no side effects.

Recursion

This is one of my favorite topics; not just because it makes people's brains explode, but because most software developers are so scared of it that they refuse to use it. Let's start with the simple definition: a function that calls into itself to perform loops. So why does everyone find it difficult? Because you have to be correct with the end cases. The best thing to do is to define your end cases as soon as you begin writing your recursive function. But this isn't all; everyone has had to write a recursive function which recurses too deeply. Let's look at one example of recursion in Scala: a function that sums the integers from 1 to i.
def sumTo(i : Int) : Int = {
  if(i <= 0) {
    0
  } else {
    i + sumTo(i - 1)
  }
}
Notice that we have defined our end case (i <= 0) and performed general recursion. If I call this on 10, I get back 55; all seems non-problematic until we try some large number. Let's say we want the sum of the first 10,000 integers. What happens? java.lang.StackOverflowError; oh noes, I can't do that! So how do we get around this? We use a technique called tail recursion. It keeps the recursive definition in the code but converts it to iteration at compile time. Why do this? Because recursion is one of the best ways to implement an algorithm (at least in my opinion), as it allows us to write much better and much more concise code within the algorithm. So what is tail recursion? Simply put, the call to the function (from within the function) must not require any further operation. Let's look at an example (again in Scala).
def sumTo(i : Int, acc : Int = 0) : Int = {
  if(i <= 0) {
    acc
  } else {
    sumTo(i - 1, i + acc)
  }
}
What happens if we call this with 10? We get 55; yay, it still works. So what happens if we try 10,000? We get a ridiculous number, 50,005,000, and it didn't crash! Notice that the call to sumTo has nothing relying on it once it executes. Since the compiler understands this, it converts the recursion into an iterative loop. This lets us write recursive code and still use immutability; instead of a for loop with a counter that must mutate over time, we pass the accumulation (state) back into each call of the function. Most functional languages support this tail call; languages like Lisp/Scheme, Scala, Erlang, and even C support it. In other languages, such as Groovy, it is not directly supported but instead is accomplished by using trampolining. I won't get too far into it, but essentially you have a driver function that calls the recursive function one step at a time and waits for a specific end case, "trampolining" between the actual function and the function maintaining its state.
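For the curious, trampolining can be sketched in plain Java too. Everything here (Step, done, more, run) is a hypothetical name, not a standard API: each recursive step returns either a final value or a thunk producing the next step, and a loop "bounces" through them so the stack never grows.

```java
import java.util.function.Supplier;

public class Trampoline {
    // A step either holds a final value or knows how to produce the next step
    interface Step<T> {
        default boolean done() { return false; }
        default T value() { throw new IllegalStateException("not done"); }
        Step<T> next();
    }

    static <T> Step<T> done(T value) {
        return new Step<T>() {
            public boolean done() { return true; }
            public T value() { return value; }
            public Step<T> next() { throw new IllegalStateException("already done"); }
        };
    }

    static <T> Step<T> more(Supplier<Step<T>> thunk) {
        return thunk::get; // lazily produce the next step
    }

    // The driver: bounce from step to step in constant stack space
    static <T> T run(Step<T> step) {
        while (!step.done()) {
            step = step.next();
        }
        return step.value();
    }

    // The accumulator version of summing 1..i, written as a trampoline
    static Step<Long> sum(long i, long acc) {
        return i <= 0 ? done(acc) : more(() -> sum(i - 1, i + acc));
    }

    public static void main(String[] args) {
        System.out.println(run(sum(10, 0)));
        System.out.println(run(sum(10000, 0))); // no StackOverflowError
    }
}
```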

Strict versus non-strict evaluation

Strict evaluation is also called eager evaluation; non-strict is also called lazy evaluation. When we think about defining variables, we normally think of something like this.
def x = 10
This means that x is eagerly defined. So if we did something like below; we would expect x to be defined immediately.
def x = 10 * 10
So now x is defined as 100; but what if we didn't want it evaluated immediately? What if that initialization were extremely costly? We could change it into a method call so that it would only be computed when we need it, but then if we have to call it multiple times, we have to calculate it multiple times. Certain languages like Groovy, as shown below, and Scala, as shown after that, allow you to declare a variable as lazy. This means that the variable is not actually evaluated until you use it.
class Me {
        @Lazy def o = [x()]

        static def x() {
                println("X Called")
                1
        }

}
println("Create")
def me = new Me()
println("Done Creating")
me.o.size()
println("Complete")
What is our output?
Create
Done Creating
X Called
Complete
Notice how "X Called" doesn't actually happen until we access o; until then, the evaluation of o is never executed. Scala does the same kind of thing, as shown below.
def x() = { 
  println("X called")
  1 
}
println("Create")
lazy val v = List(x())
println("Done Creating")
v
println("Complete")
Which gives us the exact same output.
Create
Done Creating
X Called
Complete
Again, this type of functionality is really useful for deferring a large computation or execution that defines a variable until the value is actually necessary.
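To make the difference concrete, here is a small sketch (hypothetical names) contrasting a def, which re-evaluates on every use, with a lazy val, which evaluates once on first use and then caches the result:

```scala
var evaluations = 0

def eachTime = { evaluations += 1; "computed" }   // runs on every call
lazy val once = { evaluations += 1; "computed" }  // runs at most once

eachTime
eachTime
println(evaluations) // 2 -- the def ran twice

once
once
println(evaluations) // 3 -- the lazy val only ran on first access
```

So a lazy val gives you both halves of what the text describes: the cost is deferred until first use, and it is never paid more than once.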

Statements

Statements are very key to functional programming; at first glance they seem like a pedantic detail of a language, but when you get into immutable variables they become a very important part of the language itself. The idea is exactly that: every statement that occurs has (or should have) a return value of some type; in other words, what imperative languages treat as statements are really expressions. So take, for example, an if statement. Within functional programming an if statement should always return a value. Let's look at an example of a statement in Scala really quickly.
println(if(true) { 10 } else { 20 })
println(if(false) { 10 } else { 20 })
This gives us the output.
10
20
As we can see, the if statement itself actually has a return value. This is much like a ternary statement; you know the one that looks like
(true)?10:20
We also know that the last statement in a block is the return value of that block. Check out this example of a block in Scala.
{
  val v = 10
  v + 20
}
This then returns 30: v is 10, and 10 + 20 is, ta-da, 30, which was the return of the last statement in the block. The reason statements are so important is that we can use function returns or, as we'll see, the return of an if statement or block to build an object. While this sounds crazy, it actually makes the code simpler to understand over time. Let's see an example in Scala below.
class Test(str : String, length : Int) {}
So let's say that we want a method that generates a Test object; and if a null is passed in it should send back a Test with a blank string and a length of zero. How would we do this normally in an imperative manner?
def NewTest(str : String) = {
  val _str = if(str == null) {
    ""
  } else {
    str
  }
  new Test(_str, _str.length)
}
This is a good example of a statement, but let's try to remove the unnecessary variable.
def NewTest(str : String) = {
  if(str == null) {
    new Test("", 0)
  } else {
    new Test(str, str.length)
  }
}
Now this is really nasty; now we have to maintain two different branches where we create a new Test. So let's make the composition of Test be statements.
def NewTest(str : String) = {
  new Test(
    if(str == null) {
      ""
    } else {
      str
    }, 
    if(str == null) {
      0
    } else {
      str.length
    }
  )
}
So notice that the creation of Test is only defined once; it is composed of statements that build its components. If we were to extend the if statement and add another branch, it would probably be best to rip those out into their own functions. Let's say, for example, that we just wanted to do something extra if it was null; maybe we can use a higher-order function here?
def NewTest(str : String) = {
  def handleStr[T](op : String => T) : T = {
    op( if(str == null) { "" } else { str })
  }
  new Test(handleStr(x=>x), handleStr(x=>x.length))
}
This works out really well because we can just modify handleStr to do any extra checking in the future. And notice that we never use a variable and instead we are able to let handleStr deal with the edge cases for us.

Pattern Matching

Pattern matching is one of those topics that either people understand or they just miss the boat on what it is designed to do. There are plenty of uses for pattern matching and we'll look at a few of them in this section. For this section we'll be looking at examples in Scala.

Basic Functionality

We're going to start with something that is extremely reminiscent of a switch statement. We'll start with a boolean and look at a true, false, and everything else case.
true match {
  case true => "We're True"
  case false => "We're False"
  case _ => "Something that was not true or false"
}
From here, we end up with the string "We're True". Now, if we assume that we only care whether the match was true, and treat everything else as false, we can do this instead.
true match {
  case true => "We're True"
  case _ => "Was false or something else"
}
Now what about a numerical value?
0 match {
  case 0 => true
  case _ => false
}
We now have a way to do simple matches and determine whether our value was 0 or something else. This might not seem very interesting, but let's see what happens if we have a String and we want to make sure it is a valid (non-null) string.
"string" match {
  case null => ""
  case str : String => str
}
Now why is this important? Because, remember, we want to use as few variables as possible; by matching on a function's return value we can get at the string without ever storing it.
stringOperation("string") match {
  case null => ""
  case str : String => str
}
As we see, if stringOperation returns null we end up with a valid, operable blank string; if we get a string, we just return it. And that covers the very basics of pattern matching.

Extracting Attributes

One of the general uses for pattern matching is to extract certain attributes from classes. Pattern matching is really useful when matching against lists. We'll look at list examples for now; and to start out we're going to see a basic list and we'll take the head element off of the list.
List(1, 2, 3) match {
  case List() => -1
  case x :: xs => x
}
So here we can see that we're looking for an empty list; if we get one, we return -1. If it's not an empty list, we extract the first element (the head) and return it. This usage of "::" is an extractor (an unapply) that lets us pull the list apart: it binds the head element to the variable x and the rest of the list to the variable xs, both of which become available to the right of the =>, which is the body of the case statement. Now what is really cool is that we can actually extract more than just once!
List(1, 2, 3) match {
  case List() => -1
  case x ::  y :: xs => y
}
Now we expect to get 2, since we extracted 1 into x, 2 into y, and List(3) into xs. If we look at this, there is clearly a missing case: the single-element list, x :: Nil (which could also be written List(x)). So on compilation we end up with the warning warning: match is not exhaustive! which lets us know that we should extend our matches so that we don't fall off the end. Now for the mind-blowing part: you can actually use literals to pin down certain parts of a match.
List(1, 2, 3) match {
  case List() => -1
  case 1 ::  x :: 3 :: xs => x
}
And from this we get 2; notice how we extracted the 2nd element by using the variable x and literals to indicate where we wanted to rip it out from. These are some of the more general usages of pattern matching; next we'll look at case classes and how they can be used to pass messages.
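Before moving on: one way to satisfy the exhaustiveness warning mentioned above is to handle the single-element list explicitly. Here is a sketch with a hypothetical helper that covers all three list shapes:

```scala
// Hypothetical helper: return the second element of a list, the only
// element if there is just one, or -1 for an empty list. All three
// shapes are covered, so the match is exhaustive.
def secondOrElse(l: List[Int]): Int = l match {
  case Nil          => -1
  case x :: Nil     => x   // the case the compiler warned us about
  case _ :: y :: _  => y
}

println(secondOrElse(List()))        // -1
println(secondOrElse(List(7)))       // 7
println(secondOrElse(List(1, 2, 3))) // 2
```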

Case Classes

One of the major selling points of pattern matching is the ability to extract objects themselves. In Scala these are case classes which we'll look into. The basic concept is that a case class can be used to extract attributes from an object. Let's look at a very simple example.
case class MyObj(str : String, len : Int)
new MyObj("Foo", "Foo".length) match {
  case MyObj(str, len) => println(str + "@" + len)
}
As we can see, we match on the fact that we have an object and we are able to extract all of the attributes from the object itself. So the big question is; why do we care? We can access the attributes from the object anyway. Well here is the thing; we can use inheritance to do an extraction based on the child class.
trait MyTrait
case class MyObj1(str : String, len : Int) extends MyTrait
case class MyObj2(num : Long) extends MyTrait
def exec(in : MyTrait) : Long = in match {
  case MyObj1(_, len) => len.toLong
  case MyObj2(n) => n
}
So now what happens if we call it; more specifically, what if we call it with MyObj1 and how does it differ from MyObj2? Let's look at MyObj1 first.
exec(new MyObj1("Foo", "Foo".length))
So what do we get? Well we're going to get 3 as a return. Why? Because the match succeeded on the case MyObj1(_, len)! So what happens if we call this again with MyObj2?
exec(new MyObj2(22))
This gives us 22; which we kind of expected by now. This means that we can extract attributes of an object as we enter a function; or choose which function to execute based on the type of the object passed in; again, all without having to store any state.
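A related trick, sketched here with hypothetical classes: if the parent trait is declared sealed, the compiler knows every subclass and can warn when a match over it is not exhaustive, which pairs nicely with the dispatch style above.

```scala
sealed trait Shape                                   // sealed: all subclasses are in this file
case class Circle(radius: Double) extends Shape
case class Rect(w: Double, h: Double) extends Shape

// Because Shape is sealed, dropping either case below would produce
// a "match is not exhaustive" warning at compile time.
def area(s: Shape): Double = s match {
  case Circle(r)  => math.Pi * r * r
  case Rect(w, h) => w * h
}

println(area(Rect(2, 3))) // 6.0
```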

Immutable Variables

This is a concept that is not new to programming; it is also not specific to functional programming. Think about this statement from C.
const char *str = "MyString";
What does this mean? It means that the string str points at cannot be changed after being set (strictly speaking, to also keep the pointer itself from being reassigned you would write char * const, but the idea is the same). Why does this matter? Personally I think this goes to the very heart of what functional programming is. When we think about functional programming, we think of returns from functions being sent directly as parameters to other functions. If we think about functional programming like this, then we can assume that a return from a function would not be changed before it was passed into another function. If this is true, then we can, for the most part, avoid variables whatsoever.
But of course, there are cases where a value is returned and we need to pass its individual components to other functions. In these instances we store the value so that we aren't redoing the calculation that produced it multiple times. In doing so, we should not make it possible to modify the variable in its transitory state. Think about it: if we had multiple threads, and each of those threads was touching a variable that IS mutable, then one thread could be modifying the variable at the same time another thread is trying to read it.
So now, let's think about this example: if I know that my object A cannot be modified (all of the components of the object are immutable), then I also know that I can pass it to any method/function and know for a fact that it cannot change. This means that I can safely make multiple calls (if possible) against that object concurrently.
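As a small sketch of this idea (with a hypothetical class): Scala's case classes are immutable by default, so "changing" one really means building a new instance, and a shared original can never be observed mid-mutation by another thread.

```scala
case class Point(x: Int, y: Int)

val a = Point(1, 2)
val b = a.copy(x = 10) // a brand-new object; `a` is untouched

println(a) // Point(1,2)
println(b) // Point(10,2)
```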

Summary

As a quick summary, functional programming is all about purity in programming: pure functions, and data that cannot mutate but only produces new instances containing the changed data. This style of programming ensures that bugs do not come from concurrent modifications, or from variables being modified where they shouldn't have been. It also means that function results can be cached (memoized) much more easily, since pure functions always return the same output for the same input. Overall, functional programming allows developers to be more expressive, and thus more intuitive to others picking up their code.
I hope that people find this interesting and that it helps them better understand what exactly functional programming means and how to start programming in it. Remember, just because a language isn't set up to be functional (languages like Java or C) doesn't mean that you can't accomplish the same concepts. Although it's going to be much more painful and difficult than in actual functional languages (implementing higher-order functions with interfaces and anonymous classes, for example the Comparator interface and its usage), it still has the same effect of good function re-use!

Thursday, November 22, 2012

Lift, a developers introspective

I've been using Lift for about six months now; we used it to rewrite one of our internal C/Gtk applications. This is an introspection into our usage of Lift in enterprise software. For this post, I'm going to try to answer some general questions that I had going in, as well as some questions that have come up from others working on the project.
  • Was it a new technology? How was the adoption of it?
  • How is view first for large systems?
  • How do snippets work out?
  • Did you keep mapper or switch?
  • How did you do UnitTests?
  • Did you end up using the RESTful framework?
  • How did sitemap hold up with authentication?
So buckle up and let's get started into some of these questions.

Was it a new technology? How was the adoption of it?

Well, Lift itself had already been around for a little while, and Scala for a bit longer than that. I got started with Scala during my M.S. at DePaul and was fascinated by its functional aspect. By that time Scala had already reached version 2.8.x, so again, it had been around for a bit. Both Lift and Scala were new technologies that we added to our stack.
I had previously added one small application in Scala; it was important because I was able to make changes with very minimal work. My boss at the time loved this, mainly because he could ask for something and I could pump out the work in a few minutes. I had spoken with him (and his boss) before about rewriting our entire C/Gtk application base as web applications. This ended up being my way in: I offered to do a simple rewrite using this framework and language, went in and did about 40% of the work, and came back to show him that we could rewrite it with ease. He then agreed, and had me write up some documentation to ask the team who would eventually take the product over from us whether Lift + Scala were OK.
I got a chance to work with some of the people on the receiving team before it made its final decision. Each person told me that they wished they could work in Scala all the time; not only that, but each of them found working in Lift wonderful. We had to adopt Maven (it was either that or go with ant/ivy) since sbt was ruled out, and it was great how quickly our Maven project plugged into IntelliJ. We started in ScalaIDE until we found that an index would be performed every time you hit save, which eventually caused Eclipse to run out of threads to generate those indexes. It was a known bug, but we didn't have a choice at that point, so I switched us over to IntelliJ. Of course, IntelliJ has its own issues; but for the most part I haven't seen many of them.
Overall, the adoption has been fantastic. The RestHelper is mind-bogglingly simple, using pattern matching to create understandable RESTful resources. My co-workers came from a Spring background, so this was a very different way of looking at services; but once they began to understand how it was set up, it proved much simpler and easier. As I'll mention later, we switched to Squeryl for the ORM, and everyone was amazed at how SQL-ish the commands were, making it easy to write SQL-style statements while keeping type safety. XML being a native datatype was just so nice for creating the services. The one downside I found was that Record did NOT have an .asXml method, which meant that if you wanted to serialize objects from the database you had to create the XML manually. I figured this would be a pain, so I decided to extend Record to do it for me (using the same mechanism that .asJson does). I submitted a request for this to the Lift group but haven't heard anything back.

How is view first for large systems?

This is an interesting question and I'm still not totally sure about it. I've seen some really awesome abilities from performing multiple layers of embeds, but I've also seen some necessity to emit XML (XHTML) within the snippet calls. This is actually really awesome stuff where you get verified XML that is embedded as it gets substituted. One thing I did in a newer project was to call into a snippet with a template XHTML section; I can then (since Lift does not modify the original XML on a bind call) re-bind for each of the records I want to apply my template to. It's really amazing what can be accomplished once you realize that Lift keeps to the immutable-variable idea.
Regardless, having multiple levels of embeds and no business logic within the views themselves means that modifying layouts is simple; the business logic is pushed back into the snippets. One of the things that we did was to use a "loggedin" snippet call; at first I was thinking "oh man, I'm introducing business logic into the view." But actually I'm not, because the business logic of how "loggedin" is processed is kept in the snippet. Some people may do something similar in PHP, Grails, or Rails with a line like the one below, and that might be fine; but eventually session.user.loggedIn becomes session.user.loggedIn || session.user.isAdmin.
<% if(session.user.loggedIn) { %>

How do snippets work out?

Snippets end up working the same way taglibs work in Grails, and they work out really nicely. I find it really useful to treat them as template fillers. In some instances it's useful to generate other XHTML (for example, I have some code that decides when to send certain javascript functions, since they're optional based on the user logging in), but on the whole I use snippets to populate the pages themselves. In some instances I use them to do Lift-y things and make Lift callbacks for Javascript executions. When we started this project we all came from an MVC background, which meant that we created snippets such as "*Show" or "*Index." If I could go back, I would've treated the snippets as truly encapsulated pieces. If I had to give anyone advice about using Lift, it would be: encapsulate behavior into reusable snippets rather than writing one snippet per page!

Did you keep mapper or switch?

We started with Mapper until we hit a composite-key record. Mapper fought with me a bit, but I finally wrangled it. Then I ran into another table where we had a composite key plus an auto-increment field that was NOT part of the PK. This is a completely insane design and I wish I could change it; however, it is not within my power. Mapper just completely failed at this (and rightly so; if you have an AI field it should be your PK :( ). So I went and looked at a couple of different ORM choices. The first was, obviously, Hibernate (our organization already used it); I put it on the back burner, mainly because I've heard of pains with Hibernate around transactions/sessions, and it isn't pure Scala. So I started looking at pure Scala options. The first I came across was Squeryl; I checked it out and it seemed pretty easy to embed into Lift, though I ran across some issues with defining AI fields that are not part of the primary key. However, it was much simpler to use than Mapper, and since we were not doing CRUD-style applications it made more sense, especially since it had its own DSL reminiscent of SQL itself. The final one I looked at was Circumflex. I really liked the syntax, especially the definitions of objects and how they looked exactly like SQL create statements.
Regardless, we switched to Squeryl, which took a little bit of work; but on the whole it ended up being really awesome to work with: the ability to write queries that actually felt like queries but were type safe and still used prepared statements. My one complaint, as I mentioned above, was that using Record meant we did not get an ".asXml" counterpart to the ".asJson" option. So I went in and created it myself as a trait. It was really nice having the ability to do a ".map(_ asXml)" which I could then send back as a RestHelper response. I created a GitHub pull request for this and also pinged the Lift Google Group to see if it could be added. Hopefully they do; I think it would be a great complement to the RestHelper for XML REST services.

How did you do UnitTests?

We ended up using ScalaTest, and we incorporated Cobertura for code coverage. However, as anyone using Cobertura with Scala appears to find, you cannot get more than 50% branch coverage. We have over 90% line coverage and about 850 unit tests, which is just awesome for the small amount of code we've actually written for the application itself. We had to use the JUnitRunner interface, since the maven-scalatest-plugin was not available at the time and is still something of a beta. My suggestion is to stick with ScalaTest via the JUnitRunner interface; this way you can plug into tools like Jenkins and get unit-test information into your reports.

Did you end up using the RESTful framework?

We did; as a matter of fact we ended up using the jqGrid plugin quite a bit, so we made massive use of the RestHelper RESTful framework. It was fantastic how easy it was to create and add new resources. We ended up creating a list in our Boot class that contained all of the Rest objects; we then did a foreach on that list and added each of them, with our guard (to protect our REST services from unauthenticated users), to the stateful dispatch. I would suggest this to anyone looking to do stateful-dispatch REST services that require authentication: set up a list of them and foreach over it to add them to the dispatch with the guard. This way adding a new object is just adding it to a list.

How did sitemap hold up with authentication?

The sitemap itself was a little awkward to understand at first. Once you get the hang of "oh, after the slash is the filename without the .html appended," it's pretty straightforward. The interesting thing about the SiteMap is that if you create it manually you can specify a partial function to perform a check for users. I will say: if you do this, remember that there are a few pages that shouldn't require authentication: your login page, your logout page, your primary error page, and your "page missing" (404) page. Make sure your partial function checks for those!

Summary

If I had to do it again here is a list of things that I would've done differently.
  • Started with Squeryl rather than Mapper.
  • Proper encapsulation in my snippets.
  • Used the HTML5 parser rather than the XHTML parser.
  • Done less in pure JS and a bit more Lift-ing.
  • Done more lazy loading and parallel loading.
  • Used less Scala shorthand (merely for the sake of other users coming in).
I'm probably forgetting some, but these are the immediate ones that stand out to me as lessons that I learned writing a Lift Web Application. I hope others can see this and maybe take something away from it as things that will make a transition to Lift a bit easier. If you have questions I can do my best to answer them if I've run across them in the past; or if you have any experiences of your own let me know!
I would like to say thanks to the Lift Google group for being open with me when I had questions!

Thursday, July 26, 2012

MySQL/JDBC DateTime woes

The Situation

At my current company we store lots of timestamps; more specifically, they are usually global timestamps, so clearly we needed to store the field as UTC. Why? Because we have dozens of SQL servers, which means we need a way to make sure that no matter where the servers are, the timestamps are always readable and understandable. Given this, it has always been our approach to store the times in UTC. The biggest player here is that JDBC attempts to convert timestamps for you so that you will always be in local time. But what happens when you don't want to convert to local time? For example, say you have two databases (for failover reasons) and two separate webapp servers that read from each of those databases (again for failover, except that the database primary is not bound to the web application primary). This means you could have two sites, one in India and one in Indiana, with the India web app server reading from the Indiana database; so the conversion is never a constant thing. To add difficulty, let's assume that you have a group within your organization with enough pull to state that these times must remain in UTC, since that is what they are used to dealing with.

The Theory

Our solution starts by stating that the data in the database must always be in UTC, and let's assume the database servers themselves are set up in UTC. Digesting the situation, we can look at this in two parts: the first is storing the timestamp in UTC; the second is retrieving the timestamp and converting it back into UTC from local time (remember that JDBC is going to convert it to local for us).

How to store the timestamps in UTC?

The first part is knowing that there are two ways we can get a timestamp to store. The first is to create a timestamp representing now; for this we can just use a Calendar and store that. Simple enough: remember that it will be converted over, so it'll be stored as now in UTC. The second is to create a timestamp based on input from a user. Here is where the difficulty lies: let's assume our user wants to store "2012-02-02 00:00:00" based not on local time but on UTC. We can accomplish this in two ways, both involving a SimpleDateFormat object to parse the time. The first way is to append the "z" format symbol to take the timezone from the input; the key here is to just append the "UTC" string before parsing.
SimpleDateFormat obj = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss z");
Date date_in_utc = obj.parse(str + " UTC");
So what is the other option? Well just set the timezone for the formatter of course.
SimpleDateFormat obj = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
obj.setTimeZone(TimeZone.getTimeZone("UTC"));
Date date_in_utc = obj.parse(str);
So which of these options would I go with? They both have their pluses and minuses. The first option really allows configuration: say we normally do UTC but want to allow users to specify their own timezones; here we have that ability. But it's slower, since we're constructing the SimpleDateFormat and appending the "UTC" string each time. The second option means we can construct the SimpleDateFormat once, with its UTC timezone set, and not do any string appends when we want to perform the conversion.

How to get the timestamps in UTC?

Remember, we said that the timestamp is always stored in UTC and that JDBC is going to convert it to the local timezone on retrieval. So, since we know the time we get back is in local time, how do we deal with it? Let's just look at the display and assume we're going to display the time in UTC. How do we do it? The same way as in the previous step, except instead of parse() we call format(), which takes the Date and returns a String, as shown below.
SimpleDateFormat obj = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
obj.setTimeZone(TimeZone.getTimeZone("UTC"));
String date_in_utc = obj.format(date);
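Tying the two halves together, here is a sketch of the full round trip (written in Scala for brevity, though the same java.text/java.util calls work identically from Java): parse the user's input as UTC, then format the resulting Date back out as UTC.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone

val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
fmt.setTimeZone(TimeZone.getTimeZone("UTC"))

// parse() treats the input as UTC; format() prints the Date back as UTC.
val date = fmt.parse("2012-02-02 00:00:00")
println(fmt.format(date)) // 2012-02-02 00:00:00
```

Note that SimpleDateFormat is not thread-safe, so a shared instance like this should not be used concurrently without synchronization.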

Thursday, July 5, 2012

I can't get anything done (how you're doing more than you think)

I can't get anything done

I feel this almost every day now. I just feel like I cannot accomplish anything during the day. To be honest, I have 3-4 different projects going at any given time, and every day I go home feeling like I haven't actually made any progress. Most of my time is spent either answering questions from people or teaching them concepts/ideas that I've learned over time. Most people stop here and get angry that they just aren't accomplishing anything; most people begin writing blogs and tweets about how they aren't getting anything done. The thing is, you're actually doing more than you think!

How you're doing more than you think

Obviously there are certain instances where this isn't the case; but if you're like me, answering questions and helping other people learn concepts and ideas that you've learned, you're actually doing more than you think. Many people understand the concept of the "force multiplier": the idea that if you train N people to work at 50% of your capacity, you have added N*0.5 of your capacity in work. This is pretty straightforward, but it's important to stop and realize what it means.

Why am I writing this?

I think many times we, especially as developers, get caught up in the idea that we must be programming to accomplish anything. Yet sometimes it's the knowledge that we disperse that makes us more valuable than just churning out large chunks of code. The next time you think you're not getting anything done, take a minute to think about everything else you did, and reflect on the fact that you're getting more done than you think!