My Pages

Thursday, November 22, 2012

Lift, a developer's introspective

I've been using Lift for about six months now; we used it to rewrite one of our internal C/Gtk applications. This is a retrospective on our use of Lift in enterprise software. For this post, I'm going to try to answer some general questions that I had going in, as well as some questions that have come up from others working on the project.
  • Was it a new technology? How was the adoption of it?
  • How is view first for large systems?
  • How do snippets work out?
  • Did you keep mapper or switch?
  • How did you do unit tests?
  • Did you end up using the RESTful framework?
  • How did sitemap hold up with authentication?
So buckle up and let's get started into some of these questions.

Was it a new technology? How was the adoption of it?

Well, Lift itself had already been around for a little while and Scala for a bit longer than that. I got started with Scala during my M.S. at DePaul and was fascinated by the functional aspect of it. By that time, Scala had already reached version 2.8.x, so again, it had been around for a bit. Both Lift and Scala were new technologies that we added to our stack.
I had previously written one small application in Scala; it was important because I was able to make changes with very minimal work. My boss at the time loved this, mainly because he could ask for something and I could pump out the work in a few minutes. I had spoken with him before (and his boss) about rewriting our entire C/Gtk application base as web applications. This ended up being my way in: I offered to do a simple rewrite using this framework and language, then went and did about 40% of the work and came back to him with it, showing him that we could rewrite it with ease. He then agreed, and had me write up some documentation to ask the team who would eventually take the product over from us whether Lift + Scala were OK.
I got a chance to work with some of the people on the receiving team before it made its final decision. Each person told me that they wished they could work in Scala all the time; not only that, but each of them has found it wonderful working in Lift. We had to adopt Maven (it was either that or go with ant/ivy) since sbt was ruled out. It was great how quickly our Maven project plugged into IntelliJ. We started in ScalaIDE until we found that an index would be rebuilt every time you hit save, which eventually caused Eclipse to run out of threads for generating these indexes. It was a known bug, but we didn't have a choice at that point, so I switched us over to IntelliJ. Of course, IntelliJ has its own issues, but for the most part I haven't seen many of them.
Overall, the adoption has been fantastic. The RestHelper is just mind-bogglingly simple, using pattern matching to create understandable RESTful resources. My co-workers came from a Spring background, so this was a very different way of looking at services, but once they began to understand how it was set up, they found it much simpler and easier. As I'll mention later, we switched to Squeryl for the ORM, and everyone freaked out about how SQL-ish the commands were, making it easy to write the SQL statements while keeping type safety. XML being a native datatype was just so nice for creating the services. Now, the one downside I had with this was that Record did NOT have an .asXml method, which meant that if you wanted to serialize objects from the database you would have to build the XML manually. I figured that this would be a pain, so I decided to extend Record to do this for me (using the same machinery that .asJson does). I submitted a ticket for this to the Lift group but haven't heard anything back.
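
To give a feel for it, here is a minimal sketch of a RestHelper service, assuming Lift 2.x; Item, its finders, and the .asXml trait are hypothetical stand-ins from our own code, not Lift APIs.

import net.liftweb.http.rest.RestHelper
import net.liftweb.util.Helpers._   // for the AsLong extractor

object ItemService extends RestHelper {
  serve {
    // GET /api/items as XML -- the whole route is just a pattern match
    case "api" :: "items" :: Nil XmlGet _ =>
      <items>{ Item.findAll.map(_.asXml) }</items>

    // GET /api/items/42 -- AsLong pulls the id out of the path
    case "api" :: "items" :: AsLong(id) :: Nil XmlGet _ =>
      <item>{ Item.find(id).map(_.asXml).getOrElse(<error/>) }</item>
  }
}

Returning a scala.xml.Elem is enough; RestHelper boxes it up as an XML response for you.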

How is view first for large systems?

This is an interesting question, and I'm still not totally sure about this one. I've seen some really awesome capabilities from performing multiple layers of embeds. But I've also seen some need to emit XML (XHTML) from within the snippet calls. This is actually really awesome stuff, where you get verified XML that gets substituted in as it's embedded. One thing that I did in a newer project was to call into a snippet with a template XHTML section; I can then (since the bind call does not modify the original, immutable XML) re-bind for each of the records I want to apply my template to. It's really amazing what can be accomplished when you start realizing that Lift tries to keep to the immutable-value idea.
Regardless, having multiple levels of embeds and keeping business logic out of the views means that modifying layouts is simple; the business logic falls back to the snippets to handle. One of the things we did was use a "loggedin" snippet call. At first I was thinking, "oh man, I'm introducing business logic into the view," but actually I'm not, because the business logic of how "loggedin" is processed is kept in the snippet. Now, some people may do this sort of thing in PHP or Grails or Rails with something like the check below, and that might be fine; but eventually session.user.loggedIn becomes session.user.loggedIn || session.user.isAdmin, and every view that pasted in the check has to change.
<% if(session.user.loggedIn) { %>
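
The Lift version keeps that decision in one place. A minimal sketch, assuming a hypothetical User object that knows whether someone is logged in:

import scala.xml.NodeSeq

class LoggedIn {
  // Render the wrapped markup only for logged-in users; the view never
  // sees (or repeats) the actual condition.
  def render(in: NodeSeq): NodeSeq =
    if (User.loggedIn_?) in else NodeSeq.Empty
}

The view just wraps markup in <lift:LoggedIn>...</lift:LoggedIn>, and when the condition grows an || isAdmin, only the snippet changes.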

How do snippets work out?

Snippets end up working much the way taglibs work in Grails, and they seem to work out really nicely. I find it most useful to treat them as template fillers. In some instances I use them to generate other XHTML (for example, I have some code that decides when to emit certain JavaScript functions, because they're optional depending on whether the user is logged in), but on the whole I use them to populate the pages themselves. In some instances I'm using them to do truly Lift-y things like Lift callbacks for JavaScript execution. When we started this project we all came from an MVC background, which meant that we created snippets such as "*Show" or "*Index." If I could go back, I would've treated the snippets as truly encapsulated pieces. If I had to give anyone one piece of advice about Lift, it would be: encapsulate behavior into focused snippets rather than writing one snippet per page! The sketch below shows the template-filler style.
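
A minimal example of a snippet as a template filler, using Lift's CSS selector transforms; Profile and currentUser are hypothetical names of mine:

import net.liftweb.util.Helpers._

class Profile {
  def render = {
    val user = currentUser()   // hypothetical lookup
    // Fill the template's placeholders; no markup is generated here.
    "#name *"  #> user.name &
    "#email *" #> user.email
  }
}

The corresponding template just carries elements with ids "name" and "email," so the same snippet can fill any layout.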

Did you keep mapper or switch?

We started with Mapper, until we found a record with a composite key. Mapper fought with me a bit, but I finally wrangled it. Then I ran into another table where we had a composite key plus an auto-increment field that was NOT part of the PK. This is a completely insane design and I wish I could change it; however, it is not within my power to change. Mapper just completely failed at this (and rightly so; if you have an AI field it should be your PK :( ). So I went through and looked at a couple of different ORM choices. The first was, obviously, Hibernate (we already used it in our organization); I put it on the back burner, mainly because I've heard of pains with Hibernate around transactions/sessions, and it isn't pure Scala. So I started looking at pure Scala options. The first I came across was Squeryl; I checked it out, and it seemed pretty easy to embed into Lift, though I ran across some issues defining AI fields that are not part of the primary key. However, it was much simpler to use than Mapper, and since we were not writing CRUD-style applications it made more sense, especially since it has its own DSL reminiscent of SQL itself. The final one I looked at was Circumflex. I really liked the syntax, especially the object definitions and how they look exactly like SQL CREATE statements.
Regardless, we switched to Squeryl, which took a little bit of work to accomplish, but on the whole it ended up being really awesome to work with: the ability to write queries that actually felt like queries but were type-safe and still used prepared statements. My one complaint, as I mentioned above, was that using Record meant we did not get an ".asXml" method the way there is an ".asJson" option. So I went in and created it myself as a trait. It was really nice having the ability to do a ".map(_ asXml)" and send the result straight back as a RestHelper response. I created a GitHub pull request for this and also pinged the Lift Google Group to see if it could be added. Hopefully they do; I think it would be a great complement to the RestHelper for XML REST services. The sketch below shows the flavor of the Squeryl DSL.
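
For a sense of how SQL-ish it feels, here's a minimal Squeryl query, assuming a hypothetical schema object AppDb with an orders table:

import org.squeryl.PrimitiveTypeMode._

// Type-safe, runs as a prepared statement, and reads almost like the SQL it becomes.
val bigOrders = transaction {
  from(AppDb.orders)(o =>
    where(o.total gte 100)
    select(o)
    orderBy(o.placedAt desc)
  ).toList
}

With the .asXml trait mentioned above, bigOrders.map(_ asXml) is then everything an XML RestHelper response needs.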

How did you do unit tests?

We actually ended up using ScalaTest, and we incorporated Cobertura as well for code coverage. However, as anyone using Cobertura with Scala appears to find, you cannot get more than about 50% branch coverage. We have over 90% line coverage and about 850 unit tests, which is just awesome for the small amount of code we've actually written for the application itself. We had to run the suites through ScalaTest's JUnit runner, since the maven-scalatest-plugin was not available at the time and is still in a beta of sorts. My suggestion is to go ahead and stick with ScalaTest via the JUnit runner; this way you can plug into tools like Jenkins and get unit test information into your reports, as in the sketch below.
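
A minimal ScalaTest suite wired up this way (Parser is a hypothetical class under test):

import org.junit.runner.RunWith
import org.scalatest.FunSuite
import org.scalatest.junit.JUnitRunner

// @RunWith lets maven-surefire execute the suite as a JUnit test,
// so Jenkins picks the results up like any other JUnit report.
@RunWith(classOf[JUnitRunner])
class ParserSuite extends FunSuite {
  test("tokenize splits on whitespace") {
    assert(Parser.tokenize("a b") == List("a", "b"))
  }
}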

Did you end up using the RESTful framework?

We did; as a matter of fact we used the jqGrid plugin quite a bit, so we made massive use of the RestHelper RESTful framework. It was fantastic how easy it was to create new resources, and more specifically how simple it was to add them. We ended up keeping a list in our Boot class containing all of the REST objects; we then did a foreach over that list and appended each one, wrapped in our guard (to protect the REST services from unauthenticated users), to the stateful dispatch. I would suggest this to anyone doing stateful-dispatch REST services that require authentication: set up a list of them and foreach over it to apply the guard and register the dispatch. That way, adding a new service object is just adding it to a list, roughly as sketched below.
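
A rough sketch of that Boot wiring, assuming each service extends RestHelper and a hypothetical User object tracks authentication:

import net.liftweb.http.LiftRules
import net.liftweb.http.rest.RestHelper

// All REST services live in one list...
val restServices: List[RestHelper] = List(ItemService, OrderService)

// ...and one guard wraps them all: pass the request through only when
// the caller is authenticated.
def guarded(svc: LiftRules.DispatchPF): LiftRules.DispatchPF = {
  case req if svc.isDefinedAt(req) && User.loggedIn_? => svc(req)
}

restServices.foreach(svc => LiftRules.dispatch.append(guarded(svc)))

Adding a new authenticated service is then a one-line change to restServices.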

How did sitemap hold up with authentication?

The SiteMap itself was a little awkward to understand at first. Once you get the hang of "oh, after the slash is the filename without the .html appended," it's pretty straightforward. The interesting thing about the SiteMap is that if you build it manually you can specify a partial function that performs a check on users. I will say: if you do this, remember that there are a few pages that should NOT require authentication! Your login page, your logout page, your primary error page, and your "page missing" (404) page. Make sure your partial function checks for those! Something like the sketch below.
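
We did ours with a partial function, but the same idea is easy to show with Loc's If parameter, which is the more common route; User is hypothetical here, and note which menus deliberately carry no guard:

import net.liftweb.sitemap._
import net.liftweb.sitemap.Loc._
import net.liftweb.http.RedirectResponse

// Redirect anonymous users to /login...
val requireLogin = If(() => User.loggedIn_?, () => RedirectResponse("/login"))

def sitemap = SiteMap(
  // ...but never guard the pages a logged-out user must reach.
  Menu.i("Login")   / "login",
  Menu.i("Logout")  / "logout",
  Menu.i("Error")   / "error",
  Menu.i("Missing") / "404",
  Menu.i("Home")    / "index" >> requireLogin,
  Menu.i("Admin")   / "admin" >> requireLogin
)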

Summary

If I had to do it again, here is a list of things I would've done differently.
  • Started with Squeryl from the beginning.
  • Proper encapsulation in my snippets.
  • Used the HTML5 parser rather than the XHTML parser.
  • Done less in pure JS and a bit more Lift-ing.
  • Done more lazy loading and parallel loading.
  • Used less Scala shorthand (purely for the sake of other developers coming in).
I'm probably forgetting some, but these are the immediate ones that stand out as lessons I learned writing a Lift web application. I hope others can take something away from this that makes a transition to Lift a bit easier. If you have questions, I can do my best to answer them if I've run across them in the past; and if you have any experiences of your own, let me know!
I would like to say thanks to the Lift Google group for being open with me when I had questions!

Thursday, July 26, 2012

MySQL/JDBC DateTime woes

The Situation

At my current company we store lots of timestamps; more specifically, they are usually global timestamps, so clearly we needed to store the field as UTC. Why? Because we have dozens of SQL servers, which means we need a way to make sure that no matter where the servers are, the timestamps are always readable and understandable. Given this situation, it has always been our approach to store the times in UTC. The biggest player in this is that JDBC attempts to convert timestamps for you so that you are always in local time. But what happens when you don't want to convert to a local time? For example, say you have two databases (for failover reasons) and two separate webapp servers that read from those databases (again for failover, except that the database primary is not bound to the web application primary). This means you could have two sites, one in India and one in Indiana, with the India web app server reading from the Indiana database; so the conversion is never a constant thing. To add difficulty, let's assume that you have a group within your organization with enough pull to insist that these times remain in UTC, since that is what they are used to dealing with.

The Theory

Our solution starts by stating that the data in the database must always be in UTC; let's assume the database servers themselves are set up in UTC. Digesting the situation, we can look at this in two parts: the first is storing the timestamp in UTC; the second is retrieving the timestamp and converting it back into UTC from local time (remember that JDBC is going to convert it to local for us).

How to store the timestamps in UTC?

The first thing to know is that there are two ways we can get a timestamp to store. The first is to create a timestamp representing now; for this, we can just use a Calendar and store that. Simple enough: it will be converted over, so it ends up as now in UTC. The second is to create a timestamp based on input from a user, and here is where the difficulty lies. Let's assume our user wants to store "2012-02-02 00:00:00", based not on local time but on UTC. We can accomplish this two ways, both involving a SimpleDateFormat object used to parse the time. The first way is to append the "z" format symbol so the timezone is read from the input string; the key is to just append the "UTC" string before parsing.
SimpleDateFormat obj = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss z");
Date date_in_utc = obj.parse(str + " UTC");
So what is the other option? Well, just set the timezone on the formatter, of course.
SimpleDateFormat obj = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
obj.setTimeZone(TimeZone.getTimeZone("UTC"));
Date date_in_utc = obj.parse(str);
So why would I go with either of these options? They both have their pluses and minuses. The first option gives us flexibility: say we normally use UTC but want to allow the user to specify their own timezone; here we have that ability. But it's slower, since we're compiling the SimpleDateFormat and appending the "UTC" string on every parse. The second option means we can compile the SimpleDateFormat once, with its UTC timezone set, and not do any string appends when we want to perform the conversion.

How to get the timestamps in UTC?

Remember, we said the timestamp is always stored in UTC and JDBC is going to convert it to the local timezone on the way out. So, knowing the Date we get back is in local time, how do we deal with it? Let's just look at the display, and assume we're going to display the time in UTC. How do we do it? The same way as in the previous step, except instead of calling parse() we call format() on the Date, as shown below.
SimpleDateFormat obj = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
obj.setTimeZone(TimeZone.getTimeZone("UTC"));
String date_in_utc = obj.format(date); // date is the java.util.Date read via JDBC

Thursday, July 5, 2012

I can't get anything done (how you're doing more than you think)

I can't get anything done

I feel this almost every day now. I just feel like I cannot accomplish anything during the day. To be honest, I have three or four different projects going at any given time, and every day I go home feeling like I haven't actually made any progress. Most of my time is spent either answering questions from people or teaching them concepts and ideas that I've learned over time. Most people stop here and get angry that they just aren't accomplishing anything; most people begin writing blogs and tweets about how they aren't getting anything done. The thing is, you're actually doing more than you think!

How you're doing more than you think

Obviously there are instances where this isn't the case; but if you're like me, answering questions and helping other people learn concepts and ideas you've picked up, you're actually doing more than you think. Many people know the concept of the "force multiplier": if you train N people to work at 50% of your capacity, you are effectively contributing an extra N*0.5 of your own output. This is pretty straightforward, but it's important to reflect on and realize what it means.

Why am I writing this?

I think many times we, especially as developers, get caught up in the idea that we must be programming to accomplish anything. Yet sometimes it's the knowledge we disperse that makes us more valuable than just churning out large chunks of code all the time. The next time you think you're not getting anything done, take a minute, think about everything else you did, and reflect on the fact that you're getting more done than you think!

Monday, May 7, 2012

Why Service Oriented Architecture Doesn't Work Internally

I know I can hear a bunch of clicky keyboards starting to clack as tons of readers compose responses on their Model-M keyboards to tell me I'm a complete moron. That might be; but before you do, let me at least make it through my point. I should add one caveat here: I'm not saying that SoA doesn't work, period. Instead, I'm making the point that using SoA for your internal applications does not work. I'll start by explaining my current situation, where we are converting to SoA and how it is proving a complete failure for our internal software. Next, I'll explain why SoA does not work and how the problems SoA attempts to fix have been superseded. Finally, I'll explain a better architecture, which does contain SOME SoA for external clients and allows for faster response times for applications.


As promised, first I will explain my current situation. We have a MySQL data store on the backend and up to 100+ applications pounding on it at once. Our MySQL storage is sharded quite a bit, making up 63 hosts, 33 of which are replicated in a master-master scenario. There is an average of about 166 databases on each host. That's a ton of databases, and a ton of data on the backend; on average there are about 10,000,000 queries per day across all the databases/hosts. Many of the databases could be normalized further (or normalized for the first time) and streamlined to take better advantage of what MySQL does well.


Now that we've begun to understand the architecture, let's look at how SoA is being implemented. We're beginning to abstract many of the old database queries into RESTful service calls, one of the newer and "better" architectures. This sounds all good and dandy, so let's examine what SoA is supposed to accomplish. SoA is implemented to abstract the data layer; this allows the data layer to change without having to change the applications which are talking to the database, including performing DDL or DML changes without affecting the applications.


So why is SoA failing for us? It's fairly simple: we are trying to emulate our MySQL queries in a midtier application. Why is this bad? Easy: we aren't in the business of making a SQL implementation, let alone a high-performance one. Why is that the case, though? Think about what an SoA layer does: many times you will need to build XML or JSON output from the MySQL rows returned, and in some instances also perform validation or extra data extrapolation. So when we pull down 30,000 rows, our service layer might choke on the memory requirements. Fine, how about we just return the first 300? Well, then we need to sort, so the sort gets done in SQL; but how do we get rows beyond the first 300? What happens if we don't want the default sort order? You can see how this quickly becomes a giant mess, and you begin to create a DSL (Domain Specific Language) built from a DTD in order to do exactly what SQL has already done. And what if you need a count? Oh, there is no count; instead you perform your RESTful GET and count every element. How much data was processed just to receive a single integer?


Let's look at the count(*) example. The times in ms are simplistic, made-up numbers, but they show the shape of the problem. First, counting through the service layer:

Step                                   Time in ms
Read all elements into SQL memory      0.01
Send all elements to mid-tier          0.01
Sort data on mid-tier                  0.02
Transform mid-tier results to XML      0.03
Send XML to client                     0.02
Count all XML elements                 0.03
Total                                  0.12

Now let's assume that the query is for an InnoDB engine on MySQL which means that a count(*) must count every row in the table.

Step                                   Time in ms
Read all elements, counting as we go   0.01
Send result to client                  0.01
Total                                  0.02

Clearly this is not effective and we should probably look at another example to see when it does work. Let's assume that our client wants all records (200) in a table containing tuples that are 120 bytes per record.

Step                                   Time in ms
Read all elements into SQL memory      0.01
Send all elements to mid-tier          0.01
Sort data on mid-tier                  0.02
Transform mid-tier results to XML      0.03
Send XML to client                     0.02
Total                                  0.09

Now let's look at this if we just queried the database directly.

Step                                   Time in ms
Read all elements into SQL memory      0.01
Send all elements to client            0.01
Total                                  0.02

Wow, there is quite a large difference here. This is actually one of the many problems we are facing attempting to do an SoA. We are also facing the issue of memory needs. For example, assume we have our 200 records at 120 bytes per record; that comes out to 24,000 bytes. Not a huge deal. But what happens when 100 clients perform these same types of queries, each searching by a different variable, so caching gets us nothing? Now we're at 2.4MB of data being served, and that doesn't include the extra XML overhead that has to be sent to the clients, or the database handles, etc. If we increase from 200 records to, say, 500, the storage needed grows to 6MB. Going forward, given a steady rate of record creation, we can see that we will eventually run out of space as we try to cache records while providing fast access to data.


Our problems stem from exactly these issues; we've seen our service layer choke and die under client load. Hundreds of queries per second looking for hundreds of records per second cause these services to choke, not because the services are written poorly, but because they are not designed for high performance the way an SQL database such as Oracle or MySQL is. Now let's look at the decoupling aspect, and at some of the problems SoA is supposed to fix compared with connecting to the data store directly.


Many proponents of SoA say the biggest thing you get is the decoupling of the data storage from the application. But let me ask you this: how often are you changing your data store? And even if you are, isn't that the point of JDBC, to abstract the SQL connection itself? Not to mention that if you do, you'll be updating your service layer anyway, so you're not actually saving anything. When you make DDL changes, what do you need to do? Update the application layer to take advantage of the new fields. If you have an SoA, what do you need to do? Update the service layer to take advantage of the new fields. Granted, in the service layer you can provide default field values so the application doesn't need to be updated on DDL deployment. But if we used default values in the database, don't we end up with the same functionality?


Now here's where, in my opinion, SoA really starts to make sense: the service layer can run checks before the data store to ensure the data is valid. This makes complete sense, right? We can ensure that our data is always valid. Unfortunately, this same functionality exists in a database; this is the purpose of things like domains and triggers. Again, the functionality already exists in the data layer and is designed to perform faster than anything we can program ourselves. But what about being able to combine tables and show data in a concise manner while keeping the underlying tables separated from each other? For example, maybe we have a customer table containing some information that we want to bring into another table. If we implement SoA, we can just query both tables and return a data type that continues to look like our original table. But data stores have this taken care of as well; they are called views, and they allow exactly this kind of functionality.


So what is a better architecture? Clearly connecting directly to the database is not a good option in all cases; specifically, you don't want a Web UI doing raw database queries from some AJAX client. What you really want to do is create a library that contains all of the functions that perform the appropriate database queries. All of your applications (including your web app) should include the library and make the calls directly. Note: when I say web app, I am talking only about the back end, not the front-end HTML/JavaScript portion of the UI. At this point, you can take that library and wrap it in its own service layer to allow external clients to perform queries as they see fit. The idea is that you don't degrade performance and UX for your application (and subsequently your clients), but still provide some functionality externally to clients who may want to roll their own UX. This gives the high performance required, since the functions connect directly to your data store, yet you can expose whichever functions you want externally; it also means the service layer can be updated at the same time as your application. So where do AJAX calls fit in? This is where your SoA is necessary: you need the SoA to allow for external connections. The sketch below shows the shape of it.
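
A rough sketch of the split, in Scala for brevity; every name here is hypothetical, and the query bodies are elided:

import net.liftweb.http.rest.RestHelper

case class Customer(id: Long, name: String)   // stand-in record type

// One library owns the queries; internal apps link it and call directly.
object CustomerLib {
  // Direct, prepared-statement queries live here...
  def find(id: Long): Option[Customer] = sys.error("query elided")
  // ...and aggregates are pushed down to the database as SELECT COUNT(*).
  def count(): Long = sys.error("query elided")
}

// External clients only ever see a thin service wrapper over the same functions.
object CustomerApi extends RestHelper {
  serve {
    case "api" :: "customers" :: "count" :: Nil XmlGet _ =>
      <count>{ CustomerLib.count() }</count>
  }
}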


The idea is simple: you treat data requests as either internal or external. If they are internal requests, then you include your library and perform the query through the built-in functions. If they are external requests, then you only allow a query through the SoA. Here are some examples of each type.

Internal

  • In-house application
  • Service layer itself

External

  • 3rd party internal applications
  • Clients who want an API

I know it seems counterintuitive, but sometimes we have to step back and ask why a specific architecture is better, not just assume that it's better because it "seems cool" or "because everyone else is going to it."

Thursday, January 12, 2012

Headphone Woes

One of my favorite things to do while coding is listen to music. Most likely my library is a topic I will cover in a later post; for the time being, just know that it's filled with almost everything. Regardless, I've fallen into a very bad situation with my current headphones. After lasting for 5 years, my pair of Bose over-ear headphones has fallen apart. The rubber surround has completely disintegrated, and I'm left with headphones that look like they've gone through a war, the surround exposing the foam underneath. Needless to say, I need to replace my headphones.



Of course, I'm very picky about my headphones and how my music sounds; I play guitar and am very picky about my tone being exactly what I want it to be, or as close to the original recording as possible. Most audiophiles like to have their music reproduced in specific ways: as close to the source as possible, with flat equalization, wide frequency response, etc. Me, on the other hand (and I think this is one of the things that makes me a fun musician), I just want it to sound good. I don't really care whether, when the record was recorded, the treble was completely even with the bass. I just want a very enjoyable experience; I like completely zoning out to my music (both when I'm playing and when I'm coding) and just letting my mind be fully encompassed.

Of course, this doesn't work so well now that my headphones have died; as such, it's time to buy some new ones! At home, I have a pair of Beats by Dr. Dre Studios.



That's $300 for a pair of damn headphones, which is definitely expensive, but until you actually hear these things you will not understand just how well they take you to your own little happy place. The Bose headphones I previously had were about $100, and the price has now increased to $150. The problem is that I'm not sure I want to spend that amount on another set of headphones that will last for 5 years.

So, what other headphone choices do I have? The thing that I've seen take over are earbuds, but I've never been a huge fan of them. They are always uncomfortable in my ear, and they're not as easy to pop out as headphones, which let me go back into reality at work. Now, there are other makers such as Sony who make a decent pair of headphones. The headphones I had for my entire college career, and into my professional career before I got the Bose, were some simple $15 behind-the-neck headphones, which were fabulous. They worked, it was easy to slip one ear off, and they were fairly comfortable. They also had a pretty good bass response. For $15 you really couldn't beat them! Plus they fit into my backpack, and I could crunch them and they just stayed alive for years.

So here's my math: if I buy the $150 headphones, I know they will last 5 years at a maximum (I don't believe the quality has gone up), which means they will cost about $30 per year. If I were to go with the behind-the-neck headphones, I could actually replace them twice a year and still spend the same amount over time as the one-time cost of the Bose. So the big question is: are the behind-the-neck as good as the Bose? My answer is no; the Bose are fantastic at blocking out all the noise around me (even though they are not noise cancelling).

Now, Sony does have some other headphones closer to what I'm looking for (over-ear) which also include noise cancelling. I've tried the on-ear version of Sony's noise cancelling line and been quite underwhelmed by the overall sound quality.



So I'm not sure the over-ear version would turn out any better than the on-ear headphones. What about Sennheisers? Well, I would be all over Sennheisers if I were running a music studio and needed to hear exactly what the instruments were putting out, with absolutely no equalization in my headphones.

I've been seeing Skullcandy around, but I have yet to actually explore them. My first instinct is that I dislike them; their over-ear headphones have a triangular look and feel. I've tried them on, and they always feel really odd on my head, so I don't think I would be comfortable in them for an extended period. So this leaves me with the dilemma: do I buy some Sonys that may or may not underwhelm me with their sound, some cheap Sonys that I will be replacing in the near future, the exact same type I have now at an increased price, or do I spring for an extra $50 and get some on-ear Beats that I know will sound good, even though I'm not a huge fan of on-ear headphones?

EDIT 1/16/2012:
Broke down and bought a pair of the Beats Solo (HD) headphones. Great sound quality, and although they have no noise cancelling and are not over-ear, they do a good job of drowning out background noise. The on-ear fit is not that uncomfortable either; they're actually pretty small and come in a nice carrying case. Looking forward to another 5+ years of headphone usage.

Thursday, October 20, 2011

Computer Science = Math

Today at work, I realized that I needed to draw a string at the mid-point of a 2D line. So of course I googled "mid-point of a line" and got the equation below.

midpoint = ( (x1 + x2) / 2 , (y1 + y2) / 2 )

Once I had this formula I plugged it in; but of course I then wanted the string in a different place. What if I wanted to place it at the quarter mark? I searched but found nothing, so it was up to me to figure out the mathematics behind it. Well, it's pretty simple, right? Since x2 is the x for the end point, if we plug the previous equation in for x2, clearly we get the midpoint of the midpoint. This is as shown below.

x = ( x1 + (x1 + x2)/2 ) / 2 ,  y = ( y1 + (y1 + y2)/2 ) / 2

Of course, this is a nightmare to look at; we need to simplify the equation. We'll simplify only the x parameter, since I don't feel like showing both x and y and they are pretty much the same. The first thing to do is to split the equation a bit so that we can begin to simplify.

x = x1/2 + ((x1 + x2)/2) / 2

Now, we bring down the fraction on top of the second fraction and combine the two denominators.

x = x1/2 + (x1 + x2)/4

Now we need to find a common denominator (easy to see it's 4), so we look at the first fraction and get its denominator to 4. To do this we can multiply the denominator by 2; and in order to do that without changing the value, we must multiply by 1, which means multiplying the whole fraction by (2/2).

x = (2/2)(x1/2) + (x1 + x2)/4

x = (2*x1)/4 + (x1 + x2)/4

Now we combine the two fractions which leaves us with the single fraction below.

x = (2*x1 + x1 + x2) / 4

At this point we just combine the parameters to end up with the final equation for finding the mid-point of a mid-point.

x = (3*x1 + x2) / 4

Now here is the thing; what if I wanted to get the midpoint of the midpoint of the midpoint? Well we start off with the main function.

x = ( x1 + (3*x1 + x2)/4 ) / 2

Again, we split the fraction in half, bring down the second fraction and find a common denominator.

x = x1/2 + (3*x1 + x2)/8

x = (4*x1)/8 + (3*x1 + x2)/8

Now that we have a common denominator, we can go ahead and combine terms to get the final fraction.

x = (7*x1 + x2) / 8

So here was the coolest part: do you see the pattern? Below is the general fraction; n is the number of times you halve the line. n = 1 is the point at 1/2 the line, n = 2 is the point at 1/4 the line, and so on.

x = ( (2^n - 1)*x1 + x2 ) / 2^n

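Since this is a programming blog, here's the pattern as a quick Scala sketch (the function name is mine):

// Point 1/2^n of the way from (x1, y1) to (x2, y2), using
// x = ((2^n - 1)*x1 + x2) / 2^n on each coordinate.
def fractionPoint(x1: Double, y1: Double, x2: Double, y2: Double, n: Int): (Double, Double) = {
  val d = math.pow(2, n)
  (((d - 1) * x1 + x2) / d, ((d - 1) * y1 + y2) / d)
}

fractionPoint(0, 0, 8, 8, 1)   // (4.0, 4.0): the midpoint
fractionPoint(0, 0, 8, 8, 2)   // (2.0, 2.0): the quarter point
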
So what does all of this mean? Why am I bringing math into this blog? Because sometimes we as computer scientists need to be reminded that mathematics is important in our day-to-day lives. I've met way too many developers who consider themselves programmers yet, if there isn't already an equation to express something they need, just assume that because no one has done it, it's not a worthwhile thing to do. Either that, or I'll see them write some really terrible workaround.

For our example above, the "good enough" approach would be to compute m (rise over run) and step along from point A toward what they think should be point B, then call it done. If you're just doing that, maybe you need to rethink what you are actually doing.

Another thing: mathematics supports recursion in every way. Algorithms are often much simpler to express recursively; however, most programmers shy away from recursive algorithms. As a matter of fact, the main complaint I hear is that they are just too difficult and "make my brain hurt." Maybe I'll write a blog post about recursive algorithms, just to show how easy they actually are to write. The main reason they were historically no better than iteration is that they required "calls" into methods/functions; however, many newer languages support recursion well, making it as good as (and often cleaner than) iteration. This is usually tied to values vs. variables, which I should cover at a later date.

My point is that you should never be scared of algorithms or of doing mathematical work. If you are, you are narrowing yourself down to a typist who is paid to implement other people's ideas. As functional programming becomes a larger paradigm, it will become important to learn how to actually work within it. If you don't believe me, go check out Erlang and try to write a very simple application. Not a "hello, world": what about something with an accumulator? What about a general for loop? You'll quickly see that recursion is an important part of functional programming life.

Monday, July 25, 2011

(*function) pointers

OK, now that we've covered some really cool things like pointers and double pointers, it's time to step things up a notch and look at function pointers. Let's start off with the two main questions that come along with learning function pointers: what is a function pointer, and why do I care about one? So what IS a function pointer? Well, think back to what exactly a pointer IS: a pointer stores a memory location (remember the previous post, Pointers with *, &, ->, . ).

Now that we're all acclimated back into the world of pointers (or some of us are more confused), let's proceed! The thing we need to keep in mind is how applications work. I'm really not going to get into all the theory of program counters and heaps/stacks, because it's really boring and actually causes more confusion than is necessary (yes, when working in pointer~land a certain level of confusion is ALWAYS necessary). I'm also not going to get into debugging; as such, I'm going to lay down some B.S. that looks cool but still makes sense.

So, a function exists at a point in memory, as the start of an execution scope. I know what you're thinking: "what in the hell does that mean?" Well, it's pretty straightforward: every function is a separate scope, meaning the variables that exist in function A are not available in function B (unless of course we pass them between each other). As such, there is always an entry point to the function; if you are familiar with gdb, you can see this happen with the "call" OP. If you didn't understand that, don't worry, it's not a huge deal; just know that there is a single point where the function starts, and THIS is the pointer to the function. So let's assume this:

0x01 start function
0x02 do some work
0x03 do some more work
0x04 do even more work
0x05 return


So when we define a function pointer, it will take the memory address (remember, pointers contain memory addresses) of the start of the function. From our example above, it will contain 0x01. So the question is, how does this actually work? Well, technically speaking, it doesn't. OK, you're a piece of shit, why are you telling me things that don't work!? Hold on, calm down; it's not that it doesn't work, it's that the compiler MAKES it work. See, the compiler knows when you are using a function pointer; as such, it knows to emit the actual function call. It's not really imperative to know how this works internally; it's more important to know that calling through the function pointer is essentially the same as calling a normal function.

OK, so that's tons of talk about what function pointers are, but how are they defined? Well, it's pretty straightforward; the main thing to remember is that the (*name) part is what marks the declaration as a function pointer; the rest looks like a normal function prototype. Huh? That makes no sense! Sure it does. Let's look at a function pointer for a function defined as:

void function(int i, void *ptr);


So what does the function pointer look like? The name of our variable is 'var':
void (*var)(int, void *);


That's it? Yeah, not that bad at all, right? So how do I accept these types of pointers into another function? I mean, what happens if I want to accept a function to call, so that the execution is determined at run time? Here the variable name of the function pointer is 'func':

void my_function(void (*func)(int, void *));


Alright, now we're cooking with gas! So here's the next question: how do I execute this damned thing!? Well, there are two ways; first, the syntactically explicit way. Let's assume that we have an integer i and, of course, our great friend NULL (although this could be any pointer), and that we are executing a function pointer named 'ex' declared like our first example:

(*ex)(i, NULL);


So I mentioned that this was the syntactically explicit way to execute the function pointer; is there a different way? Yes, actually, there is! You can execute function pointers just like you would a normal function. WHAT? Why the hell would you teach me this crazy-ass syntax!? Because if I didn't, you would see it somewhere and shit a brick trying to understand what in the hell it actually meant. Assume we have i as an integer again, and now we're executing 'func':

func(i, NULL);


So much easier to understand! Yes, it is. So now that we understand the syntax, let's cover a few examples of why anyone would want a function pointer. Think about arrays, or linked lists (if you don't know about linked lists, just relate back to the array idea and stick with it). What happens if you want to write a sorting routine (yes, I realize that everyone and their brother has done this, but hey, it assists in understanding the concept, so calm down and listen up) and you want it to be flexible for every possible sorting style? What does this mean? Well, let's say I have an array of structs and a specific way I want to sort it. What happens if I want to sort it differently later; do I write a whole new sorting function for each ordering? NO! We use function pointers!

By using function pointers, we can pass a comparison function to the sorting routine, which will be called on pairs of elements to determine their order. What does this mean? Let's say we have a function definition:

void sort_my_shit(MyStruct *arr, int len,
                  int (*compare)(MyStruct *s1, MyStruct *s2));


That's freakin' AWESOME! But I hate this syntax; is there anything easier to write? Of course: if you remember, in C there is this thing called typedefs. So let's look at the example that we've been checking out, but with a typedef:

typedef void (*MyFunctionPointer)(int, void *);


What the hell, you just repeated that thing from the first example with the 'ex' function pointer! You are correct, but I did add the typedef to the beginning. Really, this is it? How would I use this to define a function pointer? It's pretty simple; at this point you just use 'MyFunctionPointer' like a normal type:

MyFunctionPointer our_function;


WHAT? Are you freakin' serious!? You've been making me type in all this junk when I could've just typed one line and made life so much easier!? Yes; again, I'm just trying to get you used to the syntax, because assholes like me like writing out the function pointer definitions long-hand rather than using typedefs, since I hate seeing a million different type names residing everywhere. Get used to it... So how do you accept the function pointer?

void newFunction(MyFunctionPointer our_function);


OH MY GOD YOU ARE AN ASS!! Yes, I am; I've been showing you tons of difficult syntax, but guess what: when you stumble across the long-hand syntax you'll be glad that I brought you into it, because now it makes a little more sense to you. If you think about the command/strategy/state patterns of OOP, you can actually implement these types of patterns THROUGH function pointers. Maybe I'll cover that in my next post; but for the time being, check it out: you can start to make partial classes through the use of function pointers.

Wait, I can make C more object-oriented? Yes! Isn't that awesome? The only problem is that you have to do some work to get it there. Check this struct out.

typedef struct MyClass {
    void (*func)(struct MyClass *this, int var);
} MyClass;


Now you call the method (assuming that our MyClass variable is cl and we have an integer i):

cl->func(cl, i);


AHHHHHHHHHH, HEAD 'SPLODE!!!!!! Calm down; this is one of those things that people freak out about. Just think about this: if you actually run gdb on a C++ application and break inside a method of a class, you will see almost the SAME thing, where 'this' is actually a method parameter (automatically passed in by the compiler during execution). Pretty interesting, right? I thought so. Just remember: pointers are NOT that difficult to understand; they are hard to master.