Brandon Werner

Archive for the ‘Java’ Category

Seattle Heat, Software Transactional Memory, Cloud Computing Storage, and me

Saturday, August 16th, 2008

Here in Seattle we are in the middle of a mini heat-wave, which usually means to our sensitive and water worn skin that the temperature is above 80 degrees. However, we’ve run in to a stretch of 90 degree weather these last few days - and since Seattle seems proud of its lack of air conditioning it’s much worse here than other places that get that hot regularly. In Cincinnati you would go from air conditioned homes to air conditioned cars to air conditioned office buildings. Here that isn’t the case. The situation is slightly worse in Redmond since the farther away from the Puget Sound you get the hotter the air. The best you can do is go out for the day, as my condo and most houses don’t have any air conditioning at all. Worse, all but the wimpiest indoor air conditioners are banned due to decree from the Home Owners Association, which I have come to suspect is a front for the Communist party. After all, the McCarthy trials were happening about the same time suburbs were getting built in the 1950s. Where were the poor communist to hide?

I’m not saying, I’m just saying.

Summer Reading While The Boss Is Away

If you are elsewhere in the country in late August, chances are you are acting like Seattle folks do the rest of the year - taking refuge inside and away from the weather outside. While you are in your air conditioned home waiting for the last and hottest part of summer to pass, why not do some good reading to prepare for the fall when everyone returns from vacation and you get back to the serious business of deadlines, programming and of course geeky arguments about the topics of the day. Here is a good reading list to bookmark.

Get up to speed of Generic Programming, or Programming In General

My first recommendation is the collected papers of Alexander Stepanov, which you can get from his website entitled… Collected Papers of Alexander Stepanov. For those who don’t know, Stepanov is the key person behind the C++ Standard Template Library, which he started to develop around 1993. He had earlier been working for Bell Labs close to Andrew Koenig and tried to convince Bjarne Stroustrup to introduce something like Ada Generics in C++. His papers are a treasure of thought on generic programming, logic, robotics and anything else that made you turn to the Computer Science page in your university’s catalog. Best of all he also provides slides for his book in progress, written with Paul McJones, called Programming Elements. This is a great book for refreshing your knowledge of abstract and concrete concepts in quick and easy powerpoint format. Just take a look at the table of contents and I dare you not to click on at least one of the Chapter links. Don’t worry, I won’t tell.

The “Core” Debate of the Community: Concurrent Programming and Software Transaction Memory

Yes, the pun was bad. It does however illustrate one of the facets of the problem that is burning up academic and commercial researchers alike, and responsible for a large amount of papers flooding the ACM portal: Software Transactional Memory (STM). Well, actually, that’s a possible answer to the problem - not the problem itself. They are often confused now. The problem is that since Intel and AMD have decided to start introducing more cores on to single chip we have to deal with the big problem that comes along with that: managing the threads of multiple cores trying to do the same work on behalf of the system it’s working for. It also scales in to bigger problems of any type of work you may want to farm off to “locales” that may need to cross boundaries and work on the same data within a transaction (for more information on some of this, see my post from the Google Scalability Conference regarding Cray’s work to replace MPI with a new concurrent language Chapel and the GIGA+ filesystem below)

I think Simon Peyton Jones from Microsoft Research in Cambridge illustrates it best in his paper Composable Memory Transactions(PPOPP’05) :

The dominant programming technique is based on locks, an approach that is simple and direct, but that simply does not scale with program size and complexity. To ensure correctness, programmers must identify which operations conflict; to ensure liveness, they must avoid introducing deadlock; to ensure good performance, they must balance the granularity at which locking is performed against the costs of fine-grain locking. Perhaps the most fundamental objection, though, is that lock-based programs do not compose: correct fragments may fail when combined. For example, consider a hash table with thread-safe insert and delete operations. Now suppose that we want to delete one item A from table t1, and insert it into table t2; but the intermediate state (in which neither table contains the item) must not be visible to other threads. Unless the implementor of the hash table anticipates this need, there is simply no way to satisfy this requirement. Even if she does, all she can do is expose methods such as LockTable and UnlockTable – but as well as breaking the hash-table abstraction, they invite lock-induced deadlock, depending on the order in which the client takes the locks, or race conditions if the client forgets. Yet more complexity is required if the client wants to await the presence of A in t1, but this blocking behaviour must not lock the table (else A cannot be inserted). In short, operations that are individually correct (insert, delete) cannot be composed in to larger correct operations.

The most that has come out of this is that we know it’s a problem and we’d love to use the keyword “atomic” to wrap our transactional code in our languages. Beyond that, it’s a lot of hand waiving and Powerpoint slides. Some people though are actually trying to work it out. The best starting point here are the papers from the before mentioned researcher Simon Peyton Jones. His collection of papers on STM offers a good starting point of the problem and what some possible solutions are. In his papers he uses Haskell, and his work has led to Concurrent Haskell. Haskell lends itself to STM for reasons I won’t go in to here, but it will be quite a bit more of a challenge to get the same functionality in Java and C#, but there is already an API for C# Software Transactional Memory from Microsoft Research you may want to explore.

If you don’t care about this, just don’t go naming classes atomic and you should be fine.

Storing The Cloud: How Do We Scale?

Solid State (read: Flash) drives aren’t the only thing showing the age of our old file system technologies. As we expose software as services and begin taking on large numbers of tenants for our software, cloud computing needs clusters with thousands of nodes that, with the multi-core technology mentioned above, will impose a challenge for storage systems. We will need the ability to scale to handle data generated by applications executing in parallel in tens of thousands of threads. There have been some solutions posed, such as IBM General Parallel File System (GPFS) and Microsoft Research’s Boxwood technology.

I was lucky enough to watch a presentation on GIGA+, another solution that is being researched by Swapnil V. Patil at Carnegie Mellon University. One of its neatest ideas is leaving the header-table behind, using a bitmap instead. I got to sit down with him afterward and talk about the challenges we face in this space. It was a great time. His primary concern about GPFS and Boxwood is the use of hashing and B-trees, which causes the possibility of bottlenecks and synchronization issues. By using a bitmap, and keeping it small so that it can be shared across nodes easily, GIGA+ eliminates a need for “metanodes” or other controllers on the HPC storage architecture.

His paper, GIGA+ : Scalable Directories for Shared File Systems, is a great read for those interested both in the problem of high-performance computing and storage. Their work seeks to maintain the UNIX file structure however, so those who care about scaling Microsoft infrastructure may find less to enjoy, but the overall architecture and problems outlined in the paper is applicable to any massively large storage cluster technology.

Enough Already

That should be enough to get you through August. When your boss comes back from his Alaskan cruise, nothing will ensure he leaves you alone more than talking about Concurrent Haskell or how much you enjoyed Chapter 9 of Programming Elements: Algorithms on increasing ranges. Enjoy the air conditioning you lucky bums.

Thoughts On Google’s Conference on Scalability In Seattle

Monday, June 16th, 2008

Google Scalability Conference LogoIf you are looking for a good collection of notes regarding the topics covered at the Seattle Conference on Scalability, you can do no better than what James Hamilton put together. Instead, I’ll write a quick commentary on what I experienced.

Scalability Is Your Problem Too

The goals of the conference are laudable. Scalability is an issue that almost all practitioners of software engineering face, especially as we move towards offering services both inside and outside the enterprise. Many are taken off guard by the sudden issues that confront them after wiring up a large scale services-based environment; especially around distributing load, distributing the data, and writing the data quickly. Sadly, I didn’t see too many people from large companies there - most were software companies like Microsoft, Google, MySpace and Amazon.com. The attendance may be a consequence of the subject matter. This was some intense stuff dealing with MPI at Cray and its hopeful successor, Wikipedia redone with DHT and Erlang, a b-tree vs. Hashmap debate and scalable storage issues when dealing with billions of files. A more fun loving person would have done better going over to Adobe and hanging out at BarCampSeattle, which was going on at the same time.

Despite the intimidating material, there are real architectural and design issues that these discussions present that should be in the mind of anyone dealing with large datacenters that scale globally or even nationally. The approach of GIGA+ file storage, maidsafe’s new computer architecture, and NetWorkSpaces for the R language was uniform: off-loading responsibility for management of data (meta or otherwise) to all vertices in the deployment graph instead of a central repository. NetWorkSpaces in R and maidsafe even discussed computational scalability - while Cray’s new Chapel language and the discussion around Software Transactional Memory focused on scalability across processing cores as well as machines.

GIGA+ Bitmap Example

GIGA+’s approach of maintaining a small bitmap file on each node and passing that around - while anticipating and accepting stale data on a few edge nodes - was brilliant in the patterns it hinted at, including that perhaps being right all the time isn’t as important as being fast. You can be right most of the time and accept the performance hit of not being right some of the time. There are many people who would cringe at this, but at this point we’re going to have to play loose and leave a few balls up in the air as we juggle - doing the math of how often one may fall while keeping the rest going as fast as we can.

Pay No Attention To The Man Behind The Curtain

Yet if I had to sum up the content of the conference I would say it was big on strategy and architecture but short on implementation. There was a lot of things hinted at “behind the curtain” but nothing assured hand raising from the compsci geeks in the room more than hand waving when you got to the distributed piece of your solution. For instance, one of the big benefits of Chapel - the MPI successor that Bratford Chamberlain of Cray presented - was that you could have distributed arrays and graphs that would be automatically sliced up to be distributed to parallel cores or even other “locales” if desired. How the language determines where to split these large arrays and graphs and farm them out was not discussed. One of the more interesting slides was dashed lines drawn across various nodes and vertices of a graph symbolizing how it would be chopped and distributed. Someone in the audience raised their hand at this - but he moved on and the hand went back down. To be fair, Chapel was called a “multi-resolution” language where one could start fairly abstract and then add more detail and control to get the best desired result - something I assume you have to do to get good or intelligent chopping and distribution of the data. Given that one of his slides was a comparison of code lines between Fortan using MPI and Chapel: seeing a working code snippet of Chapel would have been helpful. It may turn out to be the same amount of work after you get past the “global view”.

This was the trend though, as all of the presentations had a bit of hand waving regarding performance metrics and distribution of computation. This was highlighted by the talk of Vijay Menon of Google - whose work at Intel I was familiar with - discussing Software Transactional Memory. He illustrated the challenges of implementing this in an imperative language (I’m suspicious you can even do STM well in an imperative language with state - as I discussed before) but beyond suggesting the keyword “atomic” to replace “synchronized” in the Java language there was very little real content discussed for those already familiar with the issue of locks and multiprocessors. Concurrent Haskell wasn’t even mentioned. A better introduction and discussion is to be had by watching the O’Reily’s OSCON video from Simon Peyton-Jones (the writer of GHC and now at Microsoft Research) on the subject. After that, if you’re still hungry, his collection of papers on his Microsoft Research site is a delight.

Of course the point of these conferences is the discussions that occur during the breaks and in the networking event afterwards - something that I treasure having newly moved to the Seattle area from Cincinnati. Instead of just observing and blogging from afar - I get to be at the same table as Vijay Menon, Thorsten Schuett, Swapnil Patil, Paul Watson and others.

Summary of the Architectural Patterns I Saw

If I had to summarize what I took away from the conference from a high-level architectural stand-point, here are they are:

  • Every node must be aware of the state of every other node without a centralized controller.
  • To do this, a mechanism should be in place to share state quickly but peer-to-peer.
  • It’s ok to let some nodes go stale.
  • Client/Server is now one thing. Pub/Sub with computation. Every node on the graph should do work.
  • As much as possible, each node should maintain its own security and state. You should be able to have anonymous resources appear in your data center and be put to use without much configuration.
  • As much as possible, abstract the distribution of processing away from programmers.
  • Key,Value with Hashes are best for scalability and distribution (it seems to have won out in all the solutions presented here.) Blame MapReduce.
  • Ants can be used to demonstrate anything.

I hope everyone had a good of a time as I did.

ACM Article: How Intuitive is Object Oriented Design?

Saturday, May 17th, 2008

There is an incredible article that was published in the Communications of the ACM entitled “How Intuitive is Object Oriented Design?” by Irit Hadar from the University of Haifa, Israel and Uri Leron from the Israeli Institute of Technology.

It goes through the process of examining the disconnect between intuition and OO design for engineers and software designers.

The object-oriented programming paradigm was created partly to deal with the ever-increasing complexity of software systems. The idea was to exploit the human mind’s natural capabilities for thinking about the world in terms of objects and classes, thus recruiting our intuitive powers for building formal software systems. Indeed, it has commonly been assumed that the intuitive and formal systems of objects and classes are similar and that fluency in the former helps one deal efficiently with the latter. However, recent studies show that object-oriented programming is quite difficult to learn and practice. In this article, we document several such difficulties in the context of experts participating in workshops on object-oriented design (OOD). We use recent research from cognitive psychology to trace the sources of these difficulties to a clash between the intuitive and analytical modes of thinking.

It is currently hidden behind the ACM referred library portal but if you are an ACM member you can access it here.

Fun Lunch Time Distraction: Calculate your googleshare influence

Wednesday, April 30th, 2008

Slurping on some delicious vegetarian chili from the lobby of Safeco Plaza in Seattle, I was just introduced to Googleshare from Atomiq by The Google Operating System post on The Informational Distance Between Cities.

Essentially, it is the (number of search results that contain your name inside search result x / number of total results in search result x = your googleshare (e.g. mindshare))

So, for something you might be known for, like “Java”.. here are the results when ran against my name:

“Brandon Werner” inside “Java” search results = 1,350 (link)
(divided by)
“Java” search results total = 416,000,000 (link)

or

1350 / 416000000 = 3.25.

So my mindshare of the Java market is 3.25%. Not bad at all.

Now, some more mindshare results for you:

Brandon Werner/IBM = 4.17%
Brandon Werner/Semantic Web = 3.85%
Brandon Werner/Insurance = 9.20%

So, I guess I’m kind of hitting my mindshare targets! So what about you? Why not figure out your googleshare and Disqus below?

Service Data Objects Architecture: Business Objects with Smarts Presentation

Thursday, April 17th, 2008

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

You can download the slideshow here.

Java Will Never Be A Mobile Platform - Thoughts On Nokia, iPhone, and it’s Murderer

Sunday, April 13th, 2008

Rick Ross points to some amazingly bad conclusions from Nokia on their supposed “iPhone killer”, the Tube, and then throws his own in for good measure,

Speaking at a recent conference, Forum Nokia VP Tom Libretto, confirmed that the Finnish telecom giant has a new touchscreen device in the pipeline, tentatively called “Tube” (details and pics at symbian-freak.com.) In marked contrast to Apple, it appears that Nokia is embracing standards, and the Nokia Tube will likely have Java inside from the start!

Speaking about the Apple competition, Libretto had this to say about 6 million iPhones having been shipped so far, “We’ve done that [volume] since we’ve had dinner on Friday.”

The problem with all of this is that it shows Ross, Nokia and the mobile device industry in general still haven’t managed to figure out what makes Apple and it’s new mobile touch platform (which includes the iPhone, the iPod Touch and we can only assume more to come) so compelling. It’s difficult to watch an industry that is slow to learn from a game changing competitor, as I experienced in the insurance industry when Progressive came on the scene in 2000 with it’s direct to consumer model. For that topic though there is a massive amount of business literature, just Google “Moved” and “Cheese”

I will also let slide Nokia’s attempts to compare iPhone numbers to Nokia numbers in the article, without even mentioning that those numbers would have to be all shipping Nokia phones, not just smart phones and certainly not just the “Tube” prototype as it has yet to be released. I doubt we’ll see this new Java experience, if they even follow through with it, on the cheap plastic candy bar phones Nokia still sells in the millions.

For this entry, I’ll go Daring Fireball on Ross’ allegation summed up politely in his sentence: “In marked contrast to Apple, it appears that Nokia is embracing standards, and the Nokia Tube will likely have Java inside from the start!”

The False Java Test

First of all, why do people still believe Java on a mobile platform, or any consumer facing platform for that matter, is essential? Not only could you point to the numerous failures of the “write once, run anywhere” approach from both a technical and historical perspective, this litmus test - shouted whenever a distributed platform emerges on the market - is never a predictor of a device or business’s success to both the consumer and the developer. In fact, it sometimes hurts it. More on that later.

The False Standards Argument

Second, Ross indicates that Nokia will be a better platform versus iPhone because of it’s acceptance of the current Java standards. However, its obvious application code executing in a runtime through a mobile phone currently cannot scale visually. Can you imagine a MIPS or JME Jar that has 8-bit graphics designed for a non-touch postage stamp sized screen running on an iPhone? This is an important argument, because Ross indicates that with “standards” and “Java” we’ve won something here. We haven’t. All those “standards” compliant mobile apps currently out there wouldn’t even be worth the download to the Nokia “iPhone killer”.

Think on this: we’re talking about a new GUI layer that a developer would have to write against to even begin to approach the smoothness of the iPhone. This would be about as “new” as writing to the iPhone SDK for a developer. The benefits of these supposed standards would win us - ease of development and using existing developer knowledge of the code and runtime - wouldn’t work.

You could counter with JavaFX, which Sun took great pains in 2007 to show off with a UI that looks dead on the iPhone in it’s own attempt at an iPhone killer. Yet have you seen this running anywhere besides on that JavaOne stage in 2007? Sun released an API for JavaPhone, but it seems to have went nowhere - as embarrassingly indicated by the fact that the graphics on the Overview page don’t even load.

You may also counter with Google’s Android. However Android is JINO (Java In Name Only) and can’t claim the use of any standards from the mobile industry or Java. It has it’s own runtime and compiler. You could argue you did get the Java developer base here, but that just feeds in to my third argument below. I agree with Google/Danger’s approach though, since I believe that to program in the mobile space you need, to borrow a buzz word, a Domain Specific Language with constraints and customizations to the device that you are targeting.

The False Write Once Run Anywhere Argument - Now Context Is Key

Casio with Windows CE Start Menu ExampleThis carries me in to my third point that both Nokia and Ross miss - you do have to think differently when writing an application for a mobile platform. Carrying over Java developers from the desktop environment (or more likely and worse - the server environment) to the mobile platform with the same paradigms is a recepie for disaster, if just because they will demand and create the same experience on the mobile platform. (Think Windows CE circa 2000 when it had the Start button in the same place as the desktop operating system).

This is evidenced by the predictable cries from these very same developers when it came out that the iPhone SDK allowed for no access to background threads. The need for this paradigm change is real and was expressed very elegantly by Craig Hockenberry of IconFactory fame in his blog entry Brain Surgeons,

Twitterrific on the iPhone could definitely make use of a background process to gather new tweets. In fact, a prototype version of the software did just that. And it was a huge design failure: after doing XML queries every 5 minutes, the phone’s battery was almost dead after 4 hours. In fact, the first thing I said after giving Gruber this test version was “don’t use auto-refresh.”

The heart of the problem are the radios. Both the EDGE and Wi-Fi transceivers have significant power requirements. Whenever that hardware is on, your battery life is going to suck. My 5 minute refresh kept the hardware on and used up a lot of precious power…

…It takes several months of actual iPhone development before you eventually realize that the iPhone requires a completely different mindset. Until that happens, you’ll make assumptions based on desktop experience, and that in turn will lead to a lot of bad designs.

Java - even in runtime philosophy - looking out over the horizon - is poorly matched for the new environment in which we find ourselves. You can even extend this argument to any software not written to a platform’s context, as Gartner recently wrote in regards to Windows Vista and the problems of it’s monolithic OS approach. They want Windows kernels targeted to what is essentially contexts. Add to this argument the activity going on in the mobile/UMPC space and this argument against both Sun’s Java and Microsoft’s Windows/.Net approach becomes even more compelling. No one can play with a Sony UMPC running Windows Vista at your local Best Buy and come away arguing that it fits in to the user experience and context.

I don’t hold out a lot of hope of a iPhone-esque compelling experience from any developer writing in the JSE or JME space on this “iPhone killer”. I highly doubt anyone would look at low level API access of UIKit and Foundation that allows for incredible visual control over an application, and then look at what history would tell us Java would come up with, and not pick Apple’s approach.

The Rise Of Functional Programming: F#/Scala/Haskell and the failing of Lisp

Sunday, January 13th, 2008

Over at Lambda The Ultimate, the best academic programming blog on earth, there is a large debate going on regarding what the future of languages will be for 2008. The most important thing to emerge from the discussion is the larger role functional programming will play. It seems like a safe bet. This year has seen the explosion of interest and creation of functional languages such as Apple OS X’s Nu, Java’s JVM using Scala and Microsoft Research’s .Net language F#.

I am ecstatic at this change.

The Failure Of Lisp

It’s hard to understand where it came from. Certainly one can argue the broader academic community had nothing to do with it, the old guard Common Lisp hackers are still as fickle and as judgmental to new comers as ever. Also, the old standards in Lisp languages, Franz and LispWorks have not lowered their prices to anything approachable to the casual developer. There are open source ANSI Lisp implementations without all the supporting engines and functionality, such as SBCL. In fact, my most linked thing I’ve ever written in my career is the installation walk-through I did for installing SBCL and Allegro which includes adding your repository and packages for CLOS and automatically compiling the FASL files, especially dealing with the asdf differences between the implementations. The complexity of this in itself points to problems with portability and configuration in Lisp. However, even that project that targeted Lisp’s Bread and Butter, the parsing of semantic ontologies for the Semantic Web, was met in the message boards with worries on if there would be enough developer participation using such an odd language, and recommendations on moving it to Java.

In reality, Common Lisp showed its failure as a community by sitting out this enthusiasm that has been generated around functional programming languages. It didn’t have to be that way. I recall my first awarenesses of functional programming’s growth was the awesome work of Lemonodor’s blog and Sriram Krishnan posting “Lisp Is Sin“. I was happy at the time that Lisp was getting such attention, as well as functional language architectures in general. I imagined that as OO languages had grown so verbose and feature dense that even the IDEs to develop your applications run in to the tens of gigabytes, a new evolution “Back To The Future” was inevitable. Even more, I believe long suffering Lisp deserves to be back in favor again, it’s certainly spent its time in purgetory. Yet, it didn’t happen. You can blame the old 50 year old men sitting on IRC channels for that. It was the most thorny and un-inspiring community I’ve ever participated in, despite my extreme interest in the language. It’s jaw dropping that a language with such promise has sat out the resurgence, and speaks to what an un-friendly and un-inviting community can do a technology platform. I would be the first to march it off to the grave.

The Rise Of Functional Languages

The interest in functional programming actually grew up around more academic but pure languages like Scheme and Haskell. Although these languages sit within their own island and lack many of the “dirty” aspects of Lisp’s CLOS environment that make it easy to access OS and hardware resources, they are still strikingly useful in learning things that are the staple of functional languages, such as Closures and Lambdas. Indeed, one could argue that the movement to move Closures in to OO languages (first C#, now Java) was in part due to the rise of awareness of functional languages.

Further, it seems to me that functional programming languages answered two prayers of those more ambitious engineers who don’t seem to want to stick with the script and Java worlds they were taught in college. Those two large wins, far more important than the semantic features of functional languages that have gotten all the attention, are architecture foundations of functional languages:

  • Referential Transparency / Side Effects
  • Concurrency

Referential Transparency

To those coming from a pure OO world, Referential Transparency and the restriction of side-effects can be something hard to get their heads around. The best way I describe this concept is by hitting at the root of their assumptions: Everything they deal with are dead. The objects are dead, the variables are dead, the entire atmosphere is dead, as if something had come along and killed everything in your stack and you have to assemble your program by only what’s been given to you, nothing more. There are no instances, objects do not “come alive” and have state; a state that you have to poke in to and a state that can change at any time. A function will always do what you expect, and nothing can come along and change that behavior.

One of the things that seems to appeal to developers most about the promise of SOA architectures happening in enterprise environments, if you’re smart enough to pry it out of them, is that they get the same referential transparency in services. No one can override a service (besides versioning, which is explicit to the developer) and a service will only return what it did earlier in your code and earlier in the year. This forces developers to design services that have the same relationship to the world as functional programmers write their functions for. This is perhaps the trickiest part of migrating enterprise teams to a services based model, their expectations of the mutableness of the services they are accessing and their inability to anticipate what working in that world will be like. Especially for those who use tools or libraries to convert service interaction in to an object, the interaction can be jarring.

However, the soon find the predictability and the safety of such an environment liberating. In much the same way OO programmers were use to making their objects or variables immutable to maintain their contracts and relationships with other objects, often sacrificing many of the benefits that OO programming promised their stack, now they have immutability and transparency in an environment where functional paradigms are key, they do not expect to be able to “embrace and extend” services. They are what they are. This tends to cascade out to the living instantiated code a developer writes as well, as there is no point in entering the world of the living if what you have to return to is a dead function.

This was hinted at in an article in the ACM Queue magazine by Terry Coatta, entitled “From Here to There, The SOA Way“. He states,

Objects are still a very good way to model systems and they function reasonably efficiently in the local context. But they don’t distribute well, particularly if one tries to use them in a naive way. A service-oriented architecture solves this problem by dealing with the latency issues up front. It does this by looking at the patterns of data access in a system and designing the service-layer interfaces to aggregate data in such a way as to optimize bandwidth, usage, and latency.

Not that SOA limitations are the only thing that is affecting the consciousness of a software engineer, the other issue is the large rise in the complexity of managing a large enterprise library written in an OO language. One of the largest pain points of any application of large size is the management of graphs and graphs of live objects and the living data within them. When software engineers experience the lack of side-effects in functional languages, it’s a breath of fresh air.

Concurrency

A funny thing happened on the way to those multi-core processors. People loaded their applications on them and noticed nothing got much faster, particularly when it came to transaction intensive tasks. Turns out Intel and AMD left out an important fact about their Moore’s Law cheating multi-core environment: you can’t ring as much performance out of it without changing the way you manage concurrency and threads. Sequential programming could always rely on going faster as the single processor speed got faster, but as multicores come in to play that isn’t always the case. You want to farm off transactions to occur on separate processors, and in the living world of mutable objects and variables, breaking out two transactions to work concurrently that operate on the same living data is a bad idea. Add structural programming’s solution to this problem, optimistic and pessimistic locking, and you have dead-locks in short order.

Functional programming has been a natural place to explore parallel processing and new ways of doing atomic transactions because of the reasons above. More important, these atomic structures can be composable which is lost when doing locks in structural programming. A lot of the buzz has been generated around the idea of software transactional memory, where execution blocks can be flagged and managed and built upon. The best introduction to this topic is the paper by Tim Harris entitled Concurrent Programming Without Locks. Although this use to be expressed only in the confines of Concurrent Haskell, others have shown how the same techniques can be used in other functional languages, such as F# using nothing more than PowerList.

This experimentation is one of the large reasons why functional languages have become more important as software engineers wrestle with the problems and promise of multi-core processors in transaction processing. Although not every engineer will be interested in the deeper details of STM or other strategies in concurrent programming, the fact that these libraries will emerge and only be available in the functional realm will force software engineers to learn the core concepts and bring even more visibility to the functional programming space.

Functional Hybrids: Functional Programming Is Now Approachable

The other driver for adoption of functional programming languages, besides the architectural benefits it has to solve current problems, is the fact that languages such as F# and Scala have adopted a more hybrid model in their language design, where a developer isn’t forced completely outside her comfort zone. Scala is a combination of functional and deeper OO methodologies (as in SmallTalk) and has access to the entire Java library, significantly reducing the learning curve. The same can be said for F# and .Net and Nu and Objective-C. This does have draw-backs however, as both F# and Scala have not been able to use more of the STM strategies that Concurrent Haskell allows because the underlying thread architecture of the VMs they run against are built for structural programming languages. It is easy to see how this can be fixed, however, and allow those using hybrid functional languages the same power as those who express their ideas in Haskell or even Lisp.

As I said, I am excited about this new resurgence in functional programming languages, and I am enthusiastic 2008 will have even more to offer those who are just getting their toes wet. I personally know some college freshman who started out using Nu as their first language, and are already contributing to the community. The future of software engineering is bright.

Apache Is SOA Ready

Monday, September 10th, 2007

You have to be impressed with Apache’s strategy and execution. Although many enterprise companies have stumbled or confused customers with their numerous SOA offerings in the marketplace, Apache has, behind the scenes, been dutifully executing on the core standards (WS-BPEL 2.0, SDO specification 2.1, SCA specification 1.0) that make up a SOA and plugging them neatly in to an overall stack. All of this has been accomplished while also removing any hype or marketing and focusing squarely on the technology and it’s usefulness. It’s hard to imagine SOA would gain any great credibility beyond vendor brochures if it wasn’t for Apache’s volunteers showing that.. yes.. there is real technology underneath the rhetoric.

This stack can be viewed in Apache Tuscany, which brings the SDO and SCA specification to us in vendor neutral Java or C++ and Apache ODE (Orchestration Director Engine) which uses WS-BPEL to execute business processes in an SOA similar to IBM’s Websphere Process Server, which I’ve written about here and here. I’ve done work with the Tuscany, including implementing it and managing it’s deployment. Aside from some problems with Websphere (Tuscany uses the EMF libraries from IBM’s Eclipse and Websphere products, so some library version clashes occur in deployment) I can say that it works as advertised and can handle a decent load.

If you couple this with Eclipse’s community contribution on the visual side, with a graphical BPEL editor and the SOA Tools Project, and you have a stack that rivals millions of dollars in IBM license fees for Websphere Process Server and Websphere Integration Developer IDE. Of course, all of these tools don’t come with support and are sure to need some hacking. Regardless, the future is bright for any size company wanting to leverage the best enterprise technology in an open and free development environment.

PHP On Rails?

Friday, August 24th, 2007

One of the new technologies I’ve been playing around with has been Project Zero from IBM. Before everyone gets too excited this software is not free; at least not in any way normally understood by the development community. Although the source code is available, IBM takes some time to explain what it’s use of the term “Community Driven Development” means, but it boils down to this: “this is something we’re going to charge for and we’d just like you to help out for free, please.”

From the About Project Zero page:

We are still building commercial software here, as the licensing makes clear, but we are doing it in a more transparent fashion. This transparency provides a way for you to influence the project much earlier in its lifecycle. It also serves a role in our notions of radical simplicity. Every discussion, every technology decision, the full history of this technology will be accessible, searchable, preserved on this site. That means that finding answers to your questions will never be more than a search away. Development means that this community is about the technology and how it is developed and evolves. This is not a product community. It is not the place for the finished item, but rather the lab where it will grow.

You could be angry with that, and it certainly leaves a bad taste in my mouth. IBM has had a lot of success moving technologies to the open source community, and they have received nothing but benefit as a result. IBM would not have the mindshare of enterprise developers it does today if it had not allowed Eclipse to grow openly and organically.

It has also had success including “torjan horse” frameworks in to Eclipse such as UML2 and EMF, which only see their real fruition in their pricey enterprise products for Rational, Websphere and to a lesser extent Lotus. If you want to see the true power and future of Eclipse as an IDE, you should look at what IBM is doing with their paid products. However, IBM has always been good at folding yesterday’s Rational release back in to Eclipse. For instance, Rational Application Developer 6’s amazing WSDL and Webservice editor tools has found it’s way in to Eclipse Europa as a direct copy, and you can find some XML editor features moving over as well. This is a win/win scenario that promises to keep both developers and IBM happy for a long time to come.

Yet it seems here that IBM has tried to redefine “community driven development” as giving software to developers to create buzz and get free development work from excited community members, and then shove them off like a birthing husk and sell it to large consumers for more than your yearly salary. If I found myself in a community like this, I’d move.

If you choose not to move any farther in to IBM’s work here, it would be understandable, but you’d be missing out on some incredible work. Why not just take a peak?

Project Zero: PHP with REST/AJAX

Project Zero is essentially a Groovy/PHP/Restlet/AJAX framework that tries the “batteries included” approach that Ruby on Rails has with it’s application framework. To be sure, there is a lot that a Ruby or SpringMVC developer will be familiar with when playing with Project Zero, especially it’s Restlet/MVC patterns and it’s scaffolding separated in to an /app, /config and /public directories. You can use Project Zero from the command line, but the real productivity gains are found in using the Eclipse plug-ins. You can program in either Java or PHP as nicely illustrated in the downloads page.

Comparison Between Ruby and Project Zero File Layouts

There are a lot of Restlet/MVC frameworks for Java, which is why I choose to focus on the inclusion of PHP. Simply put, it’s the fastest way for a PHP developer who feels the AJAX/Restlet/Ruby world has pulled away from them to get back in the game building quick and powerful AJAX/Restlet applications. This is all accomplished by the use of the Eclipse PDT project, currently in incubation, which allows for PHP development inside the Eclipse IDE. Project Zero uses this framework to give users right-mouse click control over their entire application.

It also uses Dojo as the JavaScript framework for AJAX. Why this was chosen over others I have no idea other than it’s robust support of file IO and other nice things DHTML doesn’t give you. Dojo has a pretty good following and shouldn’t cause any restrictions on what you can do within your application, although the version included with Project Zero is 0.4.3, which means you won’t get any of the goodness Dojo 0.9 gives you as demonstrated by the cool Mac OSX inspired Fish Eye demo.

Dependency Selection in Project Zero

There are many other libraries that you can use with Project Zero to build a robust application, assuming you’re using Apache Derby, that is. Sadly, although the software libraries in the repository are still in development, there isn’t a lot of juice in the batteries. Further, you can only deploy your PHP / AJAX applications if you have Project Zero installed on your remote environment and running. This means that those host providers who provide simple PHP and php.ini support will not run Project Zero applications you develop.

PHP being RESTful easily

Just like with Ruby on Rails, Project Zero maps RESTful resources to controllers based on names to speed development and reduce resource mapping. Therefore, the RESTful URL: /resources/people calls a people.php file in /app/resources/ that contain the following methods:


People:onList()
People::onCreate()
People::onPutCollection()
People::onDeleteCollection()

and individual resources can be specified as /resources/people/001 (e.g. id= 001 of a person in a people database) which have these methods in the same people.php file:


People::onRetrieve()
People::onUpdate()
People::onPostMember()
People::onDelete()

This makes programming in a RESTful way in PHP as easy as it is in Ruby, but without the language barrier.

JSON in PHP

Much in the same way that the SpringMVC framework uses HttpServlet to pass information back and forth to the calling application, Project Zero prefers to use JSON. This approach is nice since JSON has got a lot of traction lately from AJAX users who don’t want to wait for large XML documents to be sent across the wire. Project Zero exposes an API for PHP that maps roughly to the same HTTPServletRequest/HTTPServletResponse that SpringMVC and Struts uses, called json_encode and json_decode.

Getting data from the POST through JSON is easy enough:

$employee = json_decode($HTTP_RAW_POST_DATA);
$sql = "SELECT * FROM employees WHERE employeeid = ".$employee['id'];

and writing data in JSON is easy as well:

indexedArray = array("a", "b", "c");
$result = json_encode($indexedArray);
echo "Indexed array = " . $result . "
“;

Yet, this method of writing data leaves a lot to be desired. It would be nice if we could forward JSON data and then redirect to a RESTful view, the same way many other frameworks (Ruby on Rails, SpringMVC) does. It turns out you can do this using what Project Zero calls Response Rendering, which is a pattern of APIs to do just that.

In PHP this would be:

$customer = array('name' => 'John Smith');
put('/request/view', 'JSON');
put('/request/json/output', $customer);
render_view();

Wrap-Up

When you think about it, it’s pretty amazing that PHP has been given this much functionality from a product out of IBM’s research. The question on my mind is if PHP even deserves it. Although many websites leverage PHP on the front end, it has always had a mixed history when it comes to separating presentation from the code itself, and is usually re-factored away as an application grows. Further, with Sun and others focusing so heavily on migrating Ruby to the JVM, it makes you wonder why IBM would choose to pick PHP as another language to do web 2.0 in. Remember, IBM made it clear from the above talk about “community development” that they plan on making this a commercial product, so it seems they have settled on Groovy and PHP to be their torch carriers with Project Zero. For PHP developers, they have been given a very powerful toolset. Shame they can’t use it until IBM decides to sell it to them.

My advice? Although large applications such as Wordpress and Facebook use it, for new development PHP’s time has passed. You are better off moving to a Ruby or Java web framework.

My IBM Rational 2007 Presentation on Websphere Process Server

Thursday, July 12th, 2007

Below is an interactive slideshow of my talk given at the IBM Rational conference in 2007 entitled “From Legacy to Service-Oriented Architecture: The Strategic Importance of Services in the Insurance Industry”.

Synopsis: The insurance industry is one of the most complicated industries to manage from an IT perspective. Its complex and highly regulated business rules, as well as its early adoption of mainframes in the 80s and 90s, has led to a significant hurdle in moving existing infrastructures to a Service-Oriented Architecture (SOA). This session shows how the use of IBM Rational tools and IBM(R) Websphere(R) Process Server can free the industry from the complexities of implementing state specific compliance and business workflows through modeling and mediation flows.

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

Click the image to view this presentation interactively.

You may get the full version in Quicktime Movie format or Microsoft Powerpoint format.