Brandon Werner

Archive for the ‘Software Languages’ Category

The Rise Of Functional Programming: F#/Scala/Haskell and the failing of Lisp

Sunday, January 13th, 2008

Over at Lambda The Ultimate, the best academic programming blog on earth, there is a large debate going on regarding what the future of languages will be for 2008. The most important thing to emerge from the discussion is the larger role functional programming will play. It seems like a safe bet. This year has seen the explosion of interest and creation of functional languages such as Apple OS X’s Nu, Java’s JVM using Scala and Microsoft Research’s .Net language F#.

I am ecstatic at this change.

The Failure Of Lisp

It’s hard to understand where it came from. Certainly one can argue the broader academic community had nothing to do with it, the old guard Common Lisp hackers are still as fickle and as judgmental to new comers as ever. Also, the old standards in Lisp languages, Franz and LispWorks have not lowered their prices to anything approachable to the casual developer. There are open source ANSI Lisp implementations without all the supporting engines and functionality, such as SBCL. In fact, my most linked thing I’ve ever written in my career is the installation walk-through I did for installing SBCL and Allegro which includes adding your repository and packages for CLOS and automatically compiling the FASL files, especially dealing with the asdf differences between the implementations. The complexity of this in itself points to problems with portability and configuration in Lisp. However, even that project that targeted Lisp’s Bread and Butter, the parsing of semantic ontologies for the Semantic Web, was met in the message boards with worries on if there would be enough developer participation using such an odd language, and recommendations on moving it to Java.

In reality, Common Lisp showed its failure as a community by sitting out this enthusiasm that has been generated around functional programming languages. It didn’t have to be that way. I recall my first awarenesses of functional programming’s growth was the awesome work of Lemonodor’s blog and Sriram Krishnan posting “Lisp Is Sin“. I was happy at the time that Lisp was getting such attention, as well as functional language architectures in general. I imagined that as OO languages had grown so verbose and feature dense that even the IDEs to develop your applications run in to the tens of gigabytes, a new evolution “Back To The Future” was inevitable. Even more, I believe long suffering Lisp deserves to be back in favor again, it’s certainly spent its time in purgetory. Yet, it didn’t happen. You can blame the old 50 year old men sitting on IRC channels for that. It was the most thorny and un-inspiring community I’ve ever participated in, despite my extreme interest in the language. It’s jaw dropping that a language with such promise has sat out the resurgence, and speaks to what an un-friendly and un-inviting community can do a technology platform. I would be the first to march it off to the grave.

The Rise Of Functional Languages

The interest in functional programming actually grew up around more academic but pure languages like Scheme and Haskell. Although these languages sit within their own island and lack many of the “dirty” aspects of Lisp’s CLOS environment that make it easy to access OS and hardware resources, they are still strikingly useful in learning things that are the staple of functional languages, such as Closures and Lambdas. Indeed, one could argue that the movement to move Closures in to OO languages (first C#, now Java) was in part due to the rise of awareness of functional languages.

Further, it seems to me that functional programming languages answered two prayers of those more ambitious engineers who don’t seem to want to stick with the script and Java worlds they were taught in college. Those two large wins, far more important than the semantic features of functional languages that have gotten all the attention, are architecture foundations of functional languages:

  • Referential Transparency / Side Effects
  • Concurrency

Referential Transparency

To those coming from a pure OO world, Referential Transparency and the restriction of side-effects can be something hard to get their heads around. The best way I describe this concept is by hitting at the root of their assumptions: Everything they deal with are dead. The objects are dead, the variables are dead, the entire atmosphere is dead, as if something had come along and killed everything in your stack and you have to assemble your program by only what’s been given to you, nothing more. There are no instances, objects do not “come alive” and have state; a state that you have to poke in to and a state that can change at any time. A function will always do what you expect, and nothing can come along and change that behavior.

One of the things that seems to appeal to developers most about the promise of SOA architectures happening in enterprise environments, if you’re smart enough to pry it out of them, is that they get the same referential transparency in services. No one can override a service (besides versioning, which is explicit to the developer) and a service will only return what it did earlier in your code and earlier in the year. This forces developers to design services that have the same relationship to the world as functional programmers write their functions for. This is perhaps the trickiest part of migrating enterprise teams to a services based model, their expectations of the mutableness of the services they are accessing and their inability to anticipate what working in that world will be like. Especially for those who use tools or libraries to convert service interaction in to an object, the interaction can be jarring.

However, the soon find the predictability and the safety of such an environment liberating. In much the same way OO programmers were use to making their objects or variables immutable to maintain their contracts and relationships with other objects, often sacrificing many of the benefits that OO programming promised their stack, now they have immutability and transparency in an environment where functional paradigms are key, they do not expect to be able to “embrace and extend” services. They are what they are. This tends to cascade out to the living instantiated code a developer writes as well, as there is no point in entering the world of the living if what you have to return to is a dead function.

This was hinted at in an article in the ACM Queue magazine by Terry Coatta, entitled “From Here to There, The SOA Way“. He states,

Objects are still a very good way to model systems and they function reasonably efficiently in the local context. But they don’t distribute well, particularly if one tries to use them in a naive way. A service-oriented architecture solves this problem by dealing with the latency issues up front. It does this by looking at the patterns of data access in a system and designing the service-layer interfaces to aggregate data in such a way as to optimize bandwidth, usage, and latency.

Not that SOA limitations are the only thing that is affecting the consciousness of a software engineer, the other issue is the large rise in the complexity of managing a large enterprise library written in an OO language. One of the largest pain points of any application of large size is the management of graphs and graphs of live objects and the living data within them. When software engineers experience the lack of side-effects in functional languages, it’s a breath of fresh air.

Concurrency

A funny thing happened on the way to those multi-core processors. People loaded their applications on them and noticed nothing got much faster, particularly when it came to transaction intensive tasks. Turns out Intel and AMD left out an important fact about their Moore’s Law cheating multi-core environment: you can’t ring as much performance out of it without changing the way you manage concurrency and threads. Sequential programming could always rely on going faster as the single processor speed got faster, but as multicores come in to play that isn’t always the case. You want to farm off transactions to occur on separate processors, and in the living world of mutable objects and variables, breaking out two transactions to work concurrently that operate on the same living data is a bad idea. Add structural programming’s solution to this problem, optimistic and pessimistic locking, and you have dead-locks in short order.

Functional programming has been a natural place to explore parallel processing and new ways of doing atomic transactions because of the reasons above. More important, these atomic structures can be composable which is lost when doing locks in structural programming. A lot of the buzz has been generated around the idea of software transactional memory, where execution blocks can be flagged and managed and built upon. The best introduction to this topic is the paper by Tim Harris entitled Concurrent Programming Without Locks. Although this use to be expressed only in the confines of Concurrent Haskell, others have shown how the same techniques can be used in other functional languages, such as F# using nothing more than PowerList.

This experimentation is one of the large reasons why functional languages have become more important as software engineers wrestle with the problems and promise of multi-core processors in transaction processing. Although not every engineer will be interested in the deeper details of STM or other strategies in concurrent programming, the fact that these libraries will emerge and only be available in the functional realm will force software engineers to learn the core concepts and bring even more visibility to the functional programming space.

Functional Hybrids: Functional Programming Is Now Approachable

The other driver for adoption of functional programming languages, besides the architectural benefits it has to solve current problems, is the fact that languages such as F# and Scala have adopted a more hybrid model in their language design, where a developer isn’t forced completely outside her comfort zone. Scala is a combination of functional and deeper OO methodologies (as in SmallTalk) and has access to the entire Java library, significantly reducing the learning curve. The same can be said for F# and .Net and Nu and Objective-C. This does have draw-backs however, as both F# and Scala have not been able to use more of the STM strategies that Concurrent Haskell allows because the underlying thread architecture of the VMs they run against are built for structural programming languages. It is easy to see how this can be fixed, however, and allow those using hybrid functional languages the same power as those who express their ideas in Haskell or even Lisp.

As I said, I am excited about this new resurgence in functional programming languages, and I am enthusiastic 2008 will have even more to offer those who are just getting their toes wet. I personally know some college freshman who started out using Nu as their first language, and are already contributing to the community. The future of software engineering is bright.

Call To Arms: Our Elders In Computer Science Are Leaving Us

Tuesday, January 8th, 2008

I seem to be visiting a lot of my prior entries from 2005, but a while back I wrote about an experience I had meeting an old IBM programmer at the local Catholic store. She was female, which seemed like a trail-blazing thing to have been working in the heavily male dominated technology industry in the 1950s. Her story, and her eagerness to share with me, led me to write my experiences with gurus in my life, entitled In The Presence of A Guru (On Catholicism, VAX/VMS and Geek Culture). I still feel profoundly stupid for not having spent more time with her, as it was obvious she wanted to share her story. I will regret it for the rest of my life, not just because of what the exchange could have done for my understanding, but more refusing my obligation, the obligation we all have, to listen to and carry on the mythology and history of our elders. I am afraid that mythology may be vanishing, and none of us are listening.

I thought about this again when I stumbled upon a new blog from Dan Weinred as linked from Lemonodor. His career in Computer Science and his contributions certainly qualifies him as a Guru.

His essays so far are a treasure trove of information and computer science archeology. One of my favorites is called The Technology and Business of ObjectStore, where he recounts taking his ObjectStore technology to Microsoft and Bill Gates as well as Steve Jobs. The reaction of both icons demonstrates humorously their approach even now:

When the new Windows technology (which was OS/2 at the time; IBM and Microsoft were still working together on it) came out, it was crucial for us that it be able to support memory mapping. Dave Stryker and Tom Atwood, flew out to meet with Bill Gates in September of 1989. Dave Stryker recalls: “We originally had a 45-minute appointment, but Gates extended the meeting to a couple of hours, and called in Dave Cutler [the architect of OS/2]. At Tom’s urging, we told Gates and Cutler everything they wanted to know about ObjectStore. Gates was complimentary of the Object Design approach, but said, in a nice enough way, that if the Microsoft Empire ever needed such a thing, they would build it themselves. Still, Gates told Cutler to make sure that the OS/2 equivalent to mmap was powerful enough to run ObjectStore, and there were some changes made to make it so.” Later, this OS/2 technology turned into Windows NT. Dave Moon adds that it turned to have a bug: it doesn’t free up disk space when it ought to. For some reason Microsoft hasn’t fixed this, even after many years. We found a way around it.)

Speaking of industry luminaries, we also met with Steve Jobs when he was at NeXT, and Jobs made a big announcement praising our technology, which resulted in a nice press release. There was some discussion that NeXT might buy Object Design, but that never went anywhere.

This type of history is fascinating and important. There have been efforts to capture this knowledge, such as Fokelore.org and ACM’s excellent ACM History Committee trying to archive CS knowledge, but their efforts so far is approaching it the wrong way. It’s getting the stories, while these people are still alive and can tell them, that is important. Much of it will not be academic or well checked, but that is missing the point. It’s the mythology as well as the history of these pioneers in the 1930s - 1970s that we should be after. We’ve already lost Jeff Raskin, Adam Osborne and others.

If we do not do this, their legacy and our history as a profession, one that has changed the world as much as any other in human history, will be lost forever.

Facebook, Scoble and Web 2.0: It’s Not The Data, It’s The Work You’ve Put In To It

Sunday, January 6th, 2008

Way back in 2005 I wrote at length about the danger we all face putting data online. Back then, I was running The Planning Studio Inc., and we had experienced a lot of expansion and were outgrowing Salesforce.com as a Sales/Contact manager. When we became aware that extracting not just our data but also the relationships between those data was impossible, I wrote a word of warning to the blogosphere about data. I would have assumed three years later there would have been a common and demanded way to export your data. Sadly, that is not the case. Scoble became a victim of this with Facebook, and what shocked him was what occurred to me back in 2005: In Web 2.0, you don’t own your data.

It’s Not Just About Your Data, It’s About The Relationships Between Them

Although a few bring up the fact there are some ways to export your data from these various services, that missed the point badly. What Scoble and the rest lost, although they may not have been able to articulate it, was the work they put in to their data. When we import our data in to a social network application, we don’t expect to leave it in a static state. We usually import our address book and calendar, among other things, so that we can then begin creating relationships and making our data more valuable.

Think of the tagging feature in Flickr and Facebook. How often have you sat and painstakingly tagged your friends or photos for maximum social use? You may have even organized your photos in to sets that would be more viewable and easy managed, spending hours uploading and tagging. What about the APIs that allow you to visualize, browse and draw conclusions from this metadata that exists from these relationships you’ve made? You may have seen relationships you had never expected. By using social applications on the web, you have changed and made your data more useful, as simply as if you imported a CSV file in to Excel and made graphs of that data.

Does that mean that Microsoft can claim that data and the charts you’ve generated are their property? Can they take those graphs and spreadsheets away at a whim?

The same problem applies to Google and Yahoo!, among others. While many wonder if these Web 2.0 application providers are looking at our confidential data, the real worry is if they decide you can’t anymore. Although Google says publicly they have no process to look at our data we store on their servers, they most certainly have a process to remove you from it if you violate their Terms of Service, something that is at best arbitrary and a process in which you have no legal right to appeal.

The Worse Part Is, Your Hard Work Benefits The Social Network Too

The most insulting part of Facebook and others wielding that sort of power over our meta-data is that it’s through our hardwork their service is useful. Although you could call our work managing and forming relationships between our data a non-zero sum game as we also reap the benefits of the connections, our hard work makes their social websites a better and more fun place to be. If no one took the time to tag their friends in their photos, or enter tags on blog postings and pictures how interesting would these places be? It seems to me that claiming complete control over that data is a slap in the face no user of social networks should tolerate. The same holds true for other Web 2.0 apps that manage our data, including Google and Yahoo!. Deli.cio.us would Just Another Bookmark site if it wasn’t for the hard work it’s users put in to tagging and managing their bookmark data.

Although social networks can be considered non zero-sum, other Web 2.0 companies are decidedly zero-sum; we do all the work. Although there is no social component to these services, it is usually this data and the relationships between them that are more important. Many small businesses use Salesforce.com and have their business contacts linked to companies they bill for services which are in turn linked to billing and payment systems. If they don’t pay their bills for a month, or worse if they wanted to take their business elsewhere, that painstaking work linking these data pieces together, notes and all, would be lost. The same is true with financial analysis of your bank account on Mint or Notebooks on Google.

How To Fix It

Being in software architecture for large enterprise systems, the solution to this seems simple and easy. We need a way to export this data in XML or some other format that, even if it doesn’t contain the content itself (pictures and files would not be practical) the relationships between this data could be exported. It need not be wrapped up in some long drawn-out collaborate standards body in which Flickr, Facebook, Salesforce.com, MySpace and IBM (they join every standards body) would sit down and spend five years coming out with a standard.

My proposal? Facebook just does it.

In the Web 2.0 world, the prime mover is usually the standard maker for the rest of the industry. When Facebook provides it’s users with some export tool, no matter how complicated or mangled the XML would turn out to be, immediately developers would come up with tools to parse and import this data in to other applications and systems. Some would even take this data and provide analytics across various networks, something that other websites are attempting to do but at a substantial risk to your privacy as you must provide them with your login credentials to every site you belong to. Facebook may be bearable to give your login information to, but your Amazon.com profile with your One-Click Buy enabled is quite a different matter.

Other Web 2.0 environments, either through customer demand or marketing reasons, would be quick to follow. Soon, some standard schema would emerge that would be predicable or at least validated online by the services themselves. XML transformation is standard fare in the enterprise world, and there are many C# and Java developers who know how to transform some XML to another XML standard with their eyes closed. It could open up more avenues for migration and populating social networks, as well as doing personal and business analysis of your data.

Whatever the format, as long as the data inside the file is described and well formed (what XML was designed to do, describe the data it contains so any software can make sense of it) we should be able to make quick work of migrating that data to other services or applications.

Why It’s Important

Although we often don’t see it, Scoble, you and I are still “in the bubble” when it comes to technology and social media trends. We often think that because something is important or obvious to us, the rest of the world should be up in arms about it as well, when in reality they are just sitting in the living room watching TV. There are many users of Salesforce.com, Facebook and Google Notebooks that don’t think about this problem, or worse assume that there would have to be a way because it’s just logical there would be. Why would they take your friends and your photos away without giving you any recourse? Why wouldn’t you be able to export your entire client list and company list and maintain those relationships? Even just drag them in to Quickbooks Professional Edition? Wouldn’t there be some law making sure they can do that?

The answer is no. Hopefully, those who have larger voices than I did in 2005 will take this issue to a more public sphere. The danger is that this problem fades back in to the blogosphere, where Scoble gets his traffic and his hits and moves on. I don’t want to write this article again in another three years.

We should do something now, maybe using those same social networks to organize.

How To Install Lisp With Threads on OS X Leopard

Saturday, December 22nd, 2007

Cosmin Stejerean does an awesome job laying out how to get experimental x86 OSX threads support compiled and enabled for SBCL on Apple’s latest operating system. I can’t wait to try this out with my cl-semantic project I’ve been working on off/on for some time. If you’d like to help out, I wrote detailed instructions on how to get up and running if you are using SBCL.

Apache Is SOA Ready

Monday, September 10th, 2007

You have to be impressed with Apache’s strategy and execution. Although many enterprise companies have stumbled or confused customers with their numerous SOA offerings in the marketplace, Apache has, behind the scenes, been dutifully executing on the core standards (WS-BPEL 2.0, SDO specification 2.1, SCA specification 1.0) that make up a SOA and plugging them neatly in to an overall stack. All of this has been accomplished while also removing any hype or marketing and focusing squarely on the technology and it’s usefulness. It’s hard to imagine SOA would gain any great credibility beyond vendor brochures if it wasn’t for Apache’s volunteers showing that.. yes.. there is real technology underneath the rhetoric.

This stack can be viewed in Apache Tuscany, which brings the SDO and SCA specification to us in vendor neutral Java or C++ and Apache ODE (Orchestration Director Engine) which uses WS-BPEL to execute business processes in an SOA similar to IBM’s Websphere Process Server, which I’ve written about here and here. I’ve done work with the Tuscany, including implementing it and managing it’s deployment. Aside from some problems with Websphere (Tuscany uses the EMF libraries from IBM’s Eclipse and Websphere products, so some library version clashes occur in deployment) I can say that it works as advertised and can handle a decent load.

If you couple this with Eclipse’s community contribution on the visual side, with a graphical BPEL editor and the SOA Tools Project, and you have a stack that rivals millions of dollars in IBM license fees for Websphere Process Server and Websphere Integration Developer IDE. Of course, all of these tools don’t come with support and are sure to need some hacking. Regardless, the future is bright for any size company wanting to leverage the best enterprise technology in an open and free development environment.

PHP On Rails?

Friday, August 24th, 2007

One of the new technologies I’ve been playing around with has been Project Zero from IBM. Before everyone gets too excited this software is not free; at least not in any way normally understood by the development community. Although the source code is available, IBM takes some time to explain what it’s use of the term “Community Driven Development” means, but it boils down to this: “this is something we’re going to charge for and we’d just like you to help out for free, please.”

From the About Project Zero page:

We are still building commercial software here, as the licensing makes clear, but we are doing it in a more transparent fashion. This transparency provides a way for you to influence the project much earlier in its lifecycle. It also serves a role in our notions of radical simplicity. Every discussion, every technology decision, the full history of this technology will be accessible, searchable, preserved on this site. That means that finding answers to your questions will never be more than a search away. Development means that this community is about the technology and how it is developed and evolves. This is not a product community. It is not the place for the finished item, but rather the lab where it will grow.

You could be angry with that, and it certainly leaves a bad taste in my mouth. IBM has had a lot of success moving technologies to the open source community, and they have received nothing but benefit as a result. IBM would not have the mindshare of enterprise developers it does today if it had not allowed Eclipse to grow openly and organically.

It has also had success including “torjan horse” frameworks in to Eclipse such as UML2 and EMF, which only see their real fruition in their pricey enterprise products for Rational, Websphere and to a lesser extent Lotus. If you want to see the true power and future of Eclipse as an IDE, you should look at what IBM is doing with their paid products. However, IBM has always been good at folding yesterday’s Rational release back in to Eclipse. For instance, Rational Application Developer 6’s amazing WSDL and Webservice editor tools has found it’s way in to Eclipse Europa as a direct copy, and you can find some XML editor features moving over as well. This is a win/win scenario that promises to keep both developers and IBM happy for a long time to come.

Yet it seems here that IBM has tried to redefine “community driven development” as giving software to developers to create buzz and get free development work from excited community members, and then shove them off like a birthing husk and sell it to large consumers for more than your yearly salary. If I found myself in a community like this, I’d move.

If you choose not to move any farther in to IBM’s work here, it would be understandable, but you’d be missing out on some incredible work. Why not just take a peak?

Project Zero: PHP with REST/AJAX

Project Zero is essentially a Groovy/PHP/Restlet/AJAX framework that tries the “batteries included” approach that Ruby on Rails has with it’s application framework. To be sure, there is a lot that a Ruby or SpringMVC developer will be familiar with when playing with Project Zero, especially it’s Restlet/MVC patterns and it’s scaffolding separated in to an /app, /config and /public directories. You can use Project Zero from the command line, but the real productivity gains are found in using the Eclipse plug-ins. You can program in either Java or PHP as nicely illustrated in the downloads page.

Comparison Between Ruby and Project Zero File Layouts

There are a lot of Restlet/MVC frameworks for Java, which is why I choose to focus on the inclusion of PHP. Simply put, it’s the fastest way for a PHP developer who feels the AJAX/Restlet/Ruby world has pulled away from them to get back in the game building quick and powerful AJAX/Restlet applications. This is all accomplished by the use of the Eclipse PDT project, currently in incubation, which allows for PHP development inside the Eclipse IDE. Project Zero uses this framework to give users right-mouse click control over their entire application.

It also uses Dojo as the JavaScript framework for AJAX. Why this was chosen over others I have no idea other than it’s robust support of file IO and other nice things DHTML doesn’t give you. Dojo has a pretty good following and shouldn’t cause any restrictions on what you can do within your application, although the version included with Project Zero is 0.4.3, which means you won’t get any of the goodness Dojo 0.9 gives you as demonstrated by the cool Mac OSX inspired Fish Eye demo.

Dependency Selection in Project Zero

There are many other libraries that you can use with Project Zero to build a robust application, assuming you’re using Apache Derby, that is. Sadly, although the software libraries in the repository are still in development, there isn’t a lot of juice in the batteries. Further, you can only deploy your PHP / AJAX applications if you have Project Zero installed on your remote environment and running. This means that those host providers who provide simple PHP and php.ini support will not run Project Zero applications you develop.

PHP being RESTful easily

Just like with Ruby on Rails, Project Zero maps RESTful resources to controllers based on names to speed development and reduce resource mapping. Therefore, the RESTful URL: /resources/people calls a people.php file in /app/resources/ that contain the following methods:


People:onList()
People::onCreate()
People::onPutCollection()
People::onDeleteCollection()

and individual resources can be specified as /resources/people/001 (e.g. id= 001 of a person in a people database) which have these methods in the same people.php file:


People::onRetrieve()
People::onUpdate()
People::onPostMember()
People::onDelete()

This makes programming in a RESTful way in PHP as easy as it is in Ruby, but without the language barrier.

JSON in PHP

Much in the same way that the SpringMVC framework uses HttpServlet to pass information back and forth to the calling application, Project Zero prefers to use JSON. This approach is nice since JSON has got a lot of traction lately from AJAX users who don’t want to wait for large XML documents to be sent across the wire. Project Zero exposes an API for PHP that maps roughly to the same HTTPServletRequest/HTTPServletResponse that SpringMVC and Struts uses, called json_encode and json_decode.

Getting data from the POST through JSON is easy enough:

$employee = json_decode($HTTP_RAW_POST_DATA);
$sql = "SELECT * FROM employees WHERE employeeid = ".$employee['id'];

and writing data in JSON is easy as well:

indexedArray = array("a", "b", "c");
$result = json_encode($indexedArray);
echo "Indexed array = " . $result . "
“;

Yet, this method of writing data leaves a lot to be desired. It would be nice if we could forward JSON data and then redirect to a RESTful view, the same way many other frameworks (Ruby on Rails, SpringMVC) does. It turns out you can do this using what Project Zero calls Response Rendering, which is a pattern of APIs to do just that.

In PHP this would be:

$customer = array('name' => 'John Smith');
put('/request/view', 'JSON');
put('/request/json/output', $customer);
render_view();

Wrap-Up

When you think about it, it’s pretty amazing that PHP has been given this much functionality from a product out of IBM’s research. The question on my mind is if PHP even deserves it. Although many websites leverage PHP on the front end, it has always had a mixed history when it comes to separating presentation from the code itself, and is usually re-factored away as an application grows. Further, with Sun and others focusing so heavily on migrating Ruby to the JVM, it makes you wonder why IBM would choose to pick PHP as another language to do web 2.0 in. Remember, IBM made it clear from the above talk about “community development” that they plan on making this a commercial product, so it seems they have settled on Groovy and PHP to be their torch carriers with Project Zero. For PHP developers, they have been given a very powerful toolset. Shame they can’t use it until IBM decides to sell it to them.

My advice? Although large applications such as Wordpress and Facebook use it, for new development PHP’s time has passed. You are better off moving to a Ruby or Java web framework.

My IBM Rational 2007 Presentation on Websphere Process Server

Thursday, July 12th, 2007

Below is an interactive slideshow of my talk given at the IBM Rational conference in 2007 entitled “From Legacy to Service-Oriented Architecture: The Strategic Importance of Services in the Insurance Industry”.

Synopsis: The insurance industry is one of the most complicated industries to manage from an IT perspective. Its complex and highly regulated business rules, as well as its early adoption of mainframes in the 80s and 90s, has led to a significant hurdle in moving existing infrastructures to a Service-Oriented Architecture (SOA). This session shows how the use of IBM Rational tools and IBM(R) Websphere(R) Process Server can free the industry from the complexities of implementing state specific compliance and business workflows through modeling and mediation flows.

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

Click the image to view this presentation interactively.

You may get the full version in Quicktime Movie format or Microsoft Powerpoint format.

Microsoft Keynote: How To Design Frustrating Products

Tuesday, June 12th, 2007

Better Software Conference in ACM Magazine AdvertisementI really want to put the Science back in to Computer Science, and am an avid member of the ACM and IEEE, although the lines between those are starting to blur. I’ve found that their magazine and website, ACMQueue, is an amazing attempt at putting academics back in to the marketing and Forrester “research” laden stuff that we are forced to contend with in this industry (their latest cover article, API Design Matters, is a good read but not yet up on the site).

A fresh new issue I got to read today, however, made me do a spit-take:

There is a “Better Software Conference and Expo” that is going to be occurring between June 18 - 21st 2007 at Las Vegas. Microsoft, a keynote speaker at the event, apparently hoping that what happens in Vegas stays in Vegas, will be presenting this topic to attendees and wondering why they can’t stop giggling in the audience (click on image to the left and look to the bottom keynotes)

IBM, SDOs, WPS and SOA Hell

Sunday, June 10th, 2007

I get a lot of emails about my critique of IBM’s Websphere Process Server, mostly along the lines of this email:

I’ve sent this email to you a couple of months back but didn’t get any reply. just thought would try my luck again…

we’ve made some progress with WPS and identified what areas are less risky so we can use at least some of the product features. we’re now building a prototype that uses mostly human tasks and some processes wrapped around them and we expose this all via webservices. there’s some pain around versioning, BPEDB queries, security, etc but at least we’re making a progress and early indicators seem to be on a positive side. we have had a couple of IBMers on site for a week and they helped us somewhat to rejig the design and focus on the features that seem to work and would add most value

still would be extremely interested to find out if you have done more with the product since you blogged about it

Well, obviously for competitive and disclosure reasons I can’t spill the beans on where my current employer is, what they are doing or why they are doing it. With those hand-cuffs in full public display, I will effort a more general entry about the state of affairs of this odd stack IBM has built.

In order to understand IBM’s approach to Websphere Process Server, you have to understand SCA/SDO, the unholy combination that seems to be driving the SOA mythology as of late.

About SDOs

There are currently three different implementations of the SDO spec, including Apache’s Tuscany (the one I’m cheering for), IBM’s from their EMF project for Eclipse and the interesting EclipseLink from Oracle and Interface21 (when those two get in a room together you should listen). EclipseLink is more of a consequence of combining SDOs and JPA, however.

I’m extremely excited about SDOs in general, to the point that sometimes I sit on my deck and dream about them. I even write presentations about their architectural implications. SDOs are awesome things.

They are, in essence, disconnected DataGraphs of DataObjects from some source. This source could be relational databases, entity EJB components, JCA, XML pages, Web services, or a combination of them all.

They can be acted upon and changed, updated and deleted, transformed, even serialized and sent to other objects and then connected back to the original data source with a ChangeSummary() to boot. For a more detailed understanding, check out IBM’s Introduction To Service Data Objects.

This is what they do in a nutshell:

1) You have data in any form from anything: Databases, XML, text, anything (or a combination of them).
2) You feed that data in to a Service Data Object.
3) Things can get pretty loose in there, so you can impose an XSD to define the types or just create the model on your own as you feed data in.
4) You disconnect the SDO from it’s source.

SIDEBAR: Last year I wrote a design pattern for SDOs in Eclipse’s UML2 framework that you can import in to Rational Software Architect.

Now you can do anything with this thing. Change the data inside, add more things to it, remove things from it, serialize it and send it across the wire, query the data inside of it, introspect it, combine it with JPA and tell it to save itself away from it’s source, anything.

Imagine the implications.

Design Strategy: Using SDOs with POJOs

No longer do objects have to have a strong contract, or any contract at all besides taking a type of DataObject. Your Business Objects can introspect the object itself to see if there is data inside of the DataObject it requires to do a task, do work on your behalf, update the DataObject’s DataGraph with new information (or replace existing information), and pass that SDO right back. The calling object can ask for a ChangeSummary() to see what occured to the data in that object, act on that data, and send it right along again to another object. When you are finished passing it around, it can immediately turn back in to the XML or database row it was loaded from.

Your Business Objects can change, the data inside your DataObject can change, or be a totally different DataObject all together. As long as the Business Object can query to see if the data it cares about is in there nothing about the contracts between your objects need change. Further, because of the API that comes with the SDO spec, your object doesn’t really need to know anything about the DataObject it’s feeding data into or getting data from. It can work with the part of the DataGraph it needs in a hands-off way and be done with it.

Think of it from a automobile insurance Use Case:

SDO Walkthrough Diagram Thumbnail

Say that you had a document based webservice that took an automotive insurance policy. This automotive policy would have an XSD that ensured it contained the data necessary for a complete policy. That XML document, once received, would be loaded in to an SDO with the XSD which would “strongly type” the data inside the DataObject that was generated.

Now, imagine this policy was sent to the webservice for a claim to be issued. You could pass this new SDO in to a “claims” system to be processed. The contract for the Claims system is just an SDO (type DataObject). The Claims system would do a simple XPath query, or do a get() on some data that it expected any DataObject that would come in to the system to have (remember, the data inside does have querable types because it was loaded with an XSD). Technically, it could even introspect the SDO DataGraph to find the data it needed. Whatever.

It would probably want to get Customer Information, Units Insured, ect. It would then do it’s own thing looking up the coverages, the limits and going about process to start a claim.

After processing, it could then append it’s update to the SDO by inserting data back in to the DataObject. The DataObject would then be sent back to the webservice, which would change the SDO back in to the XML document (again, the XSD ensures compliance) and sent back to the ESB or some other thing, but now with information on the claim that was created for it.

Now, remember that the Claims system doesn’t really know or care that the SDO is a policy. In fact, the SDO really has no type at all. You could have policy information, a recipe for Apple Pie and an iTunes playlist in the SDO. As long as the Claims system can get the right data from the SDO, it’ll work.

You could just as easily send the Claims system an SDO that only has the data necessary for issuing a claim. You could also change the Claim system so that it requires more information from the SDO, or writes additional information to the SDO (say requirements change).

None of this need impact the contract or the SDO that gets sent to the system.

So essentially, this is what an SDOs used with POJOs alone gives us:

  • Objects are loosely coupled
  • Object’s data can be discovered at runtime through query
  • Object’s contracts are data based, no types or tightly binding interfaces
  • Object is protected from contract and interface changes in system
  • Objects can be placed anywhere - maximum reuse
  • Object can modify it’s types and data structure during runtime
  • Object has no restriction on the data it can share
  • Object is decoupled from the format and source the data came from
  • Object automatically can roll back to the previous data state or act on changes

This is just the impact SDOs have on sharing data between objects. SDOs have much more up their sleeves, including transformations and mapping.

It’s easy to see how this would fit in nicely to the theory of an SOA architecture. If you could just push around self-contained datagraphs of dataobjects, disconnected for their source, and transform them, map them together or change them in to other types of data, a lot of the complexity of dealing with data in an SOA would be solved… or at least abstracted.

SDOs and Websphere Process Server

It’s helpful to keep in mind that all workflow engines (WPS, JBoss jBPM, BlueSpring’s BPM) is a movement of the business logic and process flow from inside the code to outside, and that can only mean better flexibility for organizations going forward, even if current architecture design patterns regarding Business Objects have to change and give up their power or expose it more freely for orchestration (Fowler be damned).

The most intriguing thing about WPS, or any modern workflow engine, is the idea of visually managing the flow and business logic of services in an SOA. WPS promises to separate this orchestration from any code (save the code that it generates) and mix in human tasks to boot.

Workflow tools can be great when you have an Enterpise Service Bus or other solution where you have data flows coming at you from different technologies and different protocols that you want to feed in to a workflow. When you pull in a webservice or other Type that WPS supports in to it’s grahical tool, Websphere Integration Developer, it essentially reverses these data streams or objects in to strong types that are represented visually so that you have common base on which you can orchestrate, transform, and map the data between the various other imported flows. It’s easy to see how it leverages SDOs to do this, as it’s essentially what SDOs are best at. Mapping between these things and orchestrating them are easier if you have an abstract representation of the service or object.

The idea of building an entire enterprise from the ground up with this product is intriguing. However, I doubt that any project already in motion could manage to migrate from business logic in code to business logic in a product without significant rework, and the consultants specializing in WPS (and they are starting to emerge) seem to only have experience starting from scratch and in projects with human interaction. Regardless, this orchestration is most likely where WPS has the most benefit. It’s idea of transformation and consuming of the documents in the mediation itself is where things break down.

In Websphere Process Server, IBM essentially decided to use SDOs to parse and transform any data that comes in to their system. They coupled this with their previous Websphere Business Integrator product (WBI) for business orchestration of the processes that use these SDOs, stapled on an ESB implentation, stuck a feather in it’s cap and called it Maccaroni.

A product being a webservice de-seriailizer, data mapper and transformer, workflow tool and process manager is what ultimately makes Websphere Process Server the strage beast it is, and is also a major source of it’s problems.

In the work I’ve have done with WPS from a purely transformation and webservice flow standpoint, I’ve have found it to be slow but it works. There has been significant problems with the limited amount of flexibility WPS has in processing XML, managing messages and executing processes. IBM pitches that with WPS/WID business analyst or architects would be the ones that would do the mapping and process orchestration using visual tools. Often, however, we have been forced to fall down past the level of abstraction that WPS provides down to the Java code itself, which that is a sign of, if not failure, then immaturity.

Also, Websphere Process Server’s IDE where this mapping takes place visually, Websphere Integration Developer, is (since it is built on top of the Rational/Eclipse platform) hard for non-programmers to grasp; the idea of having to switch perspectives in Eclipse is foreign and unhelpful.

Ye Ol’ Impedance Mismatch

Unfortunately, even when you do go down to the Java level, things aren’t easy.

There are many times that what a WSDL specifies (compliant XML mind you) as a certain “type” and what “type” WPS renders for it are totally incompatible. For XSD:ANYTYPE, it dies flat out. However, so does any WSDL2Java tool. To combat this, code that was using the Collections framework has to be migrated to fixed length arrays so that WPS can interpret the output, as it sees any List as “AnyType”

Certainly, this type mis-match is experienced whenever one jumps from objects to documents in representing data (JSON anyone?), but that just points to the enormity of the problem WPS is trying to solve, and how trying to reverse XSDs without developer intervention is painful and filled with problems.

My Update

So, my update is that it still is not delivering on it’s promises, yet. However, I have heard from IBM that version 7, based off the Rational 7 (Eclipse 3.2) platform should be out in August / September.

I hope the reply was worth the wait.

How To Install Haskell Haddock on Mac OS X

Sunday, June 10th, 2007

If you’re scratching around trying to find out how to compile Haddock after getting the source, especially if you are using the awesome EclipseFP plug-in for Eclipse which has it’s use integrated inside the IDE, here are the steps:

First, ensure you have Glasgow Haskell Compiler installed, obviously. It has a OS X binary port and I like it over HUGS.

Second, you should really have Happy and Alex installed as well, as the build will look for it, and they are good to have, but will work fine without it.

from your unzipped directory:

$ ghc -o Setup Setup.lhs -package Cabal
$ ./Setup configure --ghc --prefix=${prefix} --enable-library-profiling
$ ./Setup build -v
$ ./Setup copy --copy-prefix=/usr/local/

Of course, /usr/local could be wherever you want the executable and libs to be installed. It has a /bin and /share.

After this step, you should be able to run Haddock documentation extraction from the Eclipse IDE. If you’re curious how Haskell looks in Eclipse with the EclipseFP plug-in, here it is:

Haskell Eclipse Thumbnail

With module support and outline expansion, it still takes heavily from the OO world (it doesn’t keep this updated if playing with the shell, for instance) it’s still somewhat nice to keep things organized. Of course, the IDE paradigm changes substantially when you are talking about lazy dynamically typed call-by-value languages. Still, for a round peg trying to fit in a square hole, it’s ok.

I still perfer the Haskell editor in the incredible TextMate, however.