Brandon Werner

Archive for the ‘Enterprise Software’ Category

Google Scalability Conference: Haskell with DHT for Wikipedia / GIGA+ Filesystem

Friday, June 20th, 2008

Google just published some of the slides of the Google Scalability conference online that I attended last weekend and wrote a commentary about earlier this week. The two I’d like to call out are the GIGA+ file system (for storage geeks) and the Software Transactional Memory slides (for software geeks). Also, the ideas presented in the Wikipedia for Haskell / DHT I found really interesting as well.

Just consider it some light reading for your geek weekend.

ACM Article: Restful web services vs. big web services: making the right architectural decision

Tuesday, June 3rd, 2008

Great article on ACM regarding when to use REST vs. WS-* standards that are in wide use in SOA architectures today. Very interesting reading for those who may want to take the light-weight approach vs. using the webservice composition and discovery tools that enterprises may find in the TIBCO and IBM SOA stack.

ABSTRACT

Recent technology trends in the Web Services (WS) domain indicate that a solution eliminating the presumed complexity of the WS-* standards may be in sight: advocates of REpresentational State Transfer (REST) have come to believe that their ideas explaining why the World Wide Web works are just as applicable to solve enterprise application integration problems and to simplify the plumbing required to build service-oriented architectures. In this paper we objectify the WS-* vs. REST debate by giving a quantitative technical comparison based on architectural principles and decisions. We show that the two approaches differ in the number of architectural decisions that must be made and in the number of available alternatives. This discrepancy between freedom-from-choice and freedom-of-choice explains the complexity difference perceived. However, we also show that there are significant differences in the consequences of certain decisions in terms of resulting development and maintenance costs. Our comparison helps technical decision makers to assess the two integration styles and technologies more objectively and select the one that best fits their needs: REST is well suited for basic, ad hoc integration scenarios, WS-* is more flexible and addresses advanced quality of service requirements commonly occurring in enterprise computing.

Fun Lunch Time Distraction: Calculate your googleshare influence

Wednesday, April 30th, 2008

Slurping on some delicious vegetarian chili from the lobby of Safeco Plaza in Seattle, I was just introduced to Googleshare from Atomiq by The Google Operating System post on The Informational Distance Between Cities.

Essentially, it is the (number of search results that contain your name inside search result x / number of total results in search result x = your googleshare (e.g. mindshare))

So, for something you might be known for, like “Java”.. here are the results when ran against my name:

“Brandon Werner” inside “Java” search results = 1,350 (link)
(divided by)
“Java” search results total = 416,000,000 (link)

or

1350 / 416000000 = 3.25.

So my mindshare of the Java market is 3.25%. Not bad at all.

Now, some more mindshare results for you:

Brandon Werner/IBM = 4.17%
Brandon Werner/Semantic Web = 3.85%
Brandon Werner/Insurance = 9.20%

So, I guess I’m kind of hitting my mindshare targets! So what about you? Why not figure out your googleshare and Disqus below?

Service Data Objects Architecture: Business Objects with Smarts Presentation

Thursday, April 17th, 2008

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

You can download the slideshow here.

The Rise Of Functional Programming: F#/Scala/Haskell and the failing of Lisp

Sunday, January 13th, 2008

Over at Lambda The Ultimate, the best academic programming blog on earth, there is a large debate going on regarding what the future of languages will be for 2008. The most important thing to emerge from the discussion is the larger role functional programming will play. It seems like a safe bet. This year has seen the explosion of interest and creation of functional languages such as Apple OS X’s Nu, Java’s JVM using Scala and Microsoft Research’s .Net language F#.

I am ecstatic at this change.

The Failure Of Lisp

It’s hard to understand where it came from. Certainly one can argue the broader academic community had nothing to do with it, the old guard Common Lisp hackers are still as fickle and as judgmental to new comers as ever. Also, the old standards in Lisp languages, Franz and LispWorks have not lowered their prices to anything approachable to the casual developer. There are open source ANSI Lisp implementations without all the supporting engines and functionality, such as SBCL. In fact, my most linked thing I’ve ever written in my career is the installation walk-through I did for installing SBCL and Allegro which includes adding your repository and packages for CLOS and automatically compiling the FASL files, especially dealing with the asdf differences between the implementations. The complexity of this in itself points to problems with portability and configuration in Lisp. However, even that project that targeted Lisp’s Bread and Butter, the parsing of semantic ontologies for the Semantic Web, was met in the message boards with worries on if there would be enough developer participation using such an odd language, and recommendations on moving it to Java.

In reality, Common Lisp showed its failure as a community by sitting out this enthusiasm that has been generated around functional programming languages. It didn’t have to be that way. I recall my first awarenesses of functional programming’s growth was the awesome work of Lemonodor’s blog and Sriram Krishnan posting “Lisp Is Sin“. I was happy at the time that Lisp was getting such attention, as well as functional language architectures in general. I imagined that as OO languages had grown so verbose and feature dense that even the IDEs to develop your applications run in to the tens of gigabytes, a new evolution “Back To The Future” was inevitable. Even more, I believe long suffering Lisp deserves to be back in favor again, it’s certainly spent its time in purgetory. Yet, it didn’t happen. You can blame the old 50 year old men sitting on IRC channels for that. It was the most thorny and un-inspiring community I’ve ever participated in, despite my extreme interest in the language. It’s jaw dropping that a language with such promise has sat out the resurgence, and speaks to what an un-friendly and un-inviting community can do a technology platform. I would be the first to march it off to the grave.

The Rise Of Functional Languages

The interest in functional programming actually grew up around more academic but pure languages like Scheme and Haskell. Although these languages sit within their own island and lack many of the “dirty” aspects of Lisp’s CLOS environment that make it easy to access OS and hardware resources, they are still strikingly useful in learning things that are the staple of functional languages, such as Closures and Lambdas. Indeed, one could argue that the movement to move Closures in to OO languages (first C#, now Java) was in part due to the rise of awareness of functional languages.

Further, it seems to me that functional programming languages answered two prayers of those more ambitious engineers who don’t seem to want to stick with the script and Java worlds they were taught in college. Those two large wins, far more important than the semantic features of functional languages that have gotten all the attention, are architecture foundations of functional languages:

  • Referential Transparency / Side Effects
  • Concurrency

Referential Transparency

To those coming from a pure OO world, Referential Transparency and the restriction of side-effects can be something hard to get their heads around. The best way I describe this concept is by hitting at the root of their assumptions: Everything they deal with are dead. The objects are dead, the variables are dead, the entire atmosphere is dead, as if something had come along and killed everything in your stack and you have to assemble your program by only what’s been given to you, nothing more. There are no instances, objects do not “come alive” and have state; a state that you have to poke in to and a state that can change at any time. A function will always do what you expect, and nothing can come along and change that behavior.

One of the things that seems to appeal to developers most about the promise of SOA architectures happening in enterprise environments, if you’re smart enough to pry it out of them, is that they get the same referential transparency in services. No one can override a service (besides versioning, which is explicit to the developer) and a service will only return what it did earlier in your code and earlier in the year. This forces developers to design services that have the same relationship to the world as functional programmers write their functions for. This is perhaps the trickiest part of migrating enterprise teams to a services based model, their expectations of the mutableness of the services they are accessing and their inability to anticipate what working in that world will be like. Especially for those who use tools or libraries to convert service interaction in to an object, the interaction can be jarring.

However, the soon find the predictability and the safety of such an environment liberating. In much the same way OO programmers were use to making their objects or variables immutable to maintain their contracts and relationships with other objects, often sacrificing many of the benefits that OO programming promised their stack, now they have immutability and transparency in an environment where functional paradigms are key, they do not expect to be able to “embrace and extend” services. They are what they are. This tends to cascade out to the living instantiated code a developer writes as well, as there is no point in entering the world of the living if what you have to return to is a dead function.

This was hinted at in an article in the ACM Queue magazine by Terry Coatta, entitled “From Here to There, The SOA Way“. He states,

Objects are still a very good way to model systems and they function reasonably efficiently in the local context. But they don’t distribute well, particularly if one tries to use them in a naive way. A service-oriented architecture solves this problem by dealing with the latency issues up front. It does this by looking at the patterns of data access in a system and designing the service-layer interfaces to aggregate data in such a way as to optimize bandwidth, usage, and latency.

Not that SOA limitations are the only thing that is affecting the consciousness of a software engineer, the other issue is the large rise in the complexity of managing a large enterprise library written in an OO language. One of the largest pain points of any application of large size is the management of graphs and graphs of live objects and the living data within them. When software engineers experience the lack of side-effects in functional languages, it’s a breath of fresh air.

Concurrency

A funny thing happened on the way to those multi-core processors. People loaded their applications on them and noticed nothing got much faster, particularly when it came to transaction intensive tasks. Turns out Intel and AMD left out an important fact about their Moore’s Law cheating multi-core environment: you can’t ring as much performance out of it without changing the way you manage concurrency and threads. Sequential programming could always rely on going faster as the single processor speed got faster, but as multicores come in to play that isn’t always the case. You want to farm off transactions to occur on separate processors, and in the living world of mutable objects and variables, breaking out two transactions to work concurrently that operate on the same living data is a bad idea. Add structural programming’s solution to this problem, optimistic and pessimistic locking, and you have dead-locks in short order.

Functional programming has been a natural place to explore parallel processing and new ways of doing atomic transactions because of the reasons above. More important, these atomic structures can be composable which is lost when doing locks in structural programming. A lot of the buzz has been generated around the idea of software transactional memory, where execution blocks can be flagged and managed and built upon. The best introduction to this topic is the paper by Tim Harris entitled Concurrent Programming Without Locks. Although this use to be expressed only in the confines of Concurrent Haskell, others have shown how the same techniques can be used in other functional languages, such as F# using nothing more than PowerList.

This experimentation is one of the large reasons why functional languages have become more important as software engineers wrestle with the problems and promise of multi-core processors in transaction processing. Although not every engineer will be interested in the deeper details of STM or other strategies in concurrent programming, the fact that these libraries will emerge and only be available in the functional realm will force software engineers to learn the core concepts and bring even more visibility to the functional programming space.

Functional Hybrids: Functional Programming Is Now Approachable

The other driver for adoption of functional programming languages, besides the architectural benefits it has to solve current problems, is the fact that languages such as F# and Scala have adopted a more hybrid model in their language design, where a developer isn’t forced completely outside her comfort zone. Scala is a combination of functional and deeper OO methodologies (as in SmallTalk) and has access to the entire Java library, significantly reducing the learning curve. The same can be said for F# and .Net and Nu and Objective-C. This does have draw-backs however, as both F# and Scala have not been able to use more of the STM strategies that Concurrent Haskell allows because the underlying thread architecture of the VMs they run against are built for structural programming languages. It is easy to see how this can be fixed, however, and allow those using hybrid functional languages the same power as those who express their ideas in Haskell or even Lisp.

As I said, I am excited about this new resurgence in functional programming languages, and I am enthusiastic 2008 will have even more to offer those who are just getting their toes wet. I personally know some college freshman who started out using Nu as their first language, and are already contributing to the community. The future of software engineering is bright.

Call To Arms: Our Elders In Computer Science Are Leaving Us

Tuesday, January 8th, 2008

I seem to be visiting a lot of my prior entries from 2005, but a while back I wrote about an experience I had meeting an old IBM programmer at the local Catholic store. She was female, which seemed like a trail-blazing thing to have been working in the heavily male dominated technology industry in the 1950s. Her story, and her eagerness to share with me, led me to write my experiences with gurus in my life, entitled In The Presence of A Guru (On Catholicism, VAX/VMS and Geek Culture). I still feel profoundly stupid for not having spent more time with her, as it was obvious she wanted to share her story. I will regret it for the rest of my life, not just because of what the exchange could have done for my understanding, but more refusing my obligation, the obligation we all have, to listen to and carry on the mythology and history of our elders. I am afraid that mythology may be vanishing, and none of us are listening.

I thought about this again when I stumbled upon a new blog from Dan Weinred as linked from Lemonodor. His career in Computer Science and his contributions certainly qualifies him as a Guru.

His essays so far are a treasure trove of information and computer science archeology. One of my favorites is called The Technology and Business of ObjectStore, where he recounts taking his ObjectStore technology to Microsoft and Bill Gates as well as Steve Jobs. The reaction of both icons demonstrates humorously their approach even now:

When the new Windows technology (which was OS/2 at the time; IBM and Microsoft were still working together on it) came out, it was crucial for us that it be able to support memory mapping. Dave Stryker and Tom Atwood, flew out to meet with Bill Gates in September of 1989. Dave Stryker recalls: “We originally had a 45-minute appointment, but Gates extended the meeting to a couple of hours, and called in Dave Cutler [the architect of OS/2]. At Tom’s urging, we told Gates and Cutler everything they wanted to know about ObjectStore. Gates was complimentary of the Object Design approach, but said, in a nice enough way, that if the Microsoft Empire ever needed such a thing, they would build it themselves. Still, Gates told Cutler to make sure that the OS/2 equivalent to mmap was powerful enough to run ObjectStore, and there were some changes made to make it so.” Later, this OS/2 technology turned into Windows NT. Dave Moon adds that it turned to have a bug: it doesn’t free up disk space when it ought to. For some reason Microsoft hasn’t fixed this, even after many years. We found a way around it.)

Speaking of industry luminaries, we also met with Steve Jobs when he was at NeXT, and Jobs made a big announcement praising our technology, which resulted in a nice press release. There was some discussion that NeXT might buy Object Design, but that never went anywhere.

This type of history is fascinating and important. There have been efforts to capture this knowledge, such as Fokelore.org and ACM’s excellent ACM History Committee trying to archive CS knowledge, but their efforts so far is approaching it the wrong way. It’s getting the stories, while these people are still alive and can tell them, that is important. Much of it will not be academic or well checked, but that is missing the point. It’s the mythology as well as the history of these pioneers in the 1930s - 1970s that we should be after. We’ve already lost Jeff Raskin, Adam Osborne and others.

If we do not do this, their legacy and our history as a profession, one that has changed the world as much as any other in human history, will be lost forever.

Apache Is SOA Ready

Monday, September 10th, 2007

You have to be impressed with Apache’s strategy and execution. Although many enterprise companies have stumbled or confused customers with their numerous SOA offerings in the marketplace, Apache has, behind the scenes, been dutifully executing on the core standards (WS-BPEL 2.0, SDO specification 2.1, SCA specification 1.0) that make up a SOA and plugging them neatly in to an overall stack. All of this has been accomplished while also removing any hype or marketing and focusing squarely on the technology and it’s usefulness. It’s hard to imagine SOA would gain any great credibility beyond vendor brochures if it wasn’t for Apache’s volunteers showing that.. yes.. there is real technology underneath the rhetoric.

This stack can be viewed in Apache Tuscany, which brings the SDO and SCA specification to us in vendor neutral Java or C++ and Apache ODE (Orchestration Director Engine) which uses WS-BPEL to execute business processes in an SOA similar to IBM’s Websphere Process Server, which I’ve written about here and here. I’ve done work with the Tuscany, including implementing it and managing it’s deployment. Aside from some problems with Websphere (Tuscany uses the EMF libraries from IBM’s Eclipse and Websphere products, so some library version clashes occur in deployment) I can say that it works as advertised and can handle a decent load.

If you couple this with Eclipse’s community contribution on the visual side, with a graphical BPEL editor and the SOA Tools Project, and you have a stack that rivals millions of dollars in IBM license fees for Websphere Process Server and Websphere Integration Developer IDE. Of course, all of these tools don’t come with support and are sure to need some hacking. Regardless, the future is bright for any size company wanting to leverage the best enterprise technology in an open and free development environment.

PHP On Rails?

Friday, August 24th, 2007

One of the new technologies I’ve been playing around with has been Project Zero from IBM. Before everyone gets too excited this software is not free; at least not in any way normally understood by the development community. Although the source code is available, IBM takes some time to explain what it’s use of the term “Community Driven Development” means, but it boils down to this: “this is something we’re going to charge for and we’d just like you to help out for free, please.”

From the About Project Zero page:

We are still building commercial software here, as the licensing makes clear, but we are doing it in a more transparent fashion. This transparency provides a way for you to influence the project much earlier in its lifecycle. It also serves a role in our notions of radical simplicity. Every discussion, every technology decision, the full history of this technology will be accessible, searchable, preserved on this site. That means that finding answers to your questions will never be more than a search away. Development means that this community is about the technology and how it is developed and evolves. This is not a product community. It is not the place for the finished item, but rather the lab where it will grow.

You could be angry with that, and it certainly leaves a bad taste in my mouth. IBM has had a lot of success moving technologies to the open source community, and they have received nothing but benefit as a result. IBM would not have the mindshare of enterprise developers it does today if it had not allowed Eclipse to grow openly and organically.

It has also had success including “torjan horse” frameworks in to Eclipse such as UML2 and EMF, which only see their real fruition in their pricey enterprise products for Rational, Websphere and to a lesser extent Lotus. If you want to see the true power and future of Eclipse as an IDE, you should look at what IBM is doing with their paid products. However, IBM has always been good at folding yesterday’s Rational release back in to Eclipse. For instance, Rational Application Developer 6’s amazing WSDL and Webservice editor tools has found it’s way in to Eclipse Europa as a direct copy, and you can find some XML editor features moving over as well. This is a win/win scenario that promises to keep both developers and IBM happy for a long time to come.

Yet it seems here that IBM has tried to redefine “community driven development” as giving software to developers to create buzz and get free development work from excited community members, and then shove them off like a birthing husk and sell it to large consumers for more than your yearly salary. If I found myself in a community like this, I’d move.

If you choose not to move any farther in to IBM’s work here, it would be understandable, but you’d be missing out on some incredible work. Why not just take a peak?

Project Zero: PHP with REST/AJAX

Project Zero is essentially a Groovy/PHP/Restlet/AJAX framework that tries the “batteries included” approach that Ruby on Rails has with it’s application framework. To be sure, there is a lot that a Ruby or SpringMVC developer will be familiar with when playing with Project Zero, especially it’s Restlet/MVC patterns and it’s scaffolding separated in to an /app, /config and /public directories. You can use Project Zero from the command line, but the real productivity gains are found in using the Eclipse plug-ins. You can program in either Java or PHP as nicely illustrated in the downloads page.

Comparison Between Ruby and Project Zero File Layouts

There are a lot of Restlet/MVC frameworks for Java, which is why I choose to focus on the inclusion of PHP. Simply put, it’s the fastest way for a PHP developer who feels the AJAX/Restlet/Ruby world has pulled away from them to get back in the game building quick and powerful AJAX/Restlet applications. This is all accomplished by the use of the Eclipse PDT project, currently in incubation, which allows for PHP development inside the Eclipse IDE. Project Zero uses this framework to give users right-mouse click control over their entire application.

It also uses Dojo as the JavaScript framework for AJAX. Why this was chosen over others I have no idea other than it’s robust support of file IO and other nice things DHTML doesn’t give you. Dojo has a pretty good following and shouldn’t cause any restrictions on what you can do within your application, although the version included with Project Zero is 0.4.3, which means you won’t get any of the goodness Dojo 0.9 gives you as demonstrated by the cool Mac OSX inspired Fish Eye demo.

Dependency Selection in Project Zero

There are many other libraries that you can use with Project Zero to build a robust application, assuming you’re using Apache Derby, that is. Sadly, although the software libraries in the repository are still in development, there isn’t a lot of juice in the batteries. Further, you can only deploy your PHP / AJAX applications if you have Project Zero installed on your remote environment and running. This means that those host providers who provide simple PHP and php.ini support will not run Project Zero applications you develop.

PHP being RESTful easily

Just like with Ruby on Rails, Project Zero maps RESTful resources to controllers based on names to speed development and reduce resource mapping. Therefore, the RESTful URL: /resources/people calls a people.php file in /app/resources/ that contain the following methods:


People:onList()
People::onCreate()
People::onPutCollection()
People::onDeleteCollection()

and individual resources can be specified as /resources/people/001 (e.g. id= 001 of a person in a people database) which have these methods in the same people.php file:


People::onRetrieve()
People::onUpdate()
People::onPostMember()
People::onDelete()

This makes programming in a RESTful way in PHP as easy as it is in Ruby, but without the language barrier.

JSON in PHP

Much in the same way that the SpringMVC framework uses HttpServlet to pass information back and forth to the calling application, Project Zero prefers to use JSON. This approach is nice since JSON has got a lot of traction lately from AJAX users who don’t want to wait for large XML documents to be sent across the wire. Project Zero exposes an API for PHP that maps roughly to the same HTTPServletRequest/HTTPServletResponse that SpringMVC and Struts uses, called json_encode and json_decode.

Getting data from the POST through JSON is easy enough:

$employee = json_decode($HTTP_RAW_POST_DATA);
$sql = "SELECT * FROM employees WHERE employeeid = ".$employee['id'];

and writing data in JSON is easy as well:

indexedArray = array("a", "b", "c");
$result = json_encode($indexedArray);
echo "Indexed array = " . $result . "
“;

Yet, this method of writing data leaves a lot to be desired. It would be nice if we could forward JSON data and then redirect to a RESTful view, the same way many other frameworks (Ruby on Rails, SpringMVC) does. It turns out you can do this using what Project Zero calls Response Rendering, which is a pattern of APIs to do just that.

In PHP this would be:

$customer = array('name' => 'John Smith');
put('/request/view', 'JSON');
put('/request/json/output', $customer);
render_view();

Wrap-Up

When you think about it, it’s pretty amazing that PHP has been given this much functionality from a product out of IBM’s research. The question on my mind is if PHP even deserves it. Although many websites leverage PHP on the front end, it has always had a mixed history when it comes to separating presentation from the code itself, and is usually re-factored away as an application grows. Further, with Sun and others focusing so heavily on migrating Ruby to the JVM, it makes you wonder why IBM would choose to pick PHP as another language to do web 2.0 in. Remember, IBM made it clear from the above talk about “community development” that they plan on making this a commercial product, so it seems they have settled on Groovy and PHP to be their torch carriers with Project Zero. For PHP developers, they have been given a very powerful toolset. Shame they can’t use it until IBM decides to sell it to them.

My advice? Although large applications such as Wordpress and Facebook use it, for new development PHP’s time has passed. You are better off moving to a Ruby or Java web framework.

My IBM Rational 2007 Presentation on Websphere Process Server

Thursday, July 12th, 2007

Below is an interactive slideshow of my talk given at the IBM Rational conference in 2007 entitled “From Legacy to Service-Oriented Architecture: The Strategic Importance of Services in the Insurance Industry”.

Synopsis: The insurance industry is one of the most complicated industries to manage from an IT perspective. Its complex and highly regulated business rules, as well as its early adoption of mainframes in the 80s and 90s, has led to a significant hurdle in moving existing infrastructures to a Service-Oriented Architecture (SOA). This session shows how the use of IBM Rational tools and IBM(R) Websphere(R) Process Server can free the industry from the complexities of implementing state specific compliance and business workflows through modeling and mediation flows.

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

Click the image to view this presentation interactively.

You may get the full version in Quicktime Movie format or Microsoft Powerpoint format.

IBM, SDOs, WPS and SOA Hell

Sunday, June 10th, 2007

I get a lot of emails about my critique of IBM’s Websphere Process Server, mostly along the lines of this email:

I’ve sent this email to you a couple of months back but didn’t get any reply. just thought would try my luck again…

we’ve made some progress with WPS and identified what areas are less risky so we can use at least some of the product features. we’re now building a prototype that uses mostly human tasks and some processes wrapped around them and we expose this all via webservices. there’s some pain around versioning, BPEDB queries, security, etc but at least we’re making a progress and early indicators seem to be on a positive side. we have had a couple of IBMers on site for a week and they helped us somewhat to rejig the design and focus on the features that seem to work and would add most value

still would be extremely interested to find out if you have done more with the product since you blogged about it

Well, obviously for competitive and disclosure reasons I can’t spill the beans on where my current employer is, what they are doing or why they are doing it. With those hand-cuffs in full public display, I will effort a more general entry about the state of affairs of this odd stack IBM has built.

In order to understand IBM’s approach to Websphere Process Server, you have to understand SCA/SDO, the unholy combination that seems to be driving the SOA mythology as of late.

About SDOs

There are currently three different implementations of the SDO spec, including Apache’s Tuscany (the one I’m cheering for), IBM’s from their EMF project for Eclipse and the interesting EclipseLink from Oracle and Interface21 (when those two get in a room together you should listen). EclipseLink is more of a consequence of combining SDOs and JPA, however.

I’m extremely excited about SDOs in general, to the point that sometimes I sit on my deck and dream about them. I even write presentations about their architectural implications. SDOs are awesome things.

They are, in essence, disconnected DataGraphs of DataObjects from some source. This source could be relational databases, entity EJB components, JCA, XML pages, Web services, or a combination of them all.

They can be acted upon and changed, updated and deleted, transformed, even serialized and sent to other objects and then connected back to the original data source with a ChangeSummary() to boot. For a more detailed understanding, check out IBM’s Introduction To Service Data Objects.

This is what they do in a nutshell:

1) You have data in any form from anything: Databases, XML, text, anything (or a combination of them).
2) You feed that data in to a Service Data Object.
3) Things can get pretty loose in there, so you can impose an XSD to define the types or just create the model on your own as you feed data in.
4) You disconnect the SDO from it’s source.

SIDEBAR: Last year I wrote a design pattern for SDOs in Eclipse’s UML2 framework that you can import in to Rational Software Architect.

Now you can do anything with this thing. Change the data inside, add more things to it, remove things from it, serialize it and send it across the wire, query the data inside of it, introspect it, combine it with JPA and tell it to save itself away from it’s source, anything.

Imagine the implications.

Design Strategy: Using SDOs with POJOs

No longer do objects have to have a strong contract, or any contract at all besides taking a type of DataObject. Your Business Objects can introspect the object itself to see if there is data inside of the DataObject it requires to do a task, do work on your behalf, update the DataObject’s DataGraph with new information (or replace existing information), and pass that SDO right back. The calling object can ask for a ChangeSummary() to see what occured to the data in that object, act on that data, and send it right along again to another object. When you are finished passing it around, it can immediately turn back in to the XML or database row it was loaded from.

Your Business Objects can change, the data inside your DataObject can change, or be a totally different DataObject all together. As long as the Business Object can query to see if the data it cares about is in there nothing about the contracts between your objects need change. Further, because of the API that comes with the SDO spec, your object doesn’t really need to know anything about the DataObject it’s feeding data into or getting data from. It can work with the part of the DataGraph it needs in a hands-off way and be done with it.

Think of it from a automobile insurance Use Case:

SDO Walkthrough Diagram Thumbnail

Say that you had a document based webservice that took an automotive insurance policy. This automotive policy would have an XSD that ensured it contained the data necessary for a complete policy. That XML document, once received, would be loaded in to an SDO with the XSD which would “strongly type” the data inside the DataObject that was generated.

Now, imagine this policy was sent to the webservice for a claim to be issued. You could pass this new SDO in to a “claims” system to be processed. The contract for the Claims system is just an SDO (type DataObject). The Claims system would do a simple XPath query, or do a get() on some data that it expected any DataObject that would come in to the system to have (remember, the data inside does have querable types because it was loaded with an XSD). Technically, it could even introspect the SDO DataGraph to find the data it needed. Whatever.

It would probably want to get Customer Information, Units Insured, ect. It would then do it’s own thing looking up the coverages, the limits and going about process to start a claim.

After processing, it could then append it’s update to the SDO by inserting data back in to the DataObject. The DataObject would then be sent back to the webservice, which would change the SDO back in to the XML document (again, the XSD ensures compliance) and sent back to the ESB or some other thing, but now with information on the claim that was created for it.

Now, remember that the Claims system doesn’t really know or care that the SDO is a policy. In fact, the SDO really has no type at all. You could have policy information, a recipe for Apple Pie and an iTunes playlist in the SDO. As long as the Claims system can get the right data from the SDO, it’ll work.

You could just as easily send the Claims system an SDO that only has the data necessary for issuing a claim. You could also change the Claim system so that it requires more information from the SDO, or writes additional information to the SDO (say requirements change).

None of this need impact the contract or the SDO that gets sent to the system.

So essentially, this is what an SDOs used with POJOs alone gives us:

  • Objects are loosely coupled
  • Object’s data can be discovered at runtime through query
  • Object’s contracts are data based, no types or tightly binding interfaces
  • Object is protected from contract and interface changes in system
  • Objects can be placed anywhere - maximum reuse
  • Object can modify it’s types and data structure during runtime
  • Object has no restriction on the data it can share
  • Object is decoupled from the format and source the data came from
  • Object automatically can roll back to the previous data state or act on changes

This is just the impact SDOs have on sharing data between objects. SDOs have much more up their sleeves, including transformations and mapping.

It’s easy to see how this would fit in nicely to the theory of an SOA architecture. If you could just push around self-contained datagraphs of dataobjects, disconnected for their source, and transform them, map them together or change them in to other types of data, a lot of the complexity of dealing with data in an SOA would be solved… or at least abstracted.

SDOs and Websphere Process Server

It’s helpful to keep in mind that all workflow engines (WPS, JBoss jBPM, BlueSpring’s BPM) is a movement of the business logic and process flow from inside the code to outside, and that can only mean better flexibility for organizations going forward, even if current architecture design patterns regarding Business Objects have to change and give up their power or expose it more freely for orchestration (Fowler be damned).

The most intriguing thing about WPS, or any modern workflow engine, is the idea of visually managing the flow and business logic of services in an SOA. WPS promises to separate this orchestration from any code (save the code that it generates) and mix in human tasks to boot.

Workflow tools can be great when you have an Enterpise Service Bus or other solution where you have data flows coming at you from different technologies and different protocols that you want to feed in to a workflow. When you pull in a webservice or other Type that WPS supports in to it’s grahical tool, Websphere Integration Developer, it essentially reverses these data streams or objects in to strong types that are represented visually so that you have common base on which you can orchestrate, transform, and map the data between the various other imported flows. It’s easy to see how it leverages SDOs to do this, as it’s essentially what SDOs are best at. Mapping between these things and orchestrating them are easier if you have an abstract representation of the service or object.

The idea of building an entire enterprise from the ground up with this product is intriguing. However, I doubt that any project already in motion could manage to migrate from business logic in code to business logic in a product without significant rework, and the consultants specializing in WPS (and they are starting to emerge) seem to only have experience starting from scratch and in projects with human interaction. Regardless, this orchestration is most likely where WPS has the most benefit. It’s idea of transformation and consuming of the documents in the mediation itself is where things break down.

In Websphere Process Server, IBM essentially decided to use SDOs to parse and transform any data that comes in to their system. They coupled this with their previous Websphere Business Integrator product (WBI) for business orchestration of the processes that use these SDOs, stapled on an ESB implentation, stuck a feather in it’s cap and called it Maccaroni.

A product being a webservice de-seriailizer, data mapper and transformer, workflow tool and process manager is what ultimately makes Websphere Process Server the strage beast it is, and is also a major source of it’s problems.

In the work I’ve have done with WPS from a purely transformation and webservice flow standpoint, I’ve have found it to be slow but it works. There has been significant problems with the limited amount of flexibility WPS has in processing XML, managing messages and executing processes. IBM pitches that with WPS/WID business analyst or architects would be the ones that would do the mapping and process orchestration using visual tools. Often, however, we have been forced to fall down past the level of abstraction that WPS provides down to the Java code itself, which that is a sign of, if not failure, then immaturity.

Also, Websphere Process Server’s IDE where this mapping takes place visually, Websphere Integration Developer, is (since it is built on top of the Rational/Eclipse platform) hard for non-programmers to grasp; the idea of having to switch perspectives in Eclipse is foreign and unhelpful.

Ye Ol’ Impedance Mismatch

Unfortunately, even when you do go down to the Java level, things aren’t easy.

There are many times that what a WSDL specifies (compliant XML mind you) as a certain “type” and what “type” WPS renders for it are totally incompatible. For XSD:ANYTYPE, it dies flat out. However, so does any WSDL2Java tool. To combat this, code that was using the Collections framework has to be migrated to fixed length arrays so that WPS can interpret the output, as it sees any List as “AnyType”

Certainly, this type mis-match is experienced whenever one jumps from objects to documents in representing data (JSON anyone?), but that just points to the enormity of the problem WPS is trying to solve, and how trying to reverse XSDs without developer intervention is painful and filled with problems.

My Update

So, my update is that it still is not delivering on it’s promises, yet. However, I have heard from IBM that version 7, based off the Rational 7 (Eclipse 3.2) platform should be out in August / September.

I hope the reply was worth the wait.