Brandon Werner

Archive for the ‘Service Oriented Architecture’ Category

Typical Architecture Roles in an Enterprise Environment

Monday, June 23rd, 2008

I created the following slide on typical architecture roles and I thought I’d share it.

Typical Architecture Team

Enterprise Architects

Primary role is to manage large scale product and process integrations and determine which products and processes are best suited to deliver on business requirements. They control the large picture of how everything works in an organization and maintain this in a centralized location. They should be experts in software and enterprise design methodologies with experience in how large systems interact and manage data. These architects are essential to competitive and cost-effective decision making and use of technologies.

Water-Cooler Talk: The latest research into The Staged Event-Driven Architecture for Highly Concurrent Server Applications

Integration Architects

This is an emerging role in larger companies that have large and complicated deployments, particularly around Service Oriented Architecture (SOA). They are usually the ones that have the task of managing Business Processes. Put simply, they tie the software platforms the Software Architect designs together on the environments the Enterprise Architects deliver and purchase. Although Enterprise Architects are typically restricted to existing thinking and technology products, it is the combination of Integration and Software Architects that differentiates an organization and provides maximum benefit.

Water-Cooler Talk: How to change the business workflow so that they can be quicker than their competitors. May need to talk to the Software Architects about how the platform can be changed for quicker processing too.

Software Architects

Primary role is to take architectural directions and artifacts and produce and manage a software platform that provides strategic and operational advantage to an organization. They are usually the ones who maintain the core frameworks of an organization and are considered the gurus of whatever technology they design for. They are very important as they tend to add order and discipline to projects and ensure that best practices, appropriate abstraction and code re-use occur. These architects are essential to good outsourcing of software development, especially near-shore and off-shore.

Water-Cooler Talk: The latest research into how Dependency Injection in Java 5 eliminates the need for the Composite Entity pattern in enterprise development.

Gartner Podcast: SOA Lessons Learned From the Trenches

Sunday, June 22nd, 2008

Gartner came out with a good podcast of one of their sessions from the Gartner Enterprise Architecture Summit. It is a very informative panel discussion about “Lessons Learned From the Trenches” with SOA/Enterprise Architecture. Anyone who is an Enterprise Architect or Technology/Business Executive embracing the change and possibilities of SOA in their organization will want to give it a listen. Chances are very good you’ll be vigorously nodding your head and maybe even feeling a little bit better about yourself knowing you are not alone in dealing with these problems.

Much of the conversation is about how to drive SOA adoption (it appears relying on developers to browse a repository is not working), how to measure cost benefits and savings to an organization (hint: measure your service reuse) and how they approach funding services which may not have an exact business advocate and therefore pocketbook to work against.

The last piece comes up all the time in organizations implementing a service oriented architecture. There are many business areas that would benefit from a high level “business service” which would result from the orchestration through externalized business logic (BPM/BPEL) of lower level “technology services”. Yet, if asked who would fund these lower level technology services so that the business services can emerge, the money dries up. Many take on the model of “first to need, first to pay” but that only works when the technology services and service orchestration aren’t that expensive. It’s hard to get project specific business users to fund enterprise wide services for the “greater good”. It’s something that has yet to be solved.

Here is the link.

Thoughts On Google’s Conference on Scalability In Seattle

Monday, June 16th, 2008

If you are looking for a good collection of notes regarding the topics covered at the Seattle Conference on Scalability, you can do no better than what James Hamilton put together. Instead, I’ll write a quick commentary on what I experienced.

Scalability Is Your Problem Too

The goals of the conference are laudable. Scalability is an issue that almost all practitioners of software engineering face, especially as we move towards offering services both inside and outside the enterprise. Many are taken off guard by the sudden issues that confront them after wiring up a large scale services-based environment; especially around distributing load, distributing the data, and writing the data quickly. Sadly, I didn’t see too many people from large companies there - most were software companies like Microsoft, Google, MySpace and Amazon.com. The attendance may be a consequence of the subject matter. This was some intense stuff dealing with MPI at Cray and its hopeful successor, Wikipedia redone with DHT and Erlang, a b-tree vs. Hashmap debate and scalable storage issues when dealing with billions of files. A more fun loving person would have done better going over to Adobe and hanging out at BarCampSeattle, which was going on at the same time.

Despite the intimidating material, there are real architectural and design issues that these discussions present that should be in the mind of anyone dealing with large datacenters that scale globally or even nationally. The approach of GIGA+ file storage, maidsafe’s new computer architecture, and NetWorkSpaces for the R language was uniform: off-loading responsibility for management of data (meta or otherwise) to all vertices in the deployment graph instead of a central repository. NetWorkSpaces in R and maidsafe even discussed computational scalability - while Cray’s new Chapel language and the discussion around Software Transactional Memory focused on scalability across processing cores as well as machines.

GIGA+ Bitmap Example

GIGA+’s approach of maintaining a small bitmap file on each node and passing that around - while anticipating and accepting stale data on a few edge nodes - was brilliant in the patterns it hinted at, including that perhaps being right all the time isn’t as important as being fast. You can be right most of the time and accept the performance hit of not being right some of the time. There are many people who would cringe at this, but at this point we’re going to have to play loose and leave a few balls up in the air as we juggle - doing the math of how often one may fall while keeping the rest going as fast as we can.
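To make the "stale but fast" idea concrete, here is a minimal sketch. It is a toy model of the pattern described above, not the GIGA+ implementation: the split rule, the hash, and the names `Cluster`, `Client` and `locate` are all assumptions made for illustration. A client guesses a partition from its own (possibly stale) bitmap; if it guesses wrong, the server hands back a fresher bitmap and the client retries.

```python
import hashlib

MAX_DEPTH = 8  # assumed upper bound on how many times a partition can split


def key_hash(name):
    """Stable integer hash of a file name."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:4], "big")


def locate(bitmap, name):
    """Pick the deepest partition that this bitmap knows about for the name."""
    h = key_hash(name)
    for depth in range(MAX_DEPTH, -1, -1):
        idx = h % (1 << depth)
        if idx in bitmap:
            return idx
    raise LookupError("partition 0 should always exist")


class Cluster:
    """Stands in for the servers, which always know the real partition set."""

    def __init__(self):
        self.bitmap = {0}      # indexes of partitions that exist
        self.depth = {0: 0}    # how many times each partition has split

    def split(self, idx):
        """Split partition idx in two; the new half gets index idx + 2**depth."""
        d = self.depth[idx]
        child = idx + (1 << d)
        self.depth[idx] = self.depth[child] = d + 1
        self.bitmap.add(child)
        return child

    def handle(self, idx, name):
        """Accept the request, or reject it and return a fresh bitmap."""
        if locate(self.bitmap, name) == idx:
            return True, None
        return False, set(self.bitmap)


class Client:
    """Keeps its own bitmap copy and only refreshes it after a wrong guess."""

    def __init__(self):
        self.bitmap = {0}
        self.misses = 0

    def lookup(self, cluster, name):
        while True:
            idx = locate(self.bitmap, name)
            ok, fresh = cluster.handle(idx, name)
            if ok:
                return idx
            self.misses += 1
            self.bitmap |= fresh  # merge what the server told us and retry
```

Nobody broadcasts the split to clients. The cost of being wrong is one extra round trip, paid only by the clients that were actually stale.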

Pay No Attention To The Man Behind The Curtain

Yet if I had to sum up the content of the conference I would say it was big on strategy and architecture but short on implementation. There were a lot of things hinted at “behind the curtain”, but nothing assured hand raising from the compsci geeks in the room more than hand waving when a speaker got to the distributed piece of the solution. For instance, one of the big benefits of Chapel - the MPI successor that Bradford Chamberlain of Cray presented - was that you could have distributed arrays and graphs that would be automatically sliced up to be distributed to parallel cores or even other “locales” if desired. How the language determines where to split these large arrays and graphs and farm them out was not discussed. One of the more interesting slides showed dashed lines drawn across various nodes and vertices of a graph, symbolizing how it would be chopped and distributed. Someone in the audience raised their hand at this - but he moved on and the hand went back down. To be fair, Chapel was called a “multi-resolution” language where one could start fairly abstract and then add more detail and control to get the best desired result - something I assume you have to do to get good or intelligent chopping and distribution of the data. Given that one of his slides was a comparison of code lines between Fortran using MPI and Chapel, seeing a working code snippet of Chapel would have been helpful. It may turn out to be the same amount of work after you get past the “global view”.
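Since the talk never showed where the cuts fall, here is a minimal sketch of one common way to do it: a plain block distribution, where an array is cut into contiguous, near-equal slices, one per locale. This is written in Python, not Chapel, and it is only an assumption about the kind of scheme involved; `block_bounds` and `distributed_sum` are hypothetical names.

```python
def block_bounds(n, locales):
    """Cut indexes 0..n-1 into contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, locales)
    bounds, start = [], 0
    for i in range(locales):
        size = base + (1 if i < extra else 0)  # the first `extra` blocks get one more
        bounds.append((start, start + size))
        start += size
    return bounds


def distributed_sum(values, locales):
    """Each 'locale' reduces its own block; the partial results are then combined."""
    partials = [sum(values[lo:hi]) for lo, hi in block_bounds(len(values), locales)]
    return sum(partials)
```

The "global view" is the one-line call to `distributed_sum`. The detail a programmer would eventually want control over is everything inside `block_bounds`.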

This was the trend though, as all of the presentations had a bit of hand waving regarding performance metrics and distribution of computation. This was highlighted by the talk of Vijay Menon of Google - whose work at Intel I was familiar with - discussing Software Transactional Memory. He illustrated the challenges of implementing this in an imperative language (I’m suspicious you can even do STM well in an imperative language with state - as I discussed before) but beyond suggesting the keyword “atomic” to replace “synchronized” in the Java language there was very little real content discussed for those already familiar with the issue of locks and multiprocessors. Concurrent Haskell wasn’t even mentioned. A better introduction and discussion is to be had by watching the O’Reilly OSCON video from Simon Peyton Jones (the writer of GHC and now at Microsoft Research) on the subject. After that, if you’re still hungry, the collection of papers on his Microsoft Research site is a delight.

Of course the point of these conferences is the discussions that occur during the breaks and in the networking event afterwards - something that I treasure having newly moved to the Seattle area from Cincinnati. Instead of just observing and blogging from afar - I get to be at the same table as Vijay Menon, Thorsten Schuett, Swapnil Patil, Paul Watson and others.

Summary of the Architectural Patterns I Saw

If I had to summarize what I took away from the conference from a high-level architectural stand-point, here they are:

  • Every node must be aware of the state of every other node without a centralized controller.
  • To do this, a mechanism should be in place to share state quickly but peer-to-peer.
  • It’s ok to let some nodes go stale.
  • Client/Server is now one thing. Pub/Sub with computation. Every node on the graph should do work.
  • As much as possible, each node should maintain its own security and state. You should be able to have anonymous resources appear in your data center and be put to use without much configuration.
  • As much as possible, abstract the distribution of processing away from programmers.
  • Key/value with hashes is best for scalability and distribution (it seems to have won out in all the solutions presented here). Blame MapReduce.
  • Ants can be used to demonstrate anything.
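The key/value-with-hashes point can be sketched with a small consistent-hash ring. This is an illustration under assumptions, not something shown at the conference: `HashRing`, the node names and the number of ring points per node are all made up. Any node can compute the owner of a key on its own, with no central directory.

```python
import bisect
import hashlib


def ring_hash(text):
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), points_per_node=64):
        self.points_per_node = points_per_node
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place several points for the node around the ring."""
        for i in range(self.points_per_node):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        """The owner is the first ring point after the key, wrapping around."""
        i = bisect.bisect(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

When an anonymous new node shows up and is added to the ring, the only keys that move are the ones it takes over. Everything else stays where it was, which is what lets resources appear in a data center without much configuration.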

I hope everyone had as good a time as I did.

ACM Article: Restful web services vs. big web services: making the right architectural decision

Tuesday, June 3rd, 2008

Great article on ACM regarding when to use REST vs. WS-* standards that are in wide use in SOA architectures today. Very interesting reading for those who may want to take the light-weight approach vs. using the webservice composition and discovery tools that enterprises may find in the TIBCO and IBM SOA stack.

ABSTRACT

Recent technology trends in the Web Services (WS) domain indicate that a solution eliminating the presumed complexity of the WS-* standards may be in sight: advocates of REpresentational State Transfer (REST) have come to believe that their ideas explaining why the World Wide Web works are just as applicable to solve enterprise application integration problems and to simplify the plumbing required to build service-oriented architectures. In this paper we objectify the WS-* vs. REST debate by giving a quantitative technical comparison based on architectural principles and decisions. We show that the two approaches differ in the number of architectural decisions that must be made and in the number of available alternatives. This discrepancy between freedom-from-choice and freedom-of-choice explains the complexity difference perceived. However, we also show that there are significant differences in the consequences of certain decisions in terms of resulting development and maintenance costs. Our comparison helps technical decision makers to assess the two integration styles and technologies more objectively and select the one that best fits their needs: REST is well suited for basic, ad hoc integration scenarios, WS-* is more flexible and addresses advanced quality of service requirements commonly occurring in enterprise computing.

Service Data Objects Architecture: Business Objects with Smarts Presentation

Thursday, April 17th, 2008

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

You can download the slideshow here.

The Rise Of Functional Programming: F#/Scala/Haskell and the failing of Lisp

Sunday, January 13th, 2008

Over at Lambda The Ultimate, the best academic programming blog on earth, there is a large debate going on regarding what the future of languages will be for 2008. The most important thing to emerge from the discussion is the larger role functional programming will play. It seems like a safe bet. This year has seen the explosion of interest and creation of functional languages such as Apple OS X’s Nu, Java’s JVM using Scala and Microsoft Research’s .Net language F#.

I am ecstatic at this change.

The Failure Of Lisp

It’s hard to understand where it came from. Certainly one can argue the broader academic community had nothing to do with it; the old guard Common Lisp hackers are still as fickle and as judgmental to newcomers as ever. Also, the old standards in Lisp languages, Franz and LispWorks, have not lowered their prices to anything approachable to the casual developer. There are open source ANSI Lisp implementations without all the supporting engines and functionality, such as SBCL. In fact, the most linked thing I’ve ever written in my career is the installation walk-through I did for installing SBCL and Allegro, which includes adding your repository and packages for CLOS and automatically compiling the FASL files, especially dealing with the asdf differences between the implementations. The complexity of this in itself points to problems with portability and configuration in Lisp. However, even that project that targeted Lisp’s Bread and Butter, the parsing of semantic ontologies for the Semantic Web, was met in the message boards with worries about whether there would be enough developer participation using such an odd language, and recommendations on moving it to Java.

In reality, Common Lisp showed its failure as a community by sitting out this enthusiasm that has been generated around functional programming languages. It didn’t have to be that way. I recall my first awareness of functional programming’s growth came from the awesome work of Lemonodor’s blog and Sriram Krishnan posting “Lisp Is Sin”. I was happy at the time that Lisp was getting such attention, as well as functional language architectures in general. I imagined that as OO languages had grown so verbose and feature dense that even the IDEs to develop your applications run into the tens of gigabytes, a new evolution “Back To The Future” was inevitable. Even more, I believe long suffering Lisp deserves to be back in favor again; it’s certainly spent its time in purgatory. Yet, it didn’t happen. You can blame the old 50 year old men sitting on IRC channels for that. It was the most thorny and un-inspiring community I’ve ever participated in, despite my extreme interest in the language. It’s jaw dropping that a language with such promise has sat out the resurgence, and it speaks to what an un-friendly and un-inviting community can do to a technology platform. I would be the first to march it off to the grave.

The Rise Of Functional Languages

The interest in functional programming actually grew up around more academic but pure languages like Scheme and Haskell. Although these languages sit within their own island and lack many of the “dirty” aspects of Lisp’s CLOS environment that make it easy to access OS and hardware resources, they are still strikingly useful in learning things that are the staple of functional languages, such as Closures and Lambdas. Indeed, one could argue that the movement to move Closures in to OO languages (first C#, now Java) was in part due to the rise of awareness of functional languages.

Further, it seems to me that functional programming languages answered two prayers of those more ambitious engineers who don’t seem to want to stick with the script and Java worlds they were taught in college. Those two large wins, far more important than the semantic features of functional languages that have gotten all the attention, are architecture foundations of functional languages:

  • Referential Transparency / Side Effects
  • Concurrency

Referential Transparency

To those coming from a pure OO world, Referential Transparency and the restriction of side-effects can be something hard to get their heads around. The best way I describe this concept is by hitting at the root of their assumptions: Everything they deal with is dead. The objects are dead, the variables are dead, the entire atmosphere is dead, as if something had come along and killed everything in your stack and you have to assemble your program by only what’s been given to you, nothing more. There are no instances, objects do not “come alive” and have state; a state that you have to poke into and a state that can change at any time. A function will always do what you expect, and nothing can come along and change that behavior.
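A minimal sketch of the difference, with made-up names (`Counter` and `add` are only for illustration): the "living" object gives a different answer to the same call, while the "dead" function cannot.

```python
class Counter:
    """A living object: the answer depends on hidden state that changes over time."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n
        return self.total


def add(total, n):
    """A dead function: the answer depends only on what was passed in."""
    return total + n


counter = Counter()
first = counter.add(5)
second = counter.add(5)    # same call as above, different answer

pure_first = add(0, 5)
pure_second = add(0, 5)    # same call as above, same answer, every time
```

Because `add(0, 5)` always means the same thing, any occurrence of it can be replaced by its result without changing the program. That substitution property is what referential transparency refers to.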

One of the things that seems to appeal to developers most about the promise of SOA architectures happening in enterprise environments, if you’re smart enough to pry it out of them, is that they get the same referential transparency in services. No one can override a service (besides versioning, which is explicit to the developer) and a service will only return what it did earlier in your code and earlier in the year. This forces developers to design services that have the same relationship to the world as functional programmers write their functions for. This is perhaps the trickiest part of migrating enterprise teams to a services based model, their expectations of the mutableness of the services they are accessing and their inability to anticipate what working in that world will be like. Especially for those who use tools or libraries to convert service interaction in to an object, the interaction can be jarring.

However, they soon find the predictability and the safety of such an environment liberating. OO programmers were used to making their objects or variables immutable to maintain their contracts and relationships with other objects, often sacrificing many of the benefits that OO programming promised their stack. Now they have immutability and transparency in an environment where functional paradigms are key, and they do not expect to be able to “embrace and extend” services. They are what they are. This tends to cascade out to the living instantiated code a developer writes as well, as there is no point in entering the world of the living if what you have to return to is a dead function.

This was hinted at in an article in the ACM Queue magazine by Terry Coatta, entitled “From Here to There, The SOA Way“. He states,

Objects are still a very good way to model systems and they function reasonably efficiently in the local context. But they don’t distribute well, particularly if one tries to use them in a naive way. A service-oriented architecture solves this problem by dealing with the latency issues up front. It does this by looking at the patterns of data access in a system and designing the service-layer interfaces to aggregate data in such a way as to optimize bandwidth, usage, and latency.

Not that SOA limitations are the only thing that is affecting the consciousness of a software engineer, the other issue is the large rise in the complexity of managing a large enterprise library written in an OO language. One of the largest pain points of any application of large size is the management of graphs and graphs of live objects and the living data within them. When software engineers experience the lack of side-effects in functional languages, it’s a breath of fresh air.

Concurrency

A funny thing happened on the way to those multi-core processors. People loaded their applications on them and noticed nothing got much faster, particularly when it came to transaction intensive tasks. Turns out Intel and AMD left out an important fact about their Moore’s Law cheating multi-core environment: you can’t wring as much performance out of it without changing the way you manage concurrency and threads. Sequential programming could always rely on going faster as the single processor speed got faster, but as multicores come into play that isn’t always the case. You want to farm off transactions to occur on separate processors, and in the living world of mutable objects and variables, breaking out two transactions to work concurrently that operate on the same living data is a bad idea. Add structural programming’s solution to this problem, optimistic and pessimistic locking, and you have dead-locks in short order.

Functional programming has been a natural place to explore parallel processing and new ways of doing atomic transactions because of the reasons above. More important, these atomic structures can be composable, which is lost when doing locks in structural programming. A lot of the buzz has been generated around the idea of software transactional memory, where execution blocks can be flagged and managed and built upon. The best introduction to this topic is the paper by Tim Harris entitled Concurrent Programming Without Locks. Although this used to be expressed only in the confines of Concurrent Haskell, others have shown how the same techniques can be used in other functional languages, such as F# using nothing more than PowerList.
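To give a feel for what "flagged and managed and built upon" means, here is a toy sketch of the idea in Python. It is not how Concurrent Haskell or any production STM is implemented: `TVar` and `atomically` are named after their Haskell counterparts, but this version is a simplified optimistic scheme with a single commit lock, assumed purely for illustration. A transaction records the versions it read, buffers its writes, and only commits if nothing it read has changed; otherwise it simply runs again.

```python
import threading

_commit_lock = threading.Lock()  # only held for the short commit step


class TVar:
    """A transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.reads = {}    # TVar -> version seen at first read
        self.writes = {}   # TVar -> value to install on commit

    def read(self, tvar):
        if tvar in self.writes:
            return self.writes[tvar]
        self.reads.setdefault(tvar, tvar.version)  # note the version before the value
        return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # If anything we read has moved on, our work is based on stale data.
            if any(tv.version != seen for tv, seen in self.reads.items()):
                return False
            for tv, value in self.writes.items():
                tv.value = value
                tv.version += 1
            return True


def atomically(block):
    """Run the block as a transaction, retrying until it commits cleanly."""
    while True:
        tx = Transaction()
        result = block(tx)
        if tx.commit():
            return result


def transfer(tx, src, dst, amount):
    """A small transactional step that can be combined with others."""
    tx.write(src, tx.read(src) - amount)
    tx.write(dst, tx.read(dst) + amount)
```

The composability shows up in how `transfer` is used: two transfers placed inside one `atomically` block commit together or not at all, with no lock ordering to get wrong. A real implementation also has to stop a doomed transaction from acting on an inconsistent snapshot before it retries, which this sketch does not attempt.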

This experimentation is one of the large reasons why functional languages have become more important as software engineers wrestle with the problems and promise of multi-core processors in transaction processing. Although not every engineer will be interested in the deeper details of STM or other strategies in concurrent programming, the fact that these libraries will emerge and only be available in the functional realm will force software engineers to learn the core concepts and bring even more visibility to the functional programming space.

Functional Hybrids: Functional Programming Is Now Approachable

The other driver for adoption of functional programming languages, besides the architectural benefits it has to solve current problems, is the fact that languages such as F# and Scala have adopted a more hybrid model in their language design, where a developer isn’t forced completely outside her comfort zone. Scala is a combination of functional and deeper OO methodologies (as in SmallTalk) and has access to the entire Java library, significantly reducing the learning curve. The same can be said for F# and .Net and Nu and Objective-C. This does have draw-backs however, as both F# and Scala have not been able to use more of the STM strategies that Concurrent Haskell allows because the underlying thread architecture of the VMs they run against are built for structural programming languages. It is easy to see how this can be fixed, however, and allow those using hybrid functional languages the same power as those who express their ideas in Haskell or even Lisp.

As I said, I am excited about this new resurgence in functional programming languages, and I am enthusiastic that 2008 will have even more to offer those who are just getting their toes wet. I personally know some college freshmen who started out using Nu as their first language, and are already contributing to the community. The future of software engineering is bright.

Apache Is SOA Ready

Monday, September 10th, 2007

You have to be impressed with Apache’s strategy and execution. Although many enterprise companies have stumbled or confused customers with their numerous SOA offerings in the marketplace, Apache has, behind the scenes, been dutifully executing on the core standards (WS-BPEL 2.0, SDO specification 2.1, SCA specification 1.0) that make up a SOA and plugging them neatly into an overall stack. All of this has been accomplished while also removing any hype or marketing and focusing squarely on the technology and its usefulness. It’s hard to imagine SOA would gain any great credibility beyond vendor brochures if it wasn’t for Apache’s volunteers showing that.. yes.. there is real technology underneath the rhetoric.

This stack can be viewed in Apache Tuscany, which brings the SDO and SCA specification to us in vendor neutral Java or C++, and Apache ODE (Orchestration Director Engine), which uses WS-BPEL to execute business processes in an SOA similar to IBM’s Websphere Process Server, which I’ve written about here and here. I’ve done work with Tuscany, including implementing it and managing its deployment. Aside from some problems with Websphere (Tuscany uses the EMF libraries from IBM’s Eclipse and Websphere products, so some library version clashes occur in deployment) I can say that it works as advertised and can handle a decent load.

Couple this with Eclipse’s community contribution on the visual side, with a graphical BPEL editor and the SOA Tools Project, and you have a stack that rivals millions of dollars in IBM license fees for Websphere Process Server and Websphere Integration Developer IDE. Of course, all of these tools don’t come with support and are sure to need some hacking. Regardless, the future is bright for any size company wanting to leverage the best enterprise technology in an open and free development environment.

My IBM Rational 2007 Presentation on Websphere Process Server

Thursday, July 12th, 2007

Below is an interactive slideshow of my talk given at the IBM Rational conference in 2007 entitled “From Legacy to Service-Oriented Architecture: The Strategic Importance of Services in the Insurance Industry”.

Synopsis: The insurance industry is one of the most complicated industries to manage from an IT perspective. Its complex and highly regulated business rules, as well as its early adoption of mainframes in the 80s and 90s, has led to a significant hurdle in moving existing infrastructures to a Service-Oriented Architecture (SOA). This session shows how the use of IBM Rational tools and IBM(R) Websphere(R) Process Server can free the industry from the complexities of implementing state specific compliance and business workflows through modeling and mediation flows.

This is a presentation I created to describe how SDOs can be used in the Insurance enterprise space to provide sanity in the large and diverse messages. These are increasingly being passed around as Business Objects in a Domain architecture as companies move their old object patterns to a service based approach (I refer to it as servitized business objects).

If you are looking for my particular experience on how SDOs and the IBM EMF framework that contains them works against the large ACORD schema, you can find my critique of Websphere Process Server and ACORD here and the SDO design pattern plugin I wrote for Rational Software Architect here.

Click the image to view this presentation interactively.

You may get the full version in Quicktime Movie format or Microsoft Powerpoint format.

IBM, SDOs, WPS and SOA Hell

Sunday, June 10th, 2007

I get a lot of emails about my critique of IBM’s Websphere Process Server, mostly along the lines of this email:

I’ve sent this email to you a couple of months back but didn’t get any reply. just thought would try my luck again…

we’ve made some progress with WPS and identified what areas are less risky so we can use at least some of the product features. we’re now building a prototype that uses mostly human tasks and some processes wrapped around them and we expose this all via webservices. there’s some pain around versioning, BPEDB queries, security, etc but at least we’re making a progress and early indicators seem to be on a positive side. we have had a couple of IBMers on site for a week and they helped us somewhat to rejig the design and focus on the features that seem to work and would add most value

still would be extremely interested to find out if you have done more with the product since you blogged about it

Well, obviously for competitive and disclosure reasons I can’t spill the beans on where my current employer is, what they are doing or why they are doing it. With those hand-cuffs in full public display, I will attempt a more general entry about the state of affairs of this odd stack IBM has built.

In order to understand IBM’s approach to Websphere Process Server, you have to understand SCA/SDO, the unholy combination that seems to be driving the SOA mythology as of late.

About SDOs

There are currently three different implementations of the SDO spec, including Apache’s Tuscany (the one I’m cheering for), IBM’s from their EMF project for Eclipse and the interesting EclipseLink from Oracle and Interface21 (when those two get in a room together you should listen). EclipseLink is more of a consequence of combining SDOs and JPA, however.

I’m extremely excited about SDOs in general, to the point that sometimes I sit on my deck and dream about them. I even write presentations about their architectural implications. SDOs are awesome things.

They are, in essence, disconnected DataGraphs of DataObjects from some source. This source could be relational databases, entity EJB components, JCA, XML pages, Web services, or a combination of them all.

They can be acted upon and changed, updated and deleted, transformed, even serialized and sent to other objects and then connected back to the original data source with a ChangeSummary() to boot. For a more detailed understanding, check out IBM’s Introduction To Service Data Objects.

This is what they do in a nutshell:

1) You have data in any form from anything: Databases, XML, text, anything (or a combination of them).
2) You feed that data in to a Service Data Object.
3) Things can get pretty loose in there, so you can impose an XSD to define the types or just create the model on your own as you feed data in.
4) You disconnect the SDO from its source.

SIDEBAR: Last year I wrote a design pattern for SDOs in Eclipse’s UML2 framework that you can import in to Rational Software Architect.

Now you can do anything with this thing. Change the data inside, add more things to it, remove things from it, serialize it and send it across the wire, query the data inside of it, introspect it, combine it with JPA and tell it to save itself away from its source, anything.
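The lifecycle above can be sketched in a few lines. This is a toy in Python, not the SDO API (the real one is the Java `commonj.sdo` interfaces): the `DataObject` class, its slash-separated paths and its `change_summary` method are stand-ins assumed for illustration only.

```python
import copy
import json


class DataObject:
    """A disconnected copy of some source data that remembers what was changed."""

    def __init__(self, source):
        self._data = copy.deepcopy(source)  # disconnect: the source is never touched
        self._changes = []                  # (path, old value, new value)

    def get(self, path, default=None):
        """Walk a slash-separated path such as 'policy/holder/name'."""
        node = self._data
        for part in path.split("/"):
            if not isinstance(node, dict) or part not in node:
                return default
            node = node[part]
        return node

    def set(self, path, value):
        """Set a value, creating intermediate levels, and log the change."""
        *parents, leaf = path.split("/")
        node = self._data
        for part in parents:
            node = node.setdefault(part, {})
        self._changes.append((path, node.get(leaf), value))
        node[leaf] = value

    def change_summary(self):
        """Everything that happened since the object was disconnected."""
        return list(self._changes)

    def serialize(self):
        """Turn the graph into text so it can be sent across the wire."""
        return json.dumps(self._data, sort_keys=True)
```

Whoever eventually reconnects the object to its source only needs the change summary to know what to write back, which is why the object can wander through any number of hands in between.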

Imagine the implications.

Design Strategy: Using SDOs with POJOs

No longer do objects have to have a strong contract, or any contract at all besides taking a type of DataObject. Your Business Objects can introspect the object itself to see if there is data inside of the DataObject it requires to do a task, do work on your behalf, update the DataObject’s DataGraph with new information (or replace existing information), and pass that SDO right back. The calling object can ask for a ChangeSummary() to see what occurred to the data in that object, act on that data, and send it right along again to another object. When you are finished passing it around, it can immediately turn back into the XML or database row it was loaded from.

Your Business Objects can change, the data inside your DataObject can change, or it can be a totally different DataObject altogether. As long as the Business Object can query to see if the data it cares about is in there, nothing about the contracts between your objects need change. Further, because of the API that comes with the SDO spec, your object doesn’t really need to know anything about the DataObject it’s feeding data into or getting data from. It can work with the part of the DataGraph it needs in a hands-off way and be done with it.
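To make the "no contract besides a DataObject" idea concrete, here is a minimal sketch in plain Java. It does not use the real SDO API: SketchDataObject, DiscountCalculator and the property names are hypothetical stand-ins I made up for illustration, and the change log only loosely plays the role of ChangeSummary().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an SDO DataObject: untyped properties plus a change log.
final class SketchDataObject {
    private final Map<String, Object> properties = new LinkedHashMap<>();
    private final List<String> changeLog = new ArrayList<>();

    boolean isSet(String name) {
        return properties.containsKey(name);
    }

    Object get(String name) {
        return properties.get(name);
    }

    void set(String name, Object value) {
        boolean existed = properties.containsKey(name);
        properties.put(name, value);
        changeLog.add((existed ? "replaced " : "added ") + name);
    }

    // Loosely plays the role the text gives to ChangeSummary().
    List<String> changes() {
        return new ArrayList<>(changeLog);
    }

    // Forget earlier changes, e.g. right after the initial load.
    void clearChanges() {
        changeLog.clear();
    }
}

// A business object whose only contract is "hand me a data object".
final class DiscountCalculator {
    static boolean process(SketchDataObject data) {
        if (!data.isSet("orderTotal")) {
            return false; // nothing in here for us, so leave the object untouched
        }
        double total = (Double) data.get("orderTotal");
        data.set("discount", total >= 100.0 ? 10.0 : 0.0);
        return true;
    }
}
```

The caller loads whatever it likes into the object (including data the calculator has never heard of), calls clearChanges(), hands it to DiscountCalculator.process(), and then reads changes() to see what was done to it. Neither side needs a shared typed interface.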

Think of it from an automobile insurance Use Case:

[Figure: SDO walkthrough diagram]

Say that you had a document based webservice that took an automotive insurance policy. This automotive policy would have an XSD that ensured it contained the data necessary for a complete policy. That XML document, once received, would be loaded in to an SDO with the XSD which would “strongly type” the data inside the DataObject that was generated.

Now, imagine this policy was sent to the webservice for a claim to be issued. You could pass this new SDO in to a “claims” system to be processed. The contract for the Claims system is just an SDO (type DataObject). The Claims system would do a simple XPath query, or do a get() on some data that it expected any DataObject that would come in to the system to have (remember, the data inside does have queryable types because it was loaded with an XSD). Technically, it could even introspect the SDO DataGraph to find the data it needed. Whatever.

It would probably want to get Customer Information, Units Insured, etc. It would then do its own thing, looking up the coverages and the limits and going about the process of starting a claim.

After processing, it could then append its update to the SDO by inserting data back in to the DataObject. The DataObject would then be sent back to the webservice, which would change the SDO back in to the XML document (again, the XSD ensures compliance) and send it back to the ESB or some other thing, but now with information on the claim that was created for it.

Now, remember that the Claims system doesn’t really know or care that the SDO is a policy. In fact, the SDO really has no type at all. You could have policy information, a recipe for Apple Pie and an iTunes playlist in the SDO. As long as the Claims system can get the right data from the SDO, it’ll work.

You could just as easily send the Claims system an SDO that only has the data necessary for issuing a claim. You could also change the Claim system so that it requires more information from the SDO, or writes additional information to the SDO (say requirements change).

None of this need impact the contract or the SDO that gets sent to the system.
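The walkthrough above can be roughed out with nothing but the XML APIs that ship with the JDK. To be clear, this uses a DOM Document as a stand-in for the SDO DataObject, not the SDO API itself, and the element names (customer/name, unit/vin, claim) and the ClaimsSketch class are assumptions of mine rather than anything from a real policy schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class ClaimsSketch {

    // Parse the incoming document into an in-memory graph, detached from its source.
    static Document load(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // The "Claims system": its only contract is that the data it queries for is present.
    static void openClaim(Document data, String claimId) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String customer = xpath.evaluate("//customer/name", data);
        String vin = xpath.evaluate("//unit/vin", data);
        if (customer.isEmpty() || vin.isEmpty()) {
            throw new IllegalArgumentException("customer name and insured unit are required");
        }
        // Append the result of the processing to the same graph.
        Element claim = data.createElement("claim");
        claim.setAttribute("id", claimId);
        claim.setAttribute("customer", customer);
        claim.setAttribute("vin", vin);
        data.getDocumentElement().appendChild(claim);
    }

    // Turn the graph back into the XML that goes out on the wire.
    static String toXml(Document data) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(data), new StreamResult(out));
        return out.toString();
    }
}
```

Feed it a policy document that also happens to carry a recipe element and openClaim() neither notices nor cares; the recipe comes out the other side untouched, next to the new claim. Feed it a document without a customer name or an insured unit and it refuses, which is the only "contract" in play.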

So essentially, this is what SDOs used with POJOs alone give us:

  • Objects are loosely coupled
  • Object’s data can be discovered at runtime through query
  • Object’s contracts are data based, no types or tightly binding interfaces
  • Object is protected from contract and interface changes in system
  • Objects can be placed anywhere - maximum reuse
  • Object can modify its types and data structure during runtime
  • Object has no restriction on the data it can share
  • Object is decoupled from the format and source the data came from
  • Object automatically can roll back to the previous data state or act on changes

This is just the impact SDOs have on sharing data between objects. SDOs have much more up their sleeves, including transformations and mapping.

It’s easy to see how this would fit in nicely to the theory of an SOA architecture. If you could just push around self-contained datagraphs of dataobjects, disconnected from their source, and transform them, map them together or change them in to other types of data, a lot of the complexity of dealing with data in an SOA would be solved… or at least abstracted.

SDOs and Websphere Process Server

It’s helpful to keep in mind that every workflow engine (WPS, JBoss jBPM, BlueSpring’s BPM) moves the business logic and process flow from inside the code to outside, and that can only mean better flexibility for organizations going forward, even if current architecture design patterns regarding Business Objects have to change and give up their power or expose it more freely for orchestration (Fowler be damned).

The most intriguing thing about WPS, or any modern workflow engine, is the idea of visually managing the flow and business logic of services in an SOA. WPS promises to separate this orchestration from any code (save the code that it generates) and mix in human tasks to boot.

Workflow tools can be great when you have an Enterprise Service Bus or other solution where you have data flows coming at you from different technologies and different protocols that you want to feed in to a workflow. When you pull a webservice or other Type that WPS supports in to its graphical tool, Websphere Integration Developer, it essentially reverses these data streams or objects in to strong types that are represented visually so that you have a common base on which you can orchestrate, transform, and map the data between the various other imported flows. It’s easy to see how it leverages SDOs to do this, as it’s essentially what SDOs are best at. Mapping between these things and orchestrating them is easier if you have an abstract representation of the service or object.

The idea of building an entire enterprise from the ground up with this product is intriguing. However, I doubt that any project already in motion could manage to migrate from business logic in code to business logic in a product without significant rework, and the consultants specializing in WPS (and they are starting to emerge) seem to only have experience starting from scratch and in projects with human interaction. Regardless, this orchestration is most likely where WPS has the most benefit. Its idea of transforming and consuming the documents in the mediation itself is where things break down.

In Websphere Process Server, IBM essentially decided to use SDOs to parse and transform any data that comes in to their system. They coupled this with their previous Websphere Business Integrator product (WBI) for business orchestration of the processes that use these SDOs, stapled on an ESB implementation, stuck a feather in its cap and called it Macaroni.

Being a webservice deserializer, data mapper and transformer, workflow tool and process manager all at once is what ultimately makes Websphere Process Server the strange beast it is, and is also a major source of its problems.

In the work I’ve done with WPS from a purely transformation and webservice flow standpoint, I’ve found it to be slow, but it works. There have been significant problems with the limited flexibility WPS has in processing XML, managing messages and executing processes. IBM pitches that with WPS/WID, business analysts or architects would be the ones doing the mapping and process orchestration using visual tools. Often, however, we have been forced to drop below the level of abstraction that WPS provides, down to the Java code itself, which is a sign of, if not failure, then immaturity.

Also, Websphere Process Server’s IDE where this mapping takes place visually, Websphere Integration Developer, is (since it is built on top of the Rational/Eclipse platform) hard for non-programmers to grasp; the idea of having to switch perspectives in Eclipse is foreign and unhelpful.

Ye Ol’ Impedance Mismatch

Unfortunately, even when you do go down to the Java level, things aren’t easy.

There are many times that what a WSDL specifies (compliant XML mind you) as a certain “type” and what “type” WPS renders for it are totally incompatible. For xsd:anyType, it dies flat out. However, so does any WSDL2Java tool. To combat this, code that was using the Collections framework has to be migrated to fixed length arrays so that WPS can interpret the output, as it sees any List as “AnyType”.

Certainly, this type mismatch is experienced whenever one jumps from objects to documents in representing data (JSON anyone?), but that just points to the enormity of the problem WPS is trying to solve, and how trying to reverse XSDs without developer intervention is painful and filled with problems.

My Update

So, my update is that it is still not delivering on its promises yet. However, I have heard from IBM that version 7, based off the Rational 7 (Eclipse 3.2) platform, should be out in August / September.

I hope the reply was worth the wait.

Eclipse Modeling Framework Examination and that mean Websphere Error.

Thursday, April 19th, 2007

The Eclipse Modeling Framework (EMF) is one of the least understood and most powerful frameworks ever handed down to us mortals by IBM. Often confused with providing UML modeling capability in Eclipse and the Rational toolset (that’s actually the UML2 Framework), EMF concerns models in the meta-data sense, and is, in essence, an abstraction engine for data and code. The fact that Service Data Objects (SDOs), one of the two frameworks of the holy SOA stack (SDO/SCA), is built on top of EMF points to its power.

Sadly, most of us often encounter EMF only as some tool failure where we get an .Ecore error or some other nasty issue, and we usually reverse tracks instead of digging in to the mysterious innards of Eclipse. It was in this scenario I discovered a problem with EMF and Websphere, and I’m certain from the message board posts it’s gonna get a whole lot worse before it gets better. Luckily, there is a simple way around it.

The EMF Error

If you do any deployment on Websphere 5.x and higher, particularly it seems now that version 7 of the IBM Rational Platform is out, and use EMF either in SDOs or other parsing technologies, you’re going to run in to this error:


java.lang.IllegalArgumentException: resolve against non-hierarchical or relative base
at org.eclipse.emf.common.util.URI.resolve(URI.java:1853)

This is all the information you get, and it can be maddening. I was forced to face this error recently when trying to load an XSD file in to an SDO from a relative URL. You can duplicate this problem by running the code below:

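import java.io.FileInputStream;

import commonj.sdo.DataObject;
import commonj.sdo.helper.XMLDocument;
import commonj.sdo.helper.XMLHelper;
import commonj.sdo.helper.XSDHelper;

public class Test1 {

    private static final String PO_MODEL = "po.xsd";
    private static final String PO_XML = "po.xml";

    // Registers the purchase order types from the XSD with the SDO runtime.
    private static void definePOTypes() throws Exception {
        FileInputStream fis = new FileInputStream(PO_MODEL);
        XSDHelper.INSTANCE.define(fis, null); // In Websphere container this will fail
        fis.close();
    }

    public static void main(String[] args) throws Exception {
        definePOTypes();

        // Load the XML instance document and grab its root DataObject.
        FileInputStream fis = new FileInputStream(PO_XML);
        XMLDocument xmlDoc = XMLHelper.INSTANCE.load(fis);
        DataObject purchaseOrder = xmlDoc.getRootObject();
        fis.close();
    }
}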

Of particular interest is this line here:


XSDHelper.INSTANCE.define(fis, null); // In Websphere container this will fail

Explanation Of Error And How EMF Determines Runtime Hierarchies

This XSDHelper simply takes an XSD (either String or stream) and prepares the SDO. In essence, the XSD gets parsed so that the SDO can apply the XML or any other data it gets to the XSD for Type validation. This XSDHelper is just a wrapper around EMF, which takes the XSD and begins to fit it in to a meta-model that Java can deal with. What is interesting is what EMF does under the covers, and for this we need to know about the URI class in EMF, and in particular the resolve method.

Recall the class we got the error from:


at org.eclipse.emf.common.util.URI.resolve(URI.java:1853)

The URI class in EMF is simply a representation of a Uniform Resource Identifier (URI), as specified by RFC 2396, with certain enhancements. Like String, URI is an immutable class; a URI instance offers several by-value methods that return a new URI object based on its current state. Most useful, a relative URI can be resolved (that’s the resolve() method in the error we got) against a base absolute URI — the latter typically identifies the document in which the former appears.

It’s finding this absolute base URI that EMF is having the problem with when it says “resolve against non-hierarchical or relative base”. Why does it always want to resolve a base absolute URI? Think of the situation poor EMF is in when it gets a relative URI. What if your XSD refers to other XSDs (po-folk.xsd)? What if it specifies other relative links in relation to its own place in a hierarchy (../po-folk.xsd)? EMF can’t deal with that unless it knows exactly where it is in your filesystem or URI landscape.
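If relative resolution feels abstract, the JDK's own java.net.URI class shows the same idea. This is only an analogy: java.net.URI is not EMF's URI class and it behaves differently at the edges (handed a base it cannot resolve against, it simply gives the relative reference back instead of throwing). ResolveSketch is a hypothetical helper name.

```java
import java.net.URI;

final class ResolveSketch {

    // Resolve a relative reference against a base, the way a schema import would be located.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // java.net.URI calls a URI "opaque" when it has a scheme but no hierarchical path.
    static boolean isHierarchical(String uri) {
        return !URI.create(uri).isOpaque();
    }

    public static void main(String[] args) {
        String base = "file:/C:/workspace/test/po.xsd";
        System.out.println(resolve(base, "po-folk.xsd"));    // file:/C:/workspace/test/po-folk.xsd
        System.out.println(resolve(base, "../po-folk.xsd")); // file:/C:/workspace/po-folk.xsd
        System.out.println(URI.create("po.xsd").isAbsolute()); // false: nothing to anchor to
    }
}
```

With an absolute, hierarchical base both references land somewhere definite. With only "po.xsd" to go on there is no location to anchor "../po-folk.xsd" to, which is the corner EMF is backed in to when it throws.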

EMF And Websphere

So, now we know how that works, but what’s wrong with getting the absolute base URI in Websphere?

Well, let’s talk about those “certain enhancements” IBM mentions when talking about the URI class. Even though you probably don’t think about it, almost everything you access in a Java runtime is some kind of archive, no matter if it’s Jar or Zip. Even if you don’t think you’re working inside a JAR, if you’re running an IDE chances are you are. One enhancement in the URI class of EMF provides support for the hierarchical form used for files within archives, such as the JAR scheme. By default, this support is enabled for absolute URIs with scheme equal to “jar”, “zip”, or “archive” (ignoring case), and is implemented by a hierarchical URI, whose authority includes the entire URI of the archive, up to and including the ! character.

If we asked EMF for the absolute base URI of the po.xsd file in a typical Java application, we would see this:


jar:file:/C:/workspace/test/xsd.resources.jar!/org/eclipse/xsd/po.xsd

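This is the full URI. Going by the archive support described above, everything up to and including the ! character (“jar:file:/C:/workspace/test/xsd.resources.jar!”) serves as the authority of the absolute base, and EMF would be confident all things could be found in relation to this root location.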

Now, what happens if we do the same thing in Websphere?


wsjar:file:/C:/workspace/test/xsd.resources.jar!/org/eclipse/xsd/po.xsd

What’s that? A non-standard archive scheme called “wsjar”? It would appear that IBM, never content with being standard, has an archive scheme that is completely different from the one you would see if you accessed a Jar file from outside the container.

What happens when EMF tries to get an absolute base URI but doesn’t recognize the archive scheme of the file it’s looking up?


java.lang.IllegalArgumentException: resolve against non-hierarchical or relative base
at org.eclipse.emf.common.util.URI.resolve(URI.java:1853)
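Here is a rough sketch of why an unrecognized scheme matters. This is not EMF's actual code, just my simplified model of the behavior described above: a URI only gets the archive treatment (authority up to and including the ! character, path after it) if its scheme is on a known list. ArchiveSchemeSketch and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ArchiveSchemeSketch {

    // The defaults the text lists: "jar", "zip", or "archive" (ignoring case).
    static final Set<String> DEFAULT_SCHEMES =
            new HashSet<>(Arrays.asList("jar", "zip", "archive"));

    static String schemeOf(String uri) {
        int colon = uri.indexOf(':');
        return colon < 0 ? "" : uri.substring(0, colon).toLowerCase(Locale.ROOT);
    }

    static boolean isArchive(String uri, Set<String> schemes) {
        return schemes.contains(schemeOf(uri)) && uri.contains("!/");
    }

    // Returns {authority, path inside the archive}; the authority runs up to and including '!'.
    static String[] split(String uri, Set<String> schemes) {
        if (!isArchive(uri, schemes)) {
            throw new IllegalArgumentException("not a recognized archive URI: " + uri);
        }
        int bang = uri.lastIndexOf("!/");
        return new String[] { uri.substring(0, bang + 1), uri.substring(bang + 1) };
    }
}
```

With the default list, the “jar:” URI splits cleanly in to an archive authority and a path, while the “wsjar:” one is rejected outright. Add “wsjar” to the set and the very same URI splits just like its “jar:” cousin.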

How To Fix It

Now we know what we’re dealing with. How do we fix it?

Well, all we need to do is let EMF know there is another archive scheme it needs to keep an eye out for, and you can do that two ways.

At the command line:


-Dorg.eclipse.emf.common.util.URI.archiveSchemes="wsjar wszip jar zip"

Or set it up in your WAS configuration as a JVM property.
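If you can't touch the launch arguments or the server configuration, setting the same property from code may also work. Treat this as a sketch under one big assumption: that EMF reads the property when its URI class is first loaded, so this has to run before anything touches EMF or SDO. ArchiveSchemeConfig is a hypothetical helper, not part of any IBM or Eclipse API.

```java
final class ArchiveSchemeConfig {

    // The same property name used with -D on the command line.
    static final String PROPERTY = "org.eclipse.emf.common.util.URI.archiveSchemes";

    // Call as early as possible, e.g. from a startup listener.
    // Assumption: the value is only picked up if it is set before EMF's URI class loads.
    static void registerWebsphereSchemes() {
        if (System.getProperty(PROPERTY) == null) {
            System.setProperty(PROPERTY, "wsjar wszip jar zip");
        }
    }
}
```

It leaves an existing value alone so that a -D flag or a WAS JVM property still wins.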

I hope you learned something interesting about absolute base URIs and stumbled upon this article before you shed too many tears. I don’t know why more people are experiencing this problem with WAS and Rational Application Developer / Rational Software Architect 7, but hopefully IBM will see fit to provide an EMF in Eclipse that has these new archive schemes built in. Lots of headaches could be solved.