Brandon Werner

FYI: The New Communications of the ACM Redesign makes it the best comp sci magazine on earth.

June 29th, 2008

A while ago ACM embarked on an ambitious mission: to change its flagship publication for Association for Computing Machinery members, Communications of the ACM, into the JAMA of Computer Science. If this new issue of the redesigned CACM is any indication, they will succeed. In the first few pages we have quantum computing, modeling to eliminate errors in software, an analysis of cloud computing, a debate about the future of the computer science curriculum and what it means for career paths as programming becomes offshored, and the history of the IT industry in India.

.. and I’m only on page 33.

There are 112 pages.

It used to be that way - from the inception of CACM on through the 1970s the magazine was a collection of computer science research for the academic professional. However, as the 1980s and 1990s moved computers into people’s homes and the IT field changed from PhDs toying with large Turing machines to undergrads who used Visual Basic and Java for basic business purposes, the magazine changed. These new practitioners didn’t come from the academic field, didn’t really understand the basic underpinnings of a computer, and usually didn’t care. The funding of the ACM dried up as well, even as the number of people in the field boomed. The CACM changed to grab these people by becoming more of a mainstream magazine geared towards those new entrants - maybe to attract them to ACM membership. It didn’t seem to work. The magazine lost its way.

Now we are once again approaching a change in the computer science field. Much like the way Cloud Computing is taking us back to large machines in the back room and thin clients at the edge, software engineering is changing back from large numbers of engineers with basic knowledge to a smaller number with more specialized knowledge. The Googles of this world are not as worried about basic applications written across millions of detached machines - the kind of work that lends itself to reusable patterns and easy software construction after a weekend’s reading of O’Reilly books. Instead, they are worried about problems of concurrency, massively scalable storage systems and parallel processing while sharing the same memory space. The choice of language has become an implementation detail for expressing these ideas, and languages can be interchangeable. These problems require knowledge of tuples, binary trees and graph theory, to name a few.

At the same time, programming jobs that boomed in the 90s and 00s are being outsourced to cheaper and cheaper labor overseas, with the harder proofs being demonstrated once on the internet and then communicated across the world for others to incorporate. Pre-packaged software for businesses is becoming more configurable to existing systems, removing the need for custom software from programmers in non-software companies. This means that those who are serious about the profession are diving deeper into the roles of architect, designer and academic - while those who aren’t as interested are moving on to other careers. These two changes are providing an entrance for a journal like CACM to come alive again and publish the best available research needed to solve these hard problems.

The new CACM couldn’t come at a better time.

Typical Architecture Roles in an Enterprise Environment

June 23rd, 2008

I created the following slide on typical architecture roles and I thought I’d share it.

Typical Architecture Team

Enterprise Architects

Primary role is to manage large scale product and process integrations and determine which products and processes are best suited to deliver on business requirements. They control the large picture of how everything works in an organization and maintain this in a centralized location. They should be experts in software and enterprise design methodologies with experience in how large systems interact and manage data. These architects are essential to competitive and cost-effective decision making and use of technologies.

Water-Cooler Talk: The latest research into The Staged Event-Driven Architecture for Highly Concurrent Server Applications

Integration Architects

This is an emerging role in larger companies that have large and complicated deployments, particularly around Service Oriented Architecture (SOA). They are usually the ones that have the task of managing Business Processes. Put simply, they tie together the software platforms the Software Architects design on the environments the Enterprise Architects deliver and purchase. Although Enterprise Architects are typically restricted to existing thinking and technology products, it is the combination of Integration and Software Architects that differentiates an organization and provides maximum benefit.

Water-Cooler Talk: How to change the business workflow so that they can be quicker than their competitors. May need to talk to the Software Architects about how the platform can be changed for quicker processing too.

Software Architects

Primary role is to take architectural directions and artifacts and produce and manage a software platform that provides strategic and operational advantage to an organization. They are usually the ones who maintain the core frameworks of an organization and are considered the gurus of whatever technology they design for. They are very important as they tend to add order and discipline to projects and ensure that best practices, appropriate abstraction and code re-use occur. These architects are essential to good outsourcing of software development, especially near-shore and off-shore.

Water-Cooler Talk: The latest research into how Dependency Injection in Java 5 eliminates the need for the Composite Entity pattern in enterprise development.

Gartner Podcast: SOA Lessons Learned From the Trenches

June 22nd, 2008

Gartner came out with a good podcast of one of their sessions from the Gartner Enterprise Architecture Summit. It is a very informative panel discussion about “Lessons Learned From the Trenches” with SOA/Enterprise Architecture. Anyone who is an Enterprise Architect or Technology/Business Executive embracing the change and possibilities of SOA in their organization will want to give it a listen. Chances are very good you’ll be vigorously nodding your head and maybe even feeling a little bit better about yourself knowing you are not alone in dealing with these problems.

Much of the conversation is about how to drive SOA adoption (it appears relying on developers to browse a repository is not working), how to measure cost benefits and savings to an organization (hint: measure your service reuse) and how they approach funding services which may not have an exact business advocate and therefore pocketbook to work against.

The last piece comes up all the time in organizations implementing a service oriented architecture. There are many business areas that would benefit from a high level “business service” which would result from the orchestration through externalized business logic (BPM/BPEL) of lower level “technology services”. Yet, if asked who would fund these lower level technology services so that the business services can emerge, the money dries up. Many take on the model of “first to need, first to pay” but that only works when the technology services and service orchestration aren’t that expensive. It’s hard to get project specific business users to fund enterprise wide services for the “greater good”. It’s something that has yet to be solved.

Here is the link.

Google Scalability Conference: Haskell with DHT for Wikipedia / GIGA+ Filesystem

June 20th, 2008

Google just published online some of the slides from the Google Scalability conference, which I attended last weekend and wrote a commentary about earlier this week. The two I’d like to call out are the GIGA+ file system (for storage geeks) and the Software Transactional Memory slides (for software geeks). I also found the ideas presented in the Wikipedia for Haskell / DHT slides really interesting.

Just consider it some light reading for your geek weekend.

Thoughts On Google’s Conference on Scalability In Seattle

June 16th, 2008

If you are looking for a good collection of notes regarding the topics covered at the Seattle Conference on Scalability, you can do no better than what James Hamilton put together. Instead, I’ll write a quick commentary on what I experienced.

Scalability Is Your Problem Too

The goals of the conference are laudable. Scalability is an issue that almost all practitioners of software engineering face, especially as we move towards offering services both inside and outside the enterprise. Many are taken off guard by the sudden issues that confront them after wiring up a large scale services-based environment; especially around distributing load, distributing the data, and writing the data quickly. Sadly, I didn’t see too many people from large companies there - most were software companies like Microsoft, Google, MySpace and Amazon.com. The attendance may be a consequence of the subject matter. This was some intense stuff dealing with MPI at Cray and its hopeful successor, Wikipedia redone with DHT and Erlang, a b-tree vs. Hashmap debate and scalable storage issues when dealing with billions of files. A more fun loving person would have done better going over to Adobe and hanging out at BarCampSeattle, which was going on at the same time.

Despite the intimidating material, there are real architectural and design issues that these discussions present that should be in the mind of anyone dealing with large datacenters that scale globally or even nationally. The approach of GIGA+ file storage, maidsafe’s new computer architecture, and NetWorkSpaces for the R language was uniform: off-loading responsibility for management of data (meta or otherwise) to all vertices in the deployment graph instead of a central repository. NetWorkSpaces in R and maidsafe even discussed computational scalability - while Cray’s new Chapel language and the discussion around Software Transactional Memory focused on scalability across processing cores as well as machines.

GIGA+ Bitmap Example

GIGA+’s approach of maintaining a small bitmap file on each node and passing that around - while anticipating and accepting stale data on a few edge nodes - was brilliant in the patterns it hinted at, including that perhaps being right all the time isn’t as important as being fast. You can be right most of the time and accept the performance hit of not being right some of the time. There are many people who would cringe at this, but at this point we’re going to have to play loose and leave a few balls up in the air as we juggle - doing the math of how often one may fall while keeping the rest going as fast as we can.
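The "be fast, tolerate stale" idea can be sketched in a few lines of Python. This is only an illustration of the pattern described above, not the actual GIGA+ protocol: the `Cluster` and `Client` classes, the `bucket_of` helper and the node names are all hypothetical. Each client routes requests from its own cached copy of the ownership map and only pays a correction cost when a node tells it the copy is out of date.

```python
import zlib


def bucket_of(key: str, buckets: int) -> int:
    """Deterministically map a key to a bucket index (hypothetical helper)."""
    return zlib.crc32(key.encode("utf-8")) % buckets


class Cluster:
    """Authoritative view of which node owns each bucket (toy model)."""

    def __init__(self, owners):
        self.owners = list(owners)  # bucket index -> node name

    def move(self, bucket, node):
        """Reassign a bucket, e.g. after a split or rebalance."""
        self.owners[bucket] = node

    def handle(self, node, bucket):
        """The contacted node answers if it owns the bucket,
        otherwise it replies with a fresh copy of the map."""
        if self.owners[bucket] == node:
            return "ok", None
        return "stale", list(self.owners)


class Client:
    """Routes from a cached map that is allowed to go stale."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.cached = list(cluster.owners)
        self.corrections = 0  # how often we paid the stale-map penalty

    def lookup(self, key):
        bucket = bucket_of(key, len(self.cached))
        node = self.cached[bucket]
        status, fresh = self.cluster.handle(node, bucket)
        if status == "stale":
            # Wrong guess: take the updated map and retry locally.
            self.cached = fresh
            self.corrections += 1
            node = self.cached[bucket]
        return node
```

The design choice is that nobody pushes map updates to every client; a client learns about a change only when it guesses wrong, so the common case costs nothing extra and the rare stale case costs one additional hop.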

Pay No Attention To The Man Behind The Curtain

Yet if I had to sum up the content of the conference I would say it was big on strategy and architecture but short on implementation. There were a lot of things hinted at “behind the curtain”, but nothing assured hand raising from the compsci geeks in the room more than hand waving when you got to the distributed piece of your solution. For instance, one of the big benefits of Chapel - the MPI successor that Bradford Chamberlain of Cray presented - was that you could have distributed arrays and graphs that would be automatically sliced up to be distributed to parallel cores or even other “locales” if desired. How the language determines where to split these large arrays and graphs and farm them out was not discussed. One of the more interesting slides showed dashed lines drawn across various nodes and vertices of a graph, symbolizing how it would be chopped and distributed. Someone in the audience raised their hand at this - but he moved on and the hand went back down. To be fair, Chapel was called a “multi-resolution” language where one could start fairly abstract and then add more detail and control to get the best desired result - something I assume you have to do to get good or intelligent chopping and distribution of the data. Given that one of his slides was a comparison of code lines between Fortran using MPI and Chapel, seeing a working code snippet of Chapel would have been helpful. It may turn out to be the same amount of work after you get past the “global view”.

This was the trend though, as all of the presentations had a bit of hand waving regarding performance metrics and distribution of computation. This was highlighted by the talk of Vijay Menon of Google - whose work at Intel I was familiar with - discussing Software Transactional Memory. He illustrated the challenges of implementing this in an imperative language (I’m suspicious you can even do STM well in an imperative language with state - as I discussed before) but beyond suggesting the keyword “atomic” to replace “synchronized” in the Java language there was very little real content discussed for those already familiar with the issue of locks and multiprocessors. Concurrent Haskell wasn’t even mentioned. A better introduction and discussion is to be had by watching the O’Reilly OSCON video from Simon Peyton Jones (the writer of GHC and now at Microsoft Research) on the subject. After that, if you’re still hungry, his collection of papers on his Microsoft Research site is a delight.

Of course the point of these conferences is the discussions that occur during the breaks and in the networking event afterwards - something that I treasure having newly moved to the Seattle area from Cincinnati. Instead of just observing and blogging from afar - I get to be at the same table as Vijay Menon, Thorsten Schuett, Swapnil Patil, Paul Watson and others.

Summary of the Architectural Patterns I Saw

If I had to summarize what I took away from the conference from a high-level architectural standpoint, here they are:

  • Every node must be aware of the state of every other node without a centralized controller.
  • To do this, a mechanism should be in place to share state quickly but peer-to-peer.
  • It’s ok to let some nodes go stale.
  • Client/Server is now one thing. Pub/Sub with computation. Every node on the graph should do work.
  • As much as possible, each node should maintain its own security and state. You should be able to have anonymous resources appear in your data center and be put to use without much configuration.
  • As much as possible, abstract the distribution of processing away from programmers.
  • Key/value storage with hashing is best for scalability and distribution (it seems to have won out in all the solutions presented here). Blame MapReduce.
  • Ants can be used to demonstrate anything.
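
As a rough illustration of why hashed key/value schemes distribute so well without a central controller, here is a minimal consistent-hashing sketch in Python. It is not taken from any of the systems presented at the conference; the `HashRing` class, the `replicas` parameter and the node names are assumptions made for the example. Any node that knows the member list can compute the owner of a key on its own.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: a key belongs to the next node point clockwise."""

    def __init__(self, nodes, replicas=64):
        self.replicas = replicas  # virtual points per node, to smooth the spread
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = ring_position(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        for i in range(self.replicas):
            point = ring_position(f"{node}#{i}")
            del self._owner[point]
            self._points.remove(point)

    def node_for(self, key):
        # First ring point past the key's position, wrapping around at the end.
        idx = bisect.bisect(self._points, ring_position(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))
```

When a node leaves, only the keys it owned are reassigned; every other key keeps its owner, which is what makes anonymous resources cheap to add and remove.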

I hope everyone had as good a time as I did.

Presenting Semantic Web at BarCamp Seattle

June 12th, 2008

I will be doing a presentation with Daniel Maycock, an Information Architect from Boeing, at BarCamp Seattle this weekend on the Semantic Web. I’ll be filling in stuff on FoaF, maybe including my Semantic Maestro project and the results of using Google AppEngine if I can get the engine ported over to Python in time. Chances are it’ll just be the idea of FoaF, open social networks, users and tools out on the web.

If you will be there, be sure to say hi!

ACM Article: Restful web services vs. big web services: making the right architectural decision

June 3rd, 2008

Great article on ACM regarding when to use REST vs. WS-* standards that are in wide use in SOA architectures today. Very interesting reading for those who may want to take the light-weight approach vs. using the webservice composition and discovery tools that enterprises may find in the TIBCO and IBM SOA stack.

ABSTRACT

Recent technology trends in the Web Services (WS) domain indicate that a solution eliminating the presumed complexity of the WS-* standards may be in sight: advocates of REpresentational State Transfer (REST) have come to believe that their ideas explaining why the World Wide Web works are just as applicable to solve enterprise application integration problems and to simplify the plumbing required to build service-oriented architectures. In this paper we objectify the WS-* vs. REST debate by giving a quantitative technical comparison based on architectural principles and decisions. We show that the two approaches differ in the number of architectural decisions that must be made and in the number of available alternatives. This discrepancy between freedom-from-choice and freedom-of-choice explains the complexity difference perceived. However, we also show that there are significant differences in the consequences of certain decisions in terms of resulting development and maintenance costs. Our comparison helps technical decision makers to assess the two integration styles and technologies more objectively and select the one that best fits their needs: REST is well suited for basic, ad hoc integration scenarios, WS-* is more flexible and addresses advanced quality of service requirements commonly occurring in enterprise computing.

ACM Article: How Intuitive is Object Oriented Design?

May 17th, 2008

There is an incredible article published in the Communications of the ACM entitled “How Intuitive is Object Oriented Design?” by Irit Hadar from the University of Haifa, Israel and Uri Leron from the Technion, Israel Institute of Technology.

It goes through the process of examining the disconnect between intuition and OO design for engineers and software designers.

The object-oriented programming paradigm was created partly to deal with the ever-increasing complexity of software systems. The idea was to exploit the human mind’s natural capabilities for thinking about the world in terms of objects and classes, thus recruiting our intuitive powers for building formal software systems. Indeed, it has commonly been assumed that the intuitive and formal systems of objects and classes are similar and that fluency in the former helps one deal efficiently with the latter. However, recent studies show that object-oriented programming is quite difficult to learn and practice. In this article, we document several such difficulties in the context of experts participating in workshops on object-oriented design (OOD). We use recent research from cognitive psychology to trace the sources of these difficulties to a clash between the intuitive and analytical modes of thinking.

It is currently hidden behind the ACM referred library portal but if you are an ACM member you can access it here.

Franz Responds To The Failure Of Lisp Post - What Platform Will Own Web 3.0?

May 5th, 2008

I took Franz and other Lisp companies to task a few weeks ago in a posting I wrote: The Rise Of Functional Programming: F#/Scala/Haskell and the failing of Lisp:

It’s hard to understand where it came from. Certainly one can argue the broader academic community had nothing to do with it; the old guard Common Lisp hackers are still as fickle and as judgmental to newcomers as ever. Also, the old standards in Lisp languages, Franz and LispWorks, have not lowered their prices to anything approachable to the casual developer.

Well, I got this email from a Franz representative in response:

Hi Brandon,

My name is Bernard… Very interesting blog, and it does look like you are still working with Lisp, and surprisingly, Semantic Web, too. Any chance you will be down in San Jose for SemTech 2008 this month?…

… I did see the post on Lisp. While we do need to run a business and stay afloat, it’s also in our best interest to have more interesting Lisp and ACL based projects out there. Give me a call if you would like to continue using ACL in your projects and we should try to work something out… I can also set up a temp license if you are interested in our RDF triple store AllegroGraph (http://agraph.franz.com/allegrograph/). The v3.0 release should be in a few weeks and will support both federation and social network analysis tools.
http://agraph.franz.com/support/documentation/3.0/reference-guide.html#header3-65

I am looking forward to talking to you.

First off, dangling the tempting temp license for v3.0 of AllegroGraph is not playing fair. I have always been impressed with Allegro’s work in the semantic space, even before it was a popular buzzword. In fact, the same thing that led me to attempt to solve the Semantic problem with Lisp is what led Franz to do the same. If the semantic web and reasoning engines are to become reality, especially on parallel processing architectures, Lisp has to be at the front of the bus. Certainly, others will try to claim they do this just fine with interpreted OO languages with some runtime tweaking - but the problems facing us in the future demand we think differently about how we even construct algorithms to solve our problems. Brute force coding and heavy stacks are not going to get us there.

However, the fact that I’ve always admired it from afar is part of the discussion in my article linked above. It seems simply out of reach for mere mortals to use and incorporate into their own development plans because of price. Certainly, Allegro deserves to be compensated for their hard work - this isn’t kids hacking PHP for the next Twitter reporting app after all - Allegro has always tackled the big problems where they can contribute value. The same cannot be said for many software companies out there.

Regardless, flirting with applications using this model is harkening back to the era of big client-server installations instead of quick and nimble collaborative innovation. As much as Allegro’s marketing may say that AllegroGraph is “Web 3.0”, the principles that drive it are not going to allow its success to be pinned to large engines running in a back room of a well funded company. If I get addicted to the software Allegro has - there is no remedy to bring it on board in my work.

This isn’t to say that Allegro hasn’t opened up to the community - they have open-sourced good libraries - although through another license scheme, LLGPL. It also seems they are using the IBM model of “Community Driven Development” I complained about before when IBM released Project Zero to put “PHP on Rails”. They take contributions to fold back into the Allegro products.

Although I really like working closely with Allegro and writing about their accomplishments in this space, I challenge them to think about whether this is truly the model to gain traction in the coming Web 3.0 world. I would wager the Lisp community should still make the effort to create more nimble and open components for the semantic web - the internet will demand no less when picking its platform for Web 3.0.

Fun Lunch Time Distraction: Calculate your googleshare influence

April 30th, 2008

Slurping on some delicious vegetarian chili from the lobby of Safeco Plaza in Seattle, I was just introduced to Googleshare from Atomiq by The Google Operating System post on The Informational Distance Between Cities.

Essentially, your googleshare (i.e. mindshare) = (number of results for search x that also contain your name) / (total number of results for search x).

So, for something you might be known for, like “Java”.. here are the results when run against my name:

“Brandon Werner” inside “Java” search results = 1,350 (link)
(divided by)
“Java” search results total = 416,000,000 (link)

or

1,350 / 416,000,000 ≈ 0.00000325

So my mindshare of the Java market is about 0.000325%, or roughly 3.25 results per million. Not bad at all.
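
For anyone who wants to check the arithmetic, here is a small Python sketch of the ratio. The `googleshare` function name is just an illustration; the two counts are the ones quoted above, and any other counts you plug in are your own.

```python
def googleshare(hits_with_name: int, total_hits: int) -> float:
    """Fraction of the results for a term that also mention the name."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return hits_with_name / total_hits


# Counts quoted in the post: name-within-"Java" hits and total "Java" hits.
share = googleshare(1_350, 416_000_000)

print(f"as a fraction: {share:.8f}")
print(f"as a percent:  {share * 100:.6f}%")
print(f"per million:   {share * 1_000_000:.2f}")
```

Multiplying the fraction by 100 gives a percentage, and multiplying by one million gives the more readable "results per million" figure.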

Now, some more mindshare results for you:

Brandon Werner/IBM = 4.17%
Brandon Werner/Semantic Web = 3.85%
Brandon Werner/Insurance = 9.20%

So, I guess I’m kind of hitting my mindshare targets! So what about you? Why not figure out your googleshare and Disqus below?