Wednesday, August 18, 2010

The Book Of JOSH

[Razie's note] I recently realized that my own mental models are also trending back toward simplicity, so I wanted to re-read this blog post... but could only find a copy of it; the original had disappeared. I enjoyed it a lot and figured the more copies the better, so I shamelessly copied the copy below. Enjoy!

For reasons unknown, this article has been removed from the author’s blog, but Google Reader remembered it for me! I have not been able to find the author’s contact information.

All credits to The Grey Lens Man!

The Book Of JOSH

Scala In The Enterprise

Recently, we’ve decided to use Scala as part of an enterprise software solution stack. And I’d like to mention a few things on how the hell that was allowed to happen.

But first, let’s take a stroll and talk about a thing called enterprise IT.

The Problem

You see, I have a small problem – 3 million lines of RPG and COBOL, 5,500 logical and physical files on a few AS400s that are not exactly cheap. Even better, through the years the system has been cloned and forked, so several incompatible versions of it exist throughout the world.

A few years ago I spent 3 months focused on the “original” implementation in an architectural reverse engineering exercise. The result: I know roughly the raison d’être for 10%, or ~300K SLOC, of the code and a few hundred files. The other 90% of that self-contained universe of code and tables is my personal dark matter question: I know it’s out there, I just have no clue as to its nature.

I won’t go into excruciating detail, but let me leave it like this: within that 2.5 T of data there resides a special flag in the customer file, which determines the fundamental nature of how a customer interacts with the system, and it can be found in Filler3, third byte from the left. In one file, the Account ID column sometimes actually does contain the account number, but not always; sometimes it’s something else, and it’s torched us. And you’ll never guess what that ZipLoc3 column _really_ contains (hint: nothing like a ZIP code).

Yet this system is responsible for several billion in revenue. If the mainframe goes offline, a few pagers chirp, but if that system goes offline, it’s klaxons and battle stations.

Universal agreement: it’s got to go. It has been scheduled for end-of-life more often than a serial killer on death row. IT leadership comes and goes, yet a full decade later it sits there in the data center, laughing at me like an evil-essence-hosting Stephen King basement furnace.

Within the last year or so, it’s reached such a state of chaotic entropy that its decay is apparent not only to IT, but now to the business and, worse, to the customer as well.

What Can Be Done?

No one wants to keep it on life support, so doubling down the head count to reverse entropy, or putting some sort of SOA lipstick on this pig is something we’d like to avoid. The problems are fundamental and systemic.

Being customer facing, it’s where all of the business’s most “brilliant” ideas tend to accumulate through the years. Let’s just say they’ve been very creative. ’Nuff said.

We threw a check at it, and a consulting team tried for 18 months to move it onto an ERP solution and barely made a dent, though I do believe the team purchased their own private Caribbean island from the billing fees. It just won’t map to a COTS or ERP package without a struggle and anything less than the monetary cost of a government TARP program.

On and off over the years, my boss, out of left field, using the pluralis maiestatis would say “we should rewrite it all from scratch.” Having been broken like John McCain on a few Death Marches I’d look at him repeating “the horror… the horror.”

Then during one of my annual hypomanic phases I’d wake up and just know that if I put back together the ol’ jelled team from that project back in ‘01, well, we could rewrite that sucker in no time. That COCOMO II model most assuredly did not apply to us. My boss, with a look of concern, would nod sagely and suggest I think about it a bit more. I’d recover my senses in a day or two.

And so it sits in the data center, spewing heat and laughing. And it is evil.

All This Sturm und Drang

As a company we know how to do it by the book as we are several years into an ERP implementation. We have standards documents on how we develop standards, requirements/design/functional templates, change control, copious meetings, PMs, PMO office, analysts, and even a few software developers to actually implement things.

We make RUP look like monkeys with a football, Dilbert look like a pale, ironic shadow of our reality, and Forrester articles look like descriptions of primitive tribal organization. We have PROCESS.

In addition, we have been a full-blown Java company for in-house applications for years. We’ve got it in spades: Struts, Spring, Hibernate, J2EE, ADF, TopLink, Ant, Maven, Eclipse, Rational, …

We drink Kool-Aid by the 55-gallon drum ’round here.


A number of years ago with the support of a key IT executive, I led a jelled, vertical team which brought Java into the company hard and fast. It was seismic.

There was no end of debate, questions, consternation and gnashing of teeth throughout IT and beyond, which we totally ignored. We didn’t know we were supposed to form a technology introduction committee to shepherd this through the IT Leadership Committee approval process. We damn sure didn’t seek forgiveness after the fact, much less permission before the act.

A wild ride, in a more wild time, this was pre-PROCESS. Small team, modest budget, mega-hours, fun times and a result that the business loved. It gave us the #1 ranking in our industry and we held it for a number of years.

I’m reasonably proficient in and knowledgeable of the Java Universe.

Let’s just say it: Java, by design, is a pretty simplistic language. I would argue it is even a crippled language for uses other than its original design point, which was certainly not server-side enterprise IT. As a result, the cottage industries around Java are practically their own Industrial Sectors.

Java is the Br’er Rabbit of IT. Once touched, you can’t let go. It’s simplistic enough to be inadequate in almost every situation. The commercial world just loves this aspect of Java, as they exploit revenue streams from filling these gaps via endless Frameworks, Patterns, APIs, Annotations, IDEs and Toolsets.

The dirty secret of course is 25 – 50% of all of it is pure overhead, without direct value, but necessary to overcome the inherent limitations of Java.

On the other hand, the upside of Java for enterprise IT is pretty obvious. Any problem you might have can be solved with money and armies of plug and play bodies and you get mountains of buzz material for those presentations.

But honestly, is this any way to deliver the IT solutions your company needs?

Full Circle

Where were we …. Yes, you see, I have a small problem.

So what’s the issue, you say? I write a whole blog about nothing, you say? We all know the right answer, you’re pointing out? Yeah, I know, it’s intuitively obvious to the casual observer.

We’ll rewrite it from scratch.

’Course we’ll need a cluster of WebSphere Application Servers, and an Oracle RAC cluster for all that data. Don’t forget the middleware needed to transition over from the legacy systems, so toss in an ESB cluster, and what the heck, a couple of BPEL servers too.

Need a SOA Center of Excellence of course too. Can’t integrate without some common XML Business Object Schemas. Also need to roll the Rational RUP suite and some beefy IDE environments and for that shiny look, sprinkle the works with lots of WS-* sparkly dust. Bake 3-5 years or until done, whenever.

My presentation slides for all this will be killer. I can sell this stuff. I’m good at it. I’ll look like a bloody genius. I’ll have Vendors fawning all over me. And the best part is the bubble on this mess won’t pop for YEARS, when I’ll have plenty of plausible deniability. “Hey the plan was perfect, the business, IT managers and their people were incapable of executing it.”

I feel like the enterprise IT equivalent of an AIG trader pocketing ill-gotten gains from writing Credit Default Swaps that we can’t pay off.

Losing My Religion

I’ve lost my faith in it all. I need a new religion.

I don’t want monolithic 10-ton solutions I need to wrestle into place with small armies. I don’t want clusters of application servers fronting a behemoth RAC data cluster. I don’t want web management consoles which rival the Space Shuttle’s dashboard.

I want a simple yet effective structural system where I can select and compose reusable modular solutions into simple, targeted solutions. The solution size should be isomorphic to the problem size. It leverages what it needs to solve the problem and no more.

I don’t want 100K source lines of code with 33K lines of fluff and stuff, where one in every three lines of code has nothing to do with the business logic. Where you can’t even find the business logic in the mounds of patterns, abstractions, frameworks, annotations, cut points and verbosity.

I want the problem domain reflected in the code and the code to capture the essence of the problem domain.

I don’t want massive XML documents constrained by committee designed XSD Schema BODs shuttling around clusters of ESB and BPEL middleware.

I want dirt-simple intra-system communication in the data center.

The Book Of JOSH

Through a marvelous, even devious, set of circumstances, I’m presented with the opportunity to address my little problem without prescribed constraints, a true green field opportunity.

JSON OSGi Scala HTTP

JSON delivers on what XML promised: simple to understand, effective data markup, accessible and usable by human and computer alike. Serialization/deserialization is on par with or faster than XML, Thrift and Protocol Buffers. Sure, I’m losing XSD Schema type checking, SOAP and WS-* standardization. I’m taking that trade.

OSGi is a standardized, dynamic, modular framework for versioned components and services. Pick a logger component, an HTTP server component, a ??? component, add your own internal components and you have a dedicated application solution. Micro deployment with true replacement. What am I giving up? The monolithic J2EE application servlet loaded with 25 frameworks, SCA and XML configuration hell. Taking the trade.

HTTP is simple, effective, fast enough, and widely supported. I’m tired of needlessly complex and endless proprietary protocols to move simple data from A to B with all the accompanying firewall port insanity. Yes, HTTP is not perfect. But I’m taking this trade where I can as well.

All interfaces will be simple REST inspired APIs based on HTTP+JSON. This is an immediate consequence of the JOSH stack.

Scala is by far the toughest, yet the easiest selection in the JOSH stack. I wrestled far more with the JSON or XML or Thrift or Protocol Buffers decision.

Without hesitation, I know Scala is the right choice from a pure solutioning aspect. But let’s face it, what a tough, tough sell from the propeller-headed guys to the pointy-headed guys.

First, I’m a bit of a computer language aficionado. I’ve written multiple, actual programs in SML, Scheme and Haskell which I’ve used within the enterprise. Why? Because when faced with certain “one time” problems I can knock out a simple utility far faster than in Java. But in almost all cases these were throwaway utilities for one-time situations.

I’ve toyed with Dylan, Ruby, Python, Groovy, Lisp, OCaml, Modula-2, Oberon, …

To date I’ve only advocated and pushed Python for utility scripting at both the Application and System Administration levels. Never, at any time, did it even cross my mind to advocate anything other than Java and a bit of Python for enterprise application development, until now.

Java has been stretched way beyond its modest design point. It’s literally falling apart from bloat.

Joshua Bloch has numerous presentations on the current state of affairs with Java and where the proposed functional extensions and closures are headed. He quotes the following from the Java community.

“I am completely and totally humbled. Laid low. I realize now that I am simply not smart at all. I made the mistake of thinking that I could understand generics. I simply cannot. I just can’t. This is really depressing. It is the first time that I’ve ever not been able to understand something related to computers, in any domain, anywhere, period.”

“I’m the lead architect here, have a PhD in physics, and have been working daily in Java for 10 years and know it pretty well. The other guy is a very senior enterprise developer (wrote an email system that sends 600 million emails/year with almost no maintenance). If we can’t get [generics], it’s highly unlikely that the ‘average’ developer will ever in our lifetimes be able to figure this stuff out.”

That’s just generics. And if you’ve seen the proposed syntax for closures, well, it’s readily apparent: Java’s elastic modulus has been exceeded. A crippled language has been fast-marched into a broken language.

Plenty of blame for all here. In the last 50 years academia and commercial entities have given us boutique languages, COBOL, C++ and Java.

A Proper Programming Language For Business Development

It’s the age of the Jetsons, and a decent programming language for enterprise business applications doesn’t exist.

So let’s design one.

  1. Simple, clean and full featured.
  2. Ready for concurrency and distributed applications on multi-core commodity boxes.
  3. Allow for the explicit control of state and state mutation.
  4. Support for modularity, and scaling in the large.
  5. Multiparadigm to support mapping the commonality and variability of the domain problem to the code.
  6. Capable of supporting Application Oriented Language / Domain Specific Language development (AOL/DSL).
  7. Cross platform, with a large supporting tool suite universe.
  8. Open and not subject to vendor lock-in.
  9. Fast
  10. OO
  11. Functional
  12. Conceptual Integrity.
  13. Allow control of, if not outright banishment of, the null pointer.
  14. Rich libraries.
  15. Strongly Typed
  16. Practical and Pragmatic
  17. Accessible to the average developer, empowering to your A players.
  18. Runs on a portable VM.
  19. Leverage existing extensive Java libraries.

This is Scala.
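
A minimal sketch of a few of those list items in practice (explicit control of mutation, control of the null pointer, OO and functional side by side). The `Customer` type, the `CustomerDirectory` object and its sample data are hypothetical illustrations, not anything from the system described in this post.

```scala
// Hypothetical domain type. Case class fields are immutable by default,
// and a possibly-missing value is declared as Option instead of null.
final case class Customer(id: String, name: String, zip: Option[String])

object CustomerDirectory {
  // An immutable Map: "changing" it would mean building a new Map.
  private val customers: Map[String, Customer] = Map(
    "A100" -> Customer("A100", "Acme Freight", Some("30301")),
    "B200" -> Customer("B200", "Bravo Supply", None)
  )

  // Absence is part of the return type, so callers cannot forget to handle it.
  def findCustomer(id: String): Option[Customer] = customers.get(id)

  // Options compose without a single null check:
  // missing customer or missing zip both fall through to the default.
  def zipFor(id: String): String =
    findCustomer(id).flatMap(_.zip).getOrElse("unknown")
}
```

Calling `CustomerDirectory.zipFor` with an unknown id, or with a customer that has no zip, yields the default rather than a NullPointerException somewhere downstream.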

So How Is It Going

Fast forward… Currently a very small team and I are nearing completion of the first major functional component on the JOSH stack.

All of the development talent on the team are experienced Java developers. And they have been effective from Day 1.

No real discussions of covariance and contravariance were required. We did discuss HOF, anonymous lambdas, closures, cut syntax, maps, folds, reduces. And strangely their heads did not explode. We did discuss the evil of mutable state, and referentially transparent functions.
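
For readers who have not met those terms, here is a small sketch of each in Scala. The `HofBasics` object and its order amounts are made up for illustration, and "cut syntax" is shown here as Scala's underscore placeholder, which is an assumption about what was meant.

```scala
object HofBasics {
  // Hypothetical order amounts, an immutable list.
  val amounts: List[Int] = List(120, 75, 300)

  // Higher-order function (HOF): takes another function as an argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Anonymous lambda passed to map.
  val doubled: List[Int] = amounts.map(a => a * 2)

  // Placeholder ("cut"-style) syntax: _ stands in for the argument.
  val large: List[Int] = amounts.filter(_ > 100)

  // Closure: the returned function captures `threshold` from its environment.
  def over(threshold: Int): Int => Boolean = a => a > threshold

  // Fold starts from an explicit seed; reduce starts from the first element.
  val total: Int = amounts.foldLeft(0)(_ + _)
  val biggest: Int = amounts.reduce(_ max _)
}
```

None of these functions mutate anything, so each one returns the same result for the same input, which is what referential transparency means in practice.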

They were enthralled.

No detailed lectures on the deep underlying structure of Catamorphism, Anamorphism, Apomorphism, Hylomorphism and Paramorphism were required to get solid code at a cleaner and higher level, with less bloat than equivalent Java.

A core aspect of the system is combinator-based threading of state through a composed computation. No problems in understanding were observed.
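
One common way to thread state through a composed computation with combinators is a small State type with `map` and `flatMap`. The sketch below assumes that shape; it is not the system's actual code, and the `State` class and the invoice-number example are hypothetical.

```scala
// A computation that takes a state S and returns the next state plus a result A.
final case class State[S, A](run: S => (S, A)) {
  // Transform the result, leaving the state threading untouched.
  def map[B](f: A => B): State[S, B] =
    State { s =>
      val (s1, a) = run(s)
      (s1, f(a))
    }

  // Sequence two computations: the state produced by the first feeds the second.
  def flatMap[B](f: A => State[S, B]): State[S, B] =
    State { s =>
      val (s1, a) = run(s)
      f(a).run(s1)
    }
}

object InvoiceNumbers {
  // Hypothetical step: the state is the next free invoice number.
  val nextInvoice: State[Int, String] =
    State(n => (n + 1, s"INV-$n"))

  // The for-comprehension desugars to flatMap/map, so the counter is
  // passed along implicitly and no mutable variable is needed.
  val threeInvoices: State[Int, List[String]] =
    for {
      a <- nextInvoice
      b <- nextInvoice
      c <- nextInvoice
    } yield List(a, b, c)
}
```

Nothing runs until `run` is given a starting state, for example `InvoiceNumbers.threeInvoices.run(1000)`, which returns the final counter together with the three generated ids.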

In fact, in terms of difficulty, they struggled somewhat more with Git than they did with Scala, Linux than they did with Scala, IDE issues than they did with Scala, and Maven than they did with Scala.

At a minimum, they wrote better idiomatic Java code in Scala than they did in Java, and proactively adopted more idiomatic Scala as time went by.

Visual Basic now has lambdas and no one expects a VB developer to throw himself off a building. Yet somehow, these days too many think Java developers can’t handle more advanced functionality.

I’ve seen too much of “I just did this fold thingy and my co-workers could _never_ understand that.” That is a bit overblown. OK, if they have never seen it before, they might not get it in 10 mins. But working with the team on the basics for 10 hours or 10 days will certainly do it.

But bottom line, enterprise Java developers can transition to Scala. I know this, because I’ve directly observed it.

JOSH has no Data

The JOSH stack is lacking a letter, because a solution for persisted data is missing in the stack.

A great deal of what needs to be done does not require an ACID RDB cluster. Some of it does, and I’m kicking that can down the road.

For the rest, either the data is read-only and loaded 1-3 times a day, or it is best persisted by a distributed key-value storage system. A number of these are now available as open source solutions, and at the right moment I’ll need to pick one and add that letter to the JOSH stack.
