Parametric Diversions Revisited

OK, I admit it. After writing about H.S. Lahman’s talk on invariants and parametric polymorphism, I wanted to see what the rest of the world thinks parametric polymorphism is (and isn’t).

The Wikipedia entry for polymorphism states,

“polymorphism is the idea of allowing the same definitions to be used with different types of data (specifically, different classes of objects), resulting in more general and abstract implementations.”

Hold that parenthetical thought—specifically different classes of objects resulting in more general implementations.

Well, in the billing example I defined a single generic algorithm that worked on a single class of data (a RateSpec). The clever encoding of the RateSpec threshold and rate tables allowed a single algorithm to cover a wide range of data values: some customers could only be charged a single base rate; others could have different thresholds. The RateSpec object provided a uniform view of this data from the algorithm’s perspective. I wasn’t technically using different classes to represent different encodings. I was being economical by only defining a single class that could encapsulate many different rate and threshold encodings. Is this really parametric polymorphism?

The Wikipedia entry goes on to say:

“Using parametric polymorphism, a function or datatype can be written generically so that it can deal equally well with objects of various types. For example, a function append that joins two lists can be constructed so that it does not depend on one particular type of list: it can append lists of integers, lists of real numbers, lists of strings, and so on. Parametric polymorphism was the first type of polymorphism developed, first identified by Christopher Strachey in 1967. It was also the first type of polymorphism to appear in an actual programming language, ML in 1976. It exists today in Standard ML, O’Caml, Haskell, and others. Some argue that templates should be considered an example of parametric polymorphism, though instead of actually reusing generic code they rely on macros to generate specific code (which can result in code bloat).”

There’s a subtle distinction between the modern implementation techniques labeled parametric polymorphism and what H.S. was getting at. Before hearing his talk, I’d thought of parametric polymorphism as just another word for parameterized types and/or template classes. That does seem to be a fairly common modern definition. But H.S. Lahman put a slight twist on things. His main point was that careful design of data, along with a generic algorithm, can accommodate wide variations without lots of classes, case statements, or conditional logic. In fact, the key is to design a single, uniform view of seemingly disparate data. In a complex system (the Shreveport, LA water rates aren’t complicated enough), different RateSpec implementations would likely be needed, and I suspect there wouldn’t be just one algorithm to calculate rates. But in my simple example, they weren’t needed. So while I technically might not have been using parametric polymorphism, I did achieve uniformity by encapsulating what varies in a single RateSpec class whose instances would be composed of different rate and threshold table attributes. And that’s what makes this design simple and flexible.
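To make the distinction concrete, here’s a minimal Java sketch of parametric polymorphism in the modern, type-parameter sense, along the lines of Wikipedia’s append example (the class and method names are just illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class Appender {
    // One generic definition works for lists of any element type T;
    // the *type* is what varies, not the data values.
    static <T> List<T> append(List<T> first, List<T> second) {
        List<T> result = new ArrayList<>(first);
        result.addAll(second);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(append(List.of(1, 2), List.of(3)));        // [1, 2, 3]
        System.out.println(append(List.of("a"), List.of("b", "c")));  // [a, b, c]
    }
}
```

What H.S. described has a different flavor: the algorithm and the type stay fixed, and the data values (the rate and threshold tables) are what get parameterized, as the billing example below shows.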

Parametric polymorphism, or driving behavior with data

At Software Development Best Practices 2005 I attended as many design and architecture talks as I could fit in. I enjoyed H.S. Lahman’s talk on Invariants and Parametric Polymorphism. His talk illustrated how, instead of abusing inheritance, you can use data to drive general algorithms. This is something I’ve done on many an occasion, and no, I don’t think it violates responsibility-driven design principles. In fact, it is a vital technique for creating extensible designs where extensions can be made externally to your application (by business people creating appropriate data).

An example helps illustrate the concept. Consider the problem of calculating charges for a water and sewage bill (my first COBOL program as an analyst/programmer for the city of Vancouver, WA made these calculations). Although there were many, many different types of customers (industrial, commercial, residential of various sorts, government, non-profit, etc.), the basic algorithm for computing a customer’s bill was to apply a minimum charge based on the customer type and meter size, then apply a rate stepping function: base rate + (first tier rate * units in first tier) + (second tier rate * units in second tier). Sewage was another, similar calculation. Searching the internet, I was happy to find a similar published example of water and sewage rates for the city of Shreveport, Louisiana.
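To make the stepping function concrete, suppose (with made-up numbers, not the actual Shreveport rates) the base rate is $5.00, the first 10 units cost $2.00 each, and each unit beyond that costs $1.50. A customer who uses 14 units would be billed 5.00 + (2.00 * 10) + (1.50 * 4) = $31.00.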

The key to making a single calculation work for a variety of different computations is inventing an encoding for a business rule, algorithm, or policy (your invariant) that can be driven by descriptive data treated uniformly by a single algorithm. This data can be encapsulated in a specification object. Of course, given a myriad of rate and tier values, these would most likely be encoded in a relational database or some tabular scheme that would be objectified into a specification object.

Computing monthly charges is pretty simple (ignoring floating point arithmetic precision for the sake of simplicity).
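Here’s a minimal Java sketch of what that calculation might look like. The RateSpec name comes from the discussion above; the field names, tier representation, and sample figures are my own assumptions for illustration:

```java
// A specification object: the thresholds and rates are plain data, so one
// algorithm covers every customer category without subclassing.
// (Doubles keep the sketch simple; real billing code would use a decimal type.)
public class RateSpec {
    private final double baseCharge;
    private final int[] thresholds;  // upper bound (in units) of each tier
    private final double[] rates;    // per-unit rate for each tier; the last tier is open-ended

    public RateSpec(double baseCharge, int[] thresholds, double[] rates) {
        this.baseCharge = baseCharge;
        this.thresholds = thresholds;
        this.rates = rates;
    }

    // Base rate + (first tier rate * units in first tier) + (second tier rate * units in second tier) + ...
    public double chargeFor(int unitsUsed) {
        double total = baseCharge;
        int previousThreshold = 0;
        for (int tier = 0; tier < rates.length; tier++) {
            int tierCeiling = (tier < thresholds.length) ? thresholds[tier] : unitsUsed;
            int unitsInTier = Math.min(unitsUsed, tierCeiling) - previousThreshold;
            if (unitsInTier <= 0) {
                break;  // no usage reaches this tier
            }
            total += rates[tier] * unitsInTier;
            previousThreshold = tierCeiling;
        }
        return total;
    }

    public static void main(String[] args) {
        // The same hypothetical residential figures as the worked example above.
        RateSpec residential = new RateSpec(5.00, new int[] {10}, new double[] {2.00, 1.50});
        System.out.println(residential.chargeFor(14));  // 31.0
    }
}
```

A different customer category just gets a different RateSpec instance (say, more tiers or a higher base charge); the chargeFor algorithm never changes.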

Parametric polymorphism is a fancy name for using parameterized data to drive behavior. A bad alternative would be to create many classes of objects to accomplish the same thing. Imagine the nightmare of maintaining a different class for each type of rate and threshold combination! Fact is, though, there is a lot of rate information to maintain. No getting around that. Rather than being hardwired into a program, the rate and threshold specs can be maintained by non-programmers. Even better.

More generically, decision tables can represent information compactly that can be used to drive complex computations. Drools.org offers open source frameworks that support integrating decision tables with Java. If you’ve tried these on a project, I’d be interested in hearing about your experiences.
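For a sense of what a decision table boils down to, here’s a tiny plain-Java sketch (this is not the Drools API, and the minimum-charge figures are invented). Each row pairs a condition with an outcome, and the first matching row wins:

```java
import java.util.List;
import java.util.function.Predicate;

// A decision table reduced to its essentials: rows of (condition, outcome).
// Here the outcome is the minimum charge from the billing example, keyed by
// customer type and meter size.
public class MinimumChargeTable {
    record Account(String customerType, int meterSize) {}
    record Row(Predicate<Account> condition, double minimumCharge) {}

    private final List<Row> rows = List.of(
            new Row(a -> a.customerType().equals("residential") && a.meterSize() <= 1, 8.50),
            new Row(a -> a.customerType().equals("residential"), 12.00),
            new Row(a -> a.customerType().equals("commercial") && a.meterSize() <= 2, 25.00),
            new Row(a -> true, 40.00)  // default row: everything else
    );

    double minimumChargeFor(Account account) {
        // Rows are evaluated top-down, just as a decision table is read.
        return rows.stream()
                .filter(row -> row.condition().test(account))
                .findFirst()
                .map(Row::minimumCharge)
                .orElseThrow();
    }

    public static void main(String[] args) {
        MinimumChargeTable table = new MinimumChargeTable();
        System.out.println(table.minimumChargeFor(new Account("residential", 1)));  // 8.5
        System.out.println(table.minimumChargeFor(new Account("industrial", 6)));   // 40.0
    }
}
```

In practice the rows would come from a spreadsheet or database rather than being hard-coded, which is exactly what makes them maintainable by non-programmers.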

The key is to effectively exploit objects and data to drive flexible, scalable solutions. If you have really quirky computations and rules that can’t be tamed and codified in a simple manner, you can still exploit these techniques. But you will likely have to untangle and codify several decisions in order to drive several different computations. No one said life is easy. But effectively encoding decisions and variations is key to building flexible, scalable solutions that don’t have to be solved by brute-force, ugly, rigid code.