This is Google's cache of http://stackoverflow.com/questions/8409822/complete-solutions-to-the-expression-problem. It is a snapshot of the page as it appeared on 4 Feb 2012 14:19:26 GMT. The current page could have changed in the meantime. Learn more

Text-only version
 
oop - Complete solutions to the "Expression Problem‍​​"? - Stack Overflow
up vote 20 down vote favorite
5
share [g+] share [fb]

The Expression Problem is a fundamental issue for achieving extensible programming, especially distributed and large-scale development, but it also applies in the small. Thus a complete solution is very important for the progress of computer science.

There is a related prior question, What is the 'expression problem'?.

I am asking for complete solutions to the Expression Problem (Wadler's definition), as equivalently reformulated below.

The Expression Problem requires the capability to add a subtype (case) that interopts with preexisting function(s), and add a function that operates on both the preexisting subtype and the new subtype. This must be possible without (modifying nor) recompiling the preexisting subtype(s) and function(s), a.k.a. “separate compilation” or “independent extension”. Since static type safety must be retained, there must be no default exceptions, and objects of the extended type must be substitutable for the preexisting non-extended type.

Note: I am adding the vice versa requirement (objects of the preexisting non-extended type must be substitutable for the extended type), because otherwise the extension is not interoperable with the API it is extending.

The Expression Problem doesn't exist in dynamic (a.k.a. statically unityped) languages, because they have only one static type.

I hope the best answer(s) may offer some explicit or implied insight into the tension between the tradeoff of subtyping vs. functional (de)composition.


Dynamic languages (are off-topic)

I not asking about dynamically (untyped) languages. I don't want a subjective debate about the relative merits of dynamic vs. static typing. Perhaps the declarative vs. imperative tradeoff is as important and controversial.

I can state objectively that not having the Expression Problem doesn't necessarily make dynamically typed languages more extensible, it just means the invariants of extension are not checked by the compiler. So in that sense, dynamic languages never solve the Expression Problem.

I can also state objectively that new errors due to unchecked data type assumptions can manifest both in the preexisting code, as well as the new code. Thus unit testing can't catch all the errors (but to be fair, neither will static typing).

The only 'untyped solution' I know of is an oxymoron-- to limit extension to what is anticipated, i.e. ad-hoc, simulated typing at runtime. Or one could have started with a sound static type system. pay me now or pay me later may apply, but your mileage may vary.

link|improve this question
@StefanKendall Afaik, nails galore of boilerplate tsuris to smash away (including at the language design layer, see my answer), Adapter pattern, Decorator pattern, Visitor pattern, extending an Interpreter pattern, … or did I not get your point? Afaik, the issue is fundamental and pervasive. – Shelby Moore III Dec 7 '11 at 3:34
@StefanKendall here is an example of extinguishing boilerplate with multimethods language solution to the Expression Problem. – Shelby Moore III Dec 7 '11 at 4:22
2  
Putting a WORD JOINER between 'P' and 'r'. Clever. – Prateek Dec 7 '11 at 4:49
@Prateek who provided that edit? I originally had 'Rroblem' in the title :) – Shelby Moore III Dec 7 '11 at 5:16
It was edited by cobbal, as you can see in the revision history. – Prateek Dec 7 '11 at 5:53
show 6 more comments
feedback

closed as not constructive by Andrew Barber, C. A. McCann, sclv, 一二三, Graviton Dec 10 '11 at 8:45

This question is not a good fit to our Q&A format. We expect answers to generally involve facts, references, or specific expertise; this question will likely solicit opinion, debate, arguments, polling, or extended discussion. See the FAQ.

3 Answers

Let's build from partial solutions using the Scala language, then end with Haskell.

Add Function but not Case

Given a preexisting supertype Interface and subtype Case, with a function.

/*sealed */trait Interface
case class Case extends Interface

object function {
   def apply : Interface => … = { case Case … case _ => error }
}

Without modifying the preexisting code above, below a function2 can be added to the preexisting subtype Case (and potentially to a new subtype Case2).

But a new subtype Case2 can't be added to the pattern matching of the preexisting function above. Thus a runtime typing error for the default case in function above would occur if we create the new subtype Case2 below (thus violating the static type requirement of the Expression Problem).

The runtime typing error is eliminated by declaring above sealed trait Interface, which would generate a compiler error if the case class Case2 extends Interface declaration is attempted below.

// case class Case2 extends Interface

object function2 {
   def apply : Interface => … = { case Case … /*case Case2 … */case _ => error }
}

Add Case but not Function

Given a preexisting supertype Interface and subtype Case, with a function.

trait Interface {
   def function : …
}

class Case extends Interface {
   override def function : …
}

Without modifying the preexisting code above, below subtyping can add a new subtype Case2 to the preexisting supertype Interface, and it can add a new Interface2 and function2 to the new subtype Case2 but not to the preexisting subtype Case above.

trait Interface2 {
   def function2 : …
}

class Case2 extends Interface with Interface2 {
   override def function : …
   override def function2 : …
}

Immutable Partial Solution

Given the preexisting subtype Case is an immutable algebraic type.

class Case( val field ) extends Interface {
   override def function : …
}

Without modifying the preexisting code above, below a conversion to a subtype of Case can add function2.

class CaseWithInterface2( field ) extends Case( field ) with Interface2 {
   override def function2 : …
}

implicit def convert : Case => CaseWithInterface2
   = (x) => new CaseWithInterface2( x.field )

Any function that is going to call function2 must be supplied an argument that is a subtype of an Interface with Interface2, or a Case which can be implicitly converted.

With some compiler and VM tricks, it might not even be necessary to create a copy of the fields to view a Case as a CaseWithInterface2. I have had some private discussions with Andrey Breslav about this. Basically the vtable needs to swappable and orthogonal to the data fields of the class.

However, there is no surjective mapping (i.e. see the error case below) from Interface to Interface with Interface2, because in order to map it, Interface will need to be decomposed at runtime into its infinite possible runtime concrete implementations.

implicit def convert : Interface => Interface with Interface2 =
{
   case x : Case => x
   case x : Case2 => x
   case _ => error
}

Mutable Partial Solution

Note that a mutable Case couldn't be copied to a subtype CaseWithInterface2 because the mutable state may not be referentially transparent (e.g. open file handle, closure in stored function, etc). Instead it would be wrapped in an adapter pattern class, which has the performance drawback that all methods must be doubly-dispatched. And any inheritance tree of Case is hidden in the wrapper.

As with the immutable solution, it is possible to subtype every type in the inheritance tree below and including the type that is extended with a new method (e.g. trait Exp[1]). Scala facilitates this pattern[1].

But the problem remains of how to adapt a runtime object of the preexisting type Case to the extended subtype CaseWithInterface2. Scala's implicit views don't help, because the adaption problem remains, regardless that views are also not a bijective contract.

The issue of a default case in new methods, applies because for example an Interface might not have a conversion defined for the concrete class of the object referenced at runtime, as shown in the previous section.

One strategy is that every function that wants to interopt can input the union type, e.g. of Interface ∨ (Interface with Interface2) (or the boxed Either[Interface,Interface with Interface2]), and provide a legacy functionality for Interface or return None (where None is analogous to a default case).

There is simply no way of avoiding that if there is virtual inheritance, and the family of possible subtypes is open, then there is no way to add functions that won't have a default case, and interopt fully. You will see this rule holds true for Haskell in the next section, as it gives up virtual inheritance in order to get global compile-time detection (as a compile error) of these default cases.

[1] Independently Extensible Solutions to the Expression Problem, Zenger, Odersky, section 3.3 Operation Extension.

Typeclass Solution Can't Virtually Inherit

Haskell's class (a.k.a. typeclass) and data are roughly analogous to Java's interface and class respectively. A typeclass declaration can inherit from one or more typeclass(es), e.g. (Super1 x, Super2 x) => class Sub x.

The data type specifies the immutable field(s), if any, for each named constructor and optionally includes multiple constructors. The name of each constructor is analogous to a distinct Java class. Thus data type doesn't define method functions. The method signatures are declared in the typeclass and an instance attaches a typeclass and provides the implementation for a data type.

class Interface x where
   function :: x -> …

data Case = Case field

instance Interface Case where
   function (Case field) …

Without modifying the preexisting code above, below employing an instance declaration, a new function2 can be added to a new Case2 and preexisting Case data types, declaring each to be a subtype of a new typeclass Interface2.

class (Interface x) => class Interface2 x where
   function2 :: x -> …

instance Interface2 Case where
   function2 (Case field) …

data Case2 = Case2 -- constructor with no field

instance Interface2 Case2 where
   function Case2 …
   function2 Case2 …

The new Interface2 has been inserted into the inheritance hierarchy between the preexisting supertype Interface and subtype Case, yet this typeclass pattern[2] allows for separate compilation (i.e. doesn't require recompilation of the preexisting Interface and Case), because in Haskell an interface isn't attached to the runtime object (of a data type). There is no runtime VTable dispatch and all function calls are statically resolved at compile-time.

This flexibility is available because Haskell can't do virtual inheritance.

Thus, lists elements cannot be heterogeneously subtyped, without complex constructions such as the non-extensible existential types, convoluted HList[3], or union types.

Also, for example an Ord(ering) of Int can only be done one way . An OrdDivisible can't be inserted in the inheritance hierarchy between a supertype Ord and subtype Int (to provide an alternative implementation of the methods of Ord). Thus an Int can't be an argument for a function that expects a parameter of type OrdDivisible. This breaks compositionality, which is afaik one possible reason some functions in the Haskell Prelude have boilerplate (multiple definitions) to handle specific types.


A typeclass declaration can inherit from one or more typeclass(es), e.g. (Super1 x, Super2 x) => class Sub x.

This is a compile-time constraint on the type parameter x, not runtime virtual inheritance, i.e there is no (cf. such as a ) VTable.

(Tangentially, where the => constrains a type parameter of a data type, it is roughly analogous to Scala's >: upper bound.)

To achieve this, the multiple inheritance of interface must flatten to a single implementation and not even single-inheritance of implementations.

In addition to providing the benefit of separate compilation of extension, this eliminates (compile-time and runtime) ambiguity caused by overloading, thus facilitating global type inference and the performance benefit of eliminating invokevirtual (respectively).


And a significant violation of single-point-of-truth in Haskell, is type parameter bounds for subtypes must match those of the supertype interface, i.e. the type parameter of the subtype may not have a more restrictive bound[4].

[2] Software Extension and Integration with Type Classes, Lämmel, Ostermann, section 2.4 The expression problem.

[3] Software Extension and Integration with Type Classes, Lämmel, Ostermann, section 4.4 Explicit vs. implicit subtyping.

[4] Generics of a Higher Kind, Moors, Piessens, Odersky, section 7.3 Exceeding type classes.

link|improve this answer
Notice in the Mutable Partial Solutions section, I added a tradeoff rule that says all possible solutions that bijectively interopt, must either have default cases (runtime exception for missing case) or no virtual inheritance (a compile-time error for missing case). Thus I am correcting Odersky's claim that his solution doesn't have default cases, because it does when it is used with bijective interoption. Multmethods bijectively interopt, that is why they have a default case. See also my comments to michid under my answer. – Shelby Moore III Dec 8 '11 at 8:35
In the Mutable Partial Solutions section, I suggest the alternative coding style to use a union type to lift the match default case out of the function to the input parameter type, where the supported types are thus enumerated and type checked by the compiler. This is analogous to first-class concept in Extensible Programming with First-Class Cases. Thus, my tradeoff rule holds true for that research paper. None of this eliminates the "default cases". It just pushes them to different places. – Shelby Moore III Dec 8 '11 at 12:37
feedback

Did you have a look at Wouter Swierstra's Data types à la carte (J. Funct. Program. 18(4): 423-436, 2008) for a solution in Haskell? From the abstract:

This paper describes a technique for assembling both data types and functions from isolated individual components. We also explore how the same technology can be used to combine free monads and, as a result, structure Haskell’s monolithic IO monad.

link|improve this answer
I had read it isn't symmetric. Also I linked the union in my answer to more comments from Oleq. – Shelby Moore III Dec 7 '11 at 11:54
1  
Please see my comments to michid underneath my question. – Shelby Moore III Dec 8 '11 at 1:59
feedback
up vote -2 down vote accepted

The scientific method requires that if I state a new rule, I must try to break it.

In this respect, invalid answers can be a valid activity. This answer demonstrates why my "tradeoff rule" is likely correct.

I started to write this answer, thinking I had discovered a partial exception to the "tradeoff rule" that I provided in my first answer. But if you follow this to the end, you will see I did not.

Scala can combine a Haskell-like typeclass, with virtual inheritance … well I thought almost. Start from the “Add Case but not Function” of my first answer.

class Case2 extends Interface {
   override def function : …
}

trait Interface2[T <: Interface] {  // typeclass
   def function2 : T -> …
}

implicit object `Interface2Of(Case)` extends Interface2[Case] {
   def function2 : Case -> …  // no 'override' because the implicit
}                             // object is resolved at compile-time

implicit object `Interface2Of(Case2)` extends Interface2[Case2] {
   def function2 : Case2 -> …
}

// Example use-site function, can input Case or Case2 (*plonk*!)
def test[T <: Interface](param: T)(implicit tc: Interface2[T]) =
   param.function( … ; tc.function2(param …

So this means there are no ”default cases”[1], because they are checked for at compile-time and generate compile-time errors. The caller of test must provide the Interface2[T] argument explicitly or in the implicit scope. See section 21.2 Rules for implicits of Programming in Scala for the scoping rules.

[1] The term "default case(s)" means code path(s) for which we don't have the function2 available.

So what is the caveat? Note that trait Interface2[T <: Interface] cannot be trait Interface2[+T <: Interface], because T is used in the contravariant position, i.e. the input parameter of function2.

Thus, if for example SubCase is a subtype of Case, then an implicit object must be declared for it, even if it inherits the same function2 implementation. So in that narrow sense of needing to write more boilerplate, virtual inheritance is lost. But virtual inheritance is retained in the important ways, e.g. heterogeneous lists of Interface (see my first answer).

This appears to demonstrate a unique power of Scala. The virtual inheritance tradeoff initially appears to be orthogonal and isolated to the typeclass axis, i.e. the selection of an alternate dictionary (a.k.a. vtable) only for the extended portion of the interface.

But I missed something essential, *plonk*! We may have no way to supply an Interface argument to the test function, because there might be no way to define an implicit object for Interface, because we may need to know the concrete class in order to implement function2.

Thus my "tradeoff rule" remains true. Virtual inheritance would be lost in important ways, e.g. can't apply function2 to a heterogeneous list of Interface.

Tangentially, there is an optional syntactical sugar for writing the use-site function, although I am not unequivocally convinced it is more readable.

def test[T <: Interface : Interface2](param: T) =
   param.function( … ; implicitly[Interface2[T]].function2(param …
link|improve this answer
This also demonstrates why checked variance annotations of Scala are a very important language feature, otherwise the type system is not sound, because without enforcing the invariance it could have let me do virtual inheritance where it would break at runtime. – Shelby Moore III Dec 8 '11 at 23:06
feedback

Not the answer you're looking for? Browse other questions tagged or ask your own question.