Copute is open source and designed to be the next mainstream programming language.

In addition to being very simple to grasp, the next popular language must remove the limitations on composition of modules, i.e. grouping parts of programs together to make new programs, so there can be a maximum exchange and reuse of effort, because the internet has removed the barriers to interaction. Reed's Law states that paradigms which allow 2n groupings (e.g. where n is number of programming modules), grow at 2n-n-1, which is exponentially faster than the transaction model at n(n-1)/2. Facebook's overtake of MySpace is an example.

Every decade there has been a new mainstream computer language with higher-level paradigms.

1960sbit-levelphysical hole punched in a paper card for each binary bit
1970sassembly languagenames instead of numbers (for memory addresses and instructions)
1980sCfunction, loop, variable, data type structure, implicit stack, scoping, typing, pointers
1990sC++object-oriented programming (OOP)
2000sJavagarbage collection (GC), i.e. removed pointers
2000sJavaScriptfunctional programming (FP), closures, removed typing
2010sPython?untyped mixins, untyped multi-paradigm (OOP, FP, closures), conflated rendering and syntax
2010sScala?unifies typing with optional Dynamic (i.e. un-)typed, typed mixins, typed multi-paradigm, higher-kinded, implicit type classes, pattern matching, type inference
2010sCopute?improves Scala with pure FP, explicit type classes, single-point-of-truth (SPOT), unconflates interface and implementation, simpler syntax, no exceptions, might add Typestate

There is a detailed survey of the comparative strengths and weaknesses of the well known programming languages. And a summary blog post.

+ Higher-Level Paradigms Chart

+ Achieving Reed's Law

The fork between typed (Java, C++) and untyped (JavaScript, Python, Perl, Ruby, PHP, etc.) languages in the 2000s was necessary to fulfill the exponential increase in demand for customized and mainstream programming driven by the 2n opportunities for interaction on the internet. The Java typing system was not sufficiently advanced[17] to handle this multifarious programming within the tradeoff of the 80/20 rule for both productivity and learning curve, so the only way to meet the demand for rapid and flexible design, was to forgo typing with an ad-hoc smorgasborg bandaid of unit tests, runtime exceptions, refactoring, and untyped composability programming paradigms, e.g. Ruby on Rails.

Those skimpy beelines expediently solve specific problems, but analogous to the Leaning Tower of Pisa with only 3 meters of foundation in a weak soil, they do not scale with composition, because:

Thus to scale to Reed's Law, there needs to be a sufficiently advanced typed language with purity enforcement.

+ What Would I Gain?

The following expected gains should increase productivity in the "conceptual structures of great complexity".[2]

+ Skeptical?

OOP promised some similar gains, but languages have not been sufficiently advanced to deliver on all the facets.

Dynamically (i.e. un-)typed languages have been widely thought to be a "breath of fresh air", relieving the tsuris and limitations of typing complexity, thus providing the flexibility needed to solve specific problems, and with less programmer man-hours. But only up until the required complexity, or runtime speed, of a project outstrips the weaknesses of an untyped language.

There are many reasons that a correctly designed typed language can actually achieve these gains and exceed the productivity of an untyped language.

+ Expression Problem

DEFINITION To solve the Expression Problem, a language must allow new functionality to typesafely interopt with, and without forcing recompilation nor modification of, preexisting code.

+ Untyped Languages Don't Scale

Modules in untyped languages are easy to extend without recompilation, because they place no compile time constraints on data types. At runtime these unlimited types interact in unlimited ways, thus the implicit assumptions in preexisting and new code are unlimited, i.e. nothing is local and everything is a potential externality. Thus, when new code is run, new errors due to data type assumptions can manifest both in the preexisting code, as well as the new code. In other words, untyped languages initially feel great for speed of writing new code, but they can become error-proliferation Rigor mortis as the complexity of runtime extension interactions increase. This increase is exponential (Reed's Law again), i.e. complexity blows up. Thus the effort that must applied increases exponentially, eventually falling behind the productivity that would be possible with a correctly designed typed language. The only untyped solution is to limit extension to what is anticipated, i.e. ad-hoc, simulated typing at runtime. Thus untyped languages are a beeline for solving specific problems fast, but they don't compose generally and are not generally extensible.

The Expression Problem is only solved in a typed language that provides subtyping and/or functions that can fork (e.g. overload) on data types.

  • New code that adds a subtype (e.g. Corvette) will interopt typesafely with preexisting code that operates on the supertype (e.g. Car), e.g. because a Corvette is also an Car.
  • New code that adds a function can operate on any data type, whether it be preexisting or new. To fork on data types, functions may be overloaded (i.e. functions of same name, each inputting a different data type), and/or employ pattern matching (e.g. Scala's match with case classes) within a function.

Subtyping is optimum where there is a profileration of data type specialization with unchanging set of functions (methods) that operate on data types; whereas, function forking is optimum where there is a profileration of function specialization that operate on an unchanging set of data types[3]. Extension can be a mix of both methodologies, because any types that functions operate on can be subtyped.

+ Separation-of-concerns and SPOT

Unlike Scala, Java, C++, Haskell, ML, etc., Copute does not conflate interface and implementation.

In typed OOP languages, preexisting code that interopts with the new functionality in a subtype, does so via either virtual methods and/or interfaces. Virtual methods enable the preexisting code to refer at compile-time to a supertype that has an implementation, then the subtype's implementation is invoked opaquely at runtime. This runtime override causes semantic override failure of typesafety. Referencing implementation is inherently non-extensible and Scala's type system is overly complicated because of it.

Whereas, interfaces have no default implementation to override, thus neither semantic nor inheritance implementation dependencies. Thus Copute only allows referencing interface and never implementation (i.e. class or mixin), which results in a simpler type system than Scala, while being more extensible, more typesafe, and eliminating that major genre of semantic errors.

Referencing implementation violates single-point-of-truth (a/k/a SPOT) and separation-of-concerns, because the implementation semantics are hard-wired (i.e. opaque) instead of substitutable (i.e. transparent). Allowing only interface references, is referential transparency applied to subtying. This enables general composition in the subtyping dimension, e.g. inversion-of-control, the "D" in the SOLID principle, or the Hollywood principle. See also Gilad Bracha's blog "Constructors Considered Harmful".

+ Specifics

Sometimes functions are only operating on the constructor parameters of a datatype, or even a subset of them, and not any additional methods (i.e. interface) the datatype may implement. If a function inputs the datatype, it can only input subtypes which implement all the constructor parameters and additional methods. In some languages (e.g. Haskell's instance), methods are declared orthogonally to the datatype and its constructor(s), so there is not a conceptual problem for the case where the function doesn't operate on a subset of the constructor parameters. Scala and Copute do optionally allow methods to be conflated with the datatype constructor declaration, for the benefit of conciseness and clarity. In any case, in addition to the subset of constructor parameters problem, there would also be a diamond problem with datatype multiple inheritance, because the order of constructor operations is generally not well formed (which is why mixin of Scala and Copute doesn't have constructors). In other words, if a function inputs a datatype, then another datatype could not substitute (i.e. be a subtype) for it and simultaneously other datatype(s) in general multiple inheritance. The only generalized solution is to require functions to access constructor parameters through interfaces.

Thus, Copute never allows the parameters of a class datatype constructor to be accessed by value, instead only via methods of an interface. Some boilerplate for this is avoided, contingent on module access control (e.g. private). In the case that the constructor parameter name matches the only method name of its interface type, the interface is implicitly a supertype of the class constructed and the method is implicitly generated. Another interface is implicitly created as a supertype of the class, using the name of the class prefixed with the letter C, which is a subtype of any those implicit interfaces, and contains methods (of the same name) that return the values of any remaining constructor parameters, and the methods are implicitly generated. Note this interface does not include any other interfaces the class may implement, but there is another interface implicitly created as a supertype of the class, using the same name of the class, which is a subtype of all interfaces of the class. The interface is distinguished from the class by context, i.e. in the inheritance tree a class has constructor arguments.

Copute supports the syntactical sugar pattern matching extractors (a la Scala) on any interface where the parameters are the outputs of the methods (which have no input) in the order they appear in the interface. The automatically generated extractor function, for the aforementioned interface prefixed with the letter C, has the same name as the class, even though it accesses the aforementioned interface prefixed with the letter C, because extractor functions can be overloaded on number of parameters, and _ can be used for the ignored method values, to force the extractor to access the interface with the same name as the class, i.e. without the prefixed letter C. Or id : interface can be used to require the match of desired interface.

Prefix the interface type with `, in a function signature, to force Copute to compile where only the methods of its supertype are accessed, i.e. it is intentional to make the function compose less generally than the current function implementation. Copute also performs this check for pattern match cases, and does not allow the ` override, since it could cause the aformentioned function signature check to be less general than the current implementation.

Interfaces can only contain methods, and methods (i.e. functions) can only input and output interface and function types. The class (i.e. constructable datatype) and mixin can only be referenced in the inheritance hierarchy of a class, and mixin in mixin. A class can be substituted for its supertype interface when calling a function that inputs the interface, but the function is never allowed to know which class it is operating on.

Thus only methods are referenced when referencing the name of a class in Copute, so the implementation can always be substituted, i.e. the implementation details are not conflated with its use, which is another way of saying SPOT and not the conflation of interface and implementation.

+ Examples

In the following Haskell example if a function inputs a constructed datatype RGB instead of an interface IColor, it can no longer be called (i.e. composed) with Midpoint. Even if Midpoint was a subtype of RGB, it couldn't generally multiply inherit.

class IColor c where
  red :: c a -> a
  green :: c a -> a
  blue :: c a -> a
data Color a = RGB {r, g, b :: a}
instance IColor Color where
  red (RGB r _ _) = r
  green (RGB _ g _) = g
  blue (RGB _ _ b) = b
data RGB a => Midpoint a = MidPt a a
instance IColor Midpoint where
  red (MidPt (RGB r1 _ _) (RGB r2 _ _)) = (r1 + r2) / 2
  green (MidPt (RGB _ g1 _) (RGB _ g2 _)) = (g1 + g2) / 2
  blue (MidPt (RGB _ _ b1) (RGB _ _ b2)) = (b1 + b2) / 2

In the following example a function is required to input an interface IRed, because otherwise MidpointRed can not be a subtype of RGB.

class IRed c where
  red :: c a -> a
class IGreen c where
  green :: c a -> a
class IBlue c where a
  blue :: c a -> a
class (IRed c, IGreen c, IBlue c) => IColor c
data Color a = RGB {r, g, b :: a}
instance IColor Color where
  red (RGB r _ _) = r
  green (RGB _ g _) = g
  blue (RGB _ _ b) = b
data RGB a => MidpointRed a = MidPtRed a a
instance IRed MidpointRed where
  red (MidPtRed (RGB r1 _ _) (RGB r2 _ _)) = (r1 + r2) / 2
+ Ad-hoc Polymorphism

The description of Haskell's typing system is complex, but the conclusions with respect to the Expression Problem are not. Type construction (i.e. the constructor type name and any constructor argument values as type members) is orthogonal to type methods (i.e. interface) in Haskell. Thus any preexisting constructable type can through extension be declared to implicitly inherit from any interface, but from the perspective of the interface these are just new subtypes. So in Haskell to be extensible and solve the Expression Problem, functions should input interfaces instead of constructable types, but this is not enforced, so interface and implementation (i.e. constructable types) can be conflated[4].

So in terms of the Expression Problem, Haskell is nearly equivalent in power and flexibility to a subtyped OOP program which only references interfaces and never implementation (i.e. class or mixin), except that the use of pattern matching on function inputs in Haskell is not extensible, because Haskell does pattern matches only on constructable types and not interfaces[4]. And a significant violation of SPOT in Haskell, is type parameter bounds for subtypes must match those of the supertype interface, i.e. the type parameter of the subtype may not have a more restrictive bound[6].

Note however that a subtyped OOP program which references implementation, is not as extensible as Haskell, because then preexisting data types can not be given new interfaces. If implementation is never referenced, then data types that may contain implementation (e.g. class or mixin) can be given new interfaces in terms of the Expression Problem, via multiple inheritance of the data type and the new interface in a new data type, because the supertype data type was never referenced. Thus Scala's implicit conversion functions[5][7] defeat subtyping (as a solution to the Expression Problem) because they reference implementation by always return the same constructable data type (which is the implicit substitution for the input type), for no other gain in flexibility and expressive power (when Copute's static methods for interfaces are available to emulate Haskell's typeclass interfaces, see also), as compared to an OOP program that does not reference implementation.

+ Purity

DEFINITION A pure function is one that generates no side-effects outside itself, and always outputs the same value that it first output for given input, i.e. a pure function call can be replaced by its cached value. This is also known as "referential transparency".

All uses of the word "pure" in this document and all official Copute literature will have this meaning. Note that some external literature uses the word "pure" to indicate that a language's operational semantics only allows one evaluation strategy.

+ Benefits

+ Less Runtime Errors

A typed language that solves the Expression Problem provides for compile-time composition, but does not guarantee that the composition of new code and preexisting code will not have runtime errors. One reason is because without purity, the new and preexisting impure functions may have conflicting dynamic dependencies (i.e. not revealed at compile-time) on data (i.e. state) outside the functions. With or without purity, there may also be dynamic dependencies due to inadequate fitness of the denotational semantics to the desired program behavior, but that is a separate matter (see also).

Due to the Halting Problem, i.e. because the language is Turing complete, it is mathematically impossible to enumerate these dependencies at compile-time. The extent of these conflicts is never known until the program is run forever to test for errors on all possible permutations of infinite sequences of inputs, outputs, and new code extensions.

Fortunately, with purity, the pure functions will always compose without referential dependency runtime errors (in spite of the language remaining Turing complete). Any remaining runtime errors will be due to inadequacy of the denotational semantics. Thus with purity, runtime testing is focused on the fitness of the program's denotational semantics to the desired program semantics, and the compiler eliminates the need to runtime test for the mundane typing and referential correctness.

+ Functional Composition

Also the immutability requirements of purity encourages the design of functional programs which are semantically more compositional[11], i.e. that have more degrees-of-freedom and parallelism. Pure functional programming is always declarative, and eliminates the need for imperative control statements, e.g. for, while, break, and continue, as well as mutable data operators, e.g. ++, --, +=, -=, etc..

+ Real World Programs

DEFINITION "Real world" programs have some memory, i.e. state. Examples of state include modes (e.g. insert or overwrite mode for the cursor of a text editor), output on the screen or printed page, and the cache of web page elements that enables your browser to run faster.

A program which contains only pure functions, has no state, because each pure function returns an output that depends only on its input, and causes no external side-effects. A function is not pure if it, or a function it calls, stores a value in memory that persists beyond the return of the function (even if the function never accesses it again), except in the case that this value is the returned output.

Real world programs require the impurity of stored state in the top-level functions. To compose these impure functions with pure functions whose purity is enforced by the compiler, the language must be at least minimally typed to the extent that functions are declared whether they are pure. This is accomplished in Haskell with for example the IO monad type, where the abstract type World is an input and output of impure IO action functions-- evidently impure because these action functions require an order-of-execution with respect to each other. Thus the Haskell functions that employ IO are impure, imperative and not pure, declarative.

In Copute, all functions are pure by-default, and only methods of an impure object singleton can be so declared to be impure. The purity is attached to the function's type, not to the types of the function's input and output; whereas, in Haskell the programmer must memorize which types are impure and be aware of hidden type inference. Typically programs will be compositions of top-level impure functions which store outputs from hierarchies of pure functions. In theory, although this is not yet implemented, Copute could allow that pure functions may also call impure functions in cases where the side-effects are localized within the pure function, e.g. where the impure function modifies an instance that was created within the pure function and if referenced externally, then only as the returned output.

+ Pure Functional Reactive
DEFINITION Pure functional reactive is the interaction of pure functions with state. Pure functions interact with state declaratively, meaning they react to state on input, by returning output. Pure functions can not set or change any state-- they can only return outputs.
+ Declarative vs. Imperative
DEFINITION Declarative means the code does not express an order-of-execution, and there is no memorized state because a pure function will always return the same result for the same inputs.

All pure functional programming (not just "reactive") is always declarative. An impure or pure function is a state transition from the input to the output. For compositions of pure functions, the order-of-execution of these state transitions is independent, i.e. the state transition of each function call is independent of the others, due to lack of side-effects and the replacement principle for purity. To correct a popular misconception, pure monadic composition is always declarative, in spite of the fact that Haskell's IO monad is impure and imperative.

Note that no Turing complete language (i.e. that allows recursion) is perfectly declarative, but at least the imperative side-effects of evaluation order only cause indeterminism in (i.e are categorically bounded to) the memory consumption, execution time, latency, and non-termination domains (see also).

The real world is composed of values which change in time, but a pure functional reactive program inputs real world values and declares its output, independent of the order of time. For example, a pure reactive popup menu function which has a return type of Maybe[Menu], returns a popup menu when the input mouse position is within the input onMouseOver rectangle, and returns the type None otherwise. This pure function does not depend on any state other than its input, and thus it can be generally composed with any other pure function. Contrast this with the imperative event model, in which the event callback handler function can not be generally composed, because the onMouseOver rectangle is stored state in the event callback generation module.

So the pure functional reactive model is just the pure functional model-- which is to input the global state at the top-level pure function. So the memory (e.g. that a menu is already popped up) is the singleton that stores the output of the top-level pure function. This memory is input to the top-level pure function again when some value in the global state changes externally.

+ Coinductive World
DEFINITION The world is a coinductive type, because its generative structure isn't known (i.e. we can't construct the world), and can only be observed (see also). Also note that the model of the world as a singleton with an observed substructure, and that in the category of sets, the singleton set is the final (greatest) object.

A pure functional reactive program is a composition of pure functions that may or may not input the coinductive world type but don't modify it, i.e. they don't call the observation constructor, Comonad[T] -> T. Thus the comonad method, Comonad[T] -> (Comonad[T] -> A) -> Comonad[A], enables the composition of both the impure and pure functions of a functional reactive program (see also).

But what about comparative efficiency to an imperative program? In an imperative program, the event generation module has a list of onMouseOver rectangles, and can optimize the search through these to determine which event handler callbacks to execute. With the comonad, the top-level imperative function periodically samples a new observation T, by executing Comonad[T] -> T. It then can pass the observation, into the pure function hierarchy, so each affected function in the hierarchy may test for intersection with its own onMouseOver rectangle. This is inefficient because functions are called which may not even be waiting for an onMouseOver event, and global optimizations are not available, such as subdividing the screen using a data structure tree of rectangles to optimize the search for which functions to call. There are global optimization solutions which are pure and thus declarative. For example, the functions waiting on a particular type of event can return it and these can be collected by each caller function in the hierarchy and stored in the top-level output which is stored state which is an input on the next call of the function, and then the caller function does the global optimization for the callee function hierarchy. The pure functions remain compositional, because the current state is just an input.

+ Eager vs. Lazy

DEFINITION Even with purity, no Turing complete language (i.e. that allows recursion) is perfectly declarative, because it must choose an evaluation strategy. Evaluation strategy is the relative runtime evaluation order between functions and their arguments. The evaluation strategy of functions can be strict or total, which is the same as eager or lazy respectively, because all expressions are functions. Eager means argument expressions are evaluated before their function is; whereas, lazy means argument expressions are only evaluated (once) at the runtime moment of their first use in the function.

The evaluation strategy determines a performance, determinism, debugging, and operational semantics tradeoff. For pure programs, it does not alter the denotational semantics result, because with purity, the imperative side-effects of evaluation order only cause indeterminism in (i.e are categorically bounded to) the memory consumption, execution time, latency, and non-termination domains.

Fundamentally all expressions are (composition of) functions, e.g. constants are pure functions without inputs, unary operators are pure functions with one input, binary operators are pure functions with two inputs, constructors are functions, and even control statements (e.g. if, for, while) can be modeled with functions. The order that we evaluate these functions is not defined by the syntax, e.g. f( g() ) could eagerly evaluate g then f on g's result or it could evaluate f and only lazily evaluate g when its result is needed within f.

The former (eager) is call-by-value (CBV) and the latter (lazy) is call-by-name (CBN). CBV has a variant call-by-sharing, which is prevalent in modern OOP languages such as Java, Python, Ruby, etc., where impure functions implicitly input some mutable objects by-reference. CBN has a variant call-by-need (also CBN), where function arguments are only evaluated once (which is not the same as memoizing functions). Call-by-need is nearly always used instead of call-by-name, because it is exponentially faster. Typically both variants of CBN only appear with purity, because of the dissonance between the declared function hierarchy and the runtime order-of-evaluation.

Languages typically have a default evaluation strategy, and some have a syntax to optionally force a function to be evaluated in the non-default. Languages which are eager by default, usually evaluate the boolean conjunction (a/k/a "and", &&) and disjunction (a/k/a "or", ||) operators lazily, because the second operand isn't needed in half the cases, i.e. true || anything == true and false && anything == false.

+ Trade-offs

CBV and CBN are categorical duels[10] (see also), because they have reversed evaluation order, i.e. whether the outer or inner functions respectively are evaluated first. Imagine an upside-down tree, then CBV evaluates from function tree branch tips up the branch hierarchy to the top-level function trunk; whereas, CBN evaluates from the trunk down to the branch tips. CBV doesn't have conjunctive products ("and", a/k/a categorical "products") and CBN doesn't have disjunctive coproducts ("or", a/k/a categorical "sums")[9].

+ Non-termination

At compile-time, functions can't be guaranteed to terminate.

+ Eager

With CBV but not CBN, for the conjunction of Head "and" Tail, if either Head or Tail doesn't terminate, then respectively either List( Head(), Tail() ).tail == Tail() or List( Head(), Tail() ).head == Head() is not true because the left-side doesn't, and right-side does, terminate.

Whereas, with CBN both sides terminate. Thus CBV is too eager with conjunctive products, and non-terminates (including runtime exceptions) in those cases where it isn't necessary.

+ Lazy

With CBN but not CBV, for the disjunction of 1 "or" 2, if f doesn't terminate, then List( f ? 1 : 2, 3 ).tail == (f ? List( 1, 3 ) : List( 2, 3 )).tail is not true because the left-side does, and right-side doesn't, terminate.

Whereas, with CBV both sides non-terminate so the equality test is never reached. Thus CBN is too lazy with disjunctive coproducts, and in those cases non-terminates (including runtime exceptions) after doing more work than CPV would have.

+ Performance
+ Eager

As with non-termination, CBV is too eager with conjunctive functional composition, i.e. compositional control structure (essential for functional programming[11]) does unnecessary work that isn't done with CBN. For example, CBV eagerly and unnecessarily maps the entire list to booleans, when it is composed with a fold that terminates on the first true element.

This unnecessary work is the cause of the claimed "up to" an extra log n factor in the sequential time complexity of CBV versus CBN, both with pure functions. A solution is to use functors (e.g. lists) with lazy constructors (i.e. CBV with optional lazy products, meaning optional CBN), because with CBV the eagerness incorrectness originates from the inner function. This is because products are constructive types, i.e. inductive types with an initial algebra on an initial fixpoint[9] (see also).

+ Lazy

As with non-termination, CBN is too lazy with disjunctive functional composition, i.e. coinductive finality can occur later than necessary, resulting in both unnecessary work and non-determinism of the lateness, that isn't the case with CBV[9][10] (see also). Examples of finality are state, timing, non-termination, and runtime exceptions. These are imperative side-effects, but even in a pure declarative language (e.g. Haskell), there is state in the imperative IO monad (note: not all monads are imperative!), implicit in space allocation, and timing is state relative to the imperative real world (see also). Using CBN even with optional eager coproducts (i.e. CBV), leaks "laziness" into inner coproducts, because with CBN the laziness incorrectness originates from the outer function (see the aformentioned example in the Non-termination section, where == is an outer binary operator function). This is because coproducts are bounded by finality, i.e. coinductive types with a final algebra on an final object[9] (see also).

CBN causes indeterminism in the design and debugging of functions for latency and space, the debugging of which is probably beyond the capabilities of the majority of programmers, because of the dissonance between the declared function hierarchy and the runtime order-of-evaluation. CBN pure functions evaluated with CBV, could potentially introduce previously unseen non-termination at runtime. Conversely, CBV pure functions evaluated with CBN, could potentially introduce previously unseen space and latency indeterminism at runtime. Thus CBN should be restricted to the innermost functions where the conjunctive composition performance benefit (including deforestation[11]) outweights the loss of determinism in latency and space. And the parallelism point in the next paragraph, makes more compelling that CBN should not be the default. Thus perhaps only functions with significant conjunctive compositional sequential time complexity should have both a CBV and CBN version (this could even be a runtime selection within the function), which is possible with purity. The rest could be CBV. If this works well, CBV with optional lazy products (i.e. CBN) should be the default. And perhaps the CBN versions (or even evaluating the CBV version with CBN, possibly optimized and detected automatically by a deforestation algorithm) should be selected automatically by the compiler and runtime platform.

Sequential time complexity is the same as CBV (or faster than CBV without optional CBN). Perhaps parallel time complexity is worse in the cases where the extra work in the CBV (without optional CBN) case is necessary to parallelize the map that precedes a fold. However, the sequential time performance of laziness encourages functional composition that gives rise to the parallelism opportunities (e.g. map). Note parallellism is inherently non-deterministic, thus wasteful, because it requires some speculation about conjunctive control upstream, i.e. typically 4 cores is faster, but not twice as fast, than 2 cores.

+ Higher-Level

DEFINITION The higher the level of a programming language, the more concisely, clearly, precisely, and efficiently its syntax and type system expresses and checks (at compile-time) the semantics of the program. Semantics is the meaning of the program, i.e. the requirements or model of what the program does.

As noted in the prior section on purity, runtime testing of a pure program is necessary to check for those requirements that are not expressed, checked and enforced at compile-time by the denotational semantics, i.e. the models given by the language's type system and programmer-defined types.

+ Low-Level Automation

A few historical examples of the automation of tedious, laborious lower-level semantic details, which decreased programmer error and increased the conciseness, clarity, and precision of the higher-level semantics expressed in the programs:

+ Symbols

Assembly language introduced names for memory addresses, to automate the adjustment of the address (a number), when the number of instructions in the interleaving portion of the program changed thus offsetting that address.

+ Functions and Types

C language introduced functions and types (including variables and data structures), to automate the details of accomplishing these in assembly language using the registers in the CPU and memory. In the abstract conceptualization, the other introduced keywords (even loops) were really just functions.

+ Garbage Collection

Java language introduced garbage collection (GC), to automate the allocation and de-allocation of memory for storing objects.

+ Degrees-of-Freedom

DEFINITION The more degrees-of-freedom, then the more a system can adapt to cooperate and fit to a desired solution. Imagine a car without reverse. That would be one less degree-of-freedom. The car would have to go around the block, to go backwards. That is inefficient.

With low-level issues alleviated, increases in the degrees-of-freedom correspond to (i.e. eliminating barriers to) robust compositional expression of higher-level semantics.

+ Subtyping

OOP languages long ago introduced subtyping so programs were composable in that one dimension of the two-dimensional Expression Problem. However, effectiveness was limited by the conflation of implementation and interface, which Copute corrects.

+ Exceptions Eliminated

Copute is introducing the concept that subtyping on input, not just output, arguments can eliminate exceptions, which is desirable because exceptions break purity and composability. This makes Copute total, except for non-termination due to infinite recursion. For example, in a subtyped language, we can construct a supertype Signed, which contains a signed value, then we can reflect on its type (i.e. pattern matching) for values that are determined at runtime to be Negative or Unsigned. Functions which require a Negative or Unsigned value require their caller to fork from Signed at compile-time using pattern matching (because due to covariance a Signed can't be substituted for its subtypes, and can't directly construct Negative and Unsigned since their private constructors are only available to Signed) thus eliminating the runtime exception proactively on input type.

+ Parametric Types (Generics)

OOP languages long ago introduced generics (a/k/a type parametrization or parametric polymorphism), to automate the replication of those types and functions which apply to all, or a wide range of, types and functions. However, effectiveness was limited by lack of declaration site covariance annotations, type bounds and higher-order kinds (i.e. higher-kinds), which Scala corrects. Haskell 98 had most of these advancements too, with covariance annotations unnecessary due to the immutability of Haskell's enforced purity, and contravariant substitution is not allowed in Haskell.

+ Higher-Kinded

Haskell[4] and Scala introduced higher-kinded types, which are necessary to obtain separation-of-concerns and SPOT with generics, when the language has subtyping.

A generic (a/k/a parametric polymorphic) type which has a kind of 1, is constructed from other concrete types of kind 0, e.g. List[T] is a type that is a list of any concrete type T, so a List[String] is a list of strings. A higher-kinded type is constructed from types that have a kind higher than 0, e.g. a List[List[_]] is list of lists of any type.

+ Higher-Order Functions

Functional programming (FP) introduced high-order (i.e. first-class and partial application) functions, unifying functions with types, so programs were cross-composable simultaneously in both dimensions (subtyping and function overloading) of the Expression Problem. However, effectiveness was limited by the lack of enforced purity, which Haskell corrected and Copute brings to a more traditional syntax and subtyping model.

+ Higher-Level Unification

Category theory introduced models for higher-order (a/k/a higher-level) unification of types and functions, which enabled composition of functions, that input and output types, e.g. T -> A, with higher-order types that are parametrized on those types, e.g. Sub[_].

modelfunctionimplicitly lifted tocomposes withresults
FunctorT -> ASub[T] -> Sub[A]Sub[T]Sub[A]
ApplicativeT -> B -> C
T -> A, where
A = B -> C
Sub[T -> B -> C], i.e.
Sub[T] -> Sub[B] -> Sub[C]
Sub[T], Sub[B]Sub[C]
MonadT -> AT -> Sub[A]A -> Sub[B]T -> Sub[B]
ComonadT -> ASub[T] -> A (impure)
using Sub[T] -> T
Sub[T]A or
Sub[_] is State[S,_]
T -> AT -> Sub[A]A -> Sub[B]
(S,A) -> (S,B)
T -> Sub[B]
Sub[_] is any subtype of the model, e.g. List[_]

This automates the replication of functions which apply to all, or a wide range of, higher-order types. More concisely, functions which operate on types of kind n, are automated lifted and operate on types of a kind n+1. For example, a function which operates on strings, will automatically operate on a List[String], Tree[String], etc.. There is no need to write a distinct version of the function for each possible subtype of each possible model, Sub[_]. Thus, Expression Problem-like extension was unified in the subtyping, function overloading, and parametric typing (i.e. higher-order kind) domains (see also).

However, the effectiveness has been limited by the inability to explain and implement these models simply, clearly, and concisely enough in a traditional syntax to make them popular. To implement these models requires some capabilities of Haskell's typeclass. Scala offers implicit conversion functions (see also)[4][5][7] which violates definition-site SPOT and requires boilerplate at each use-site for functions such as sequence. Copute fixes this with explicit static methods for interfaces, which results in simple, clear, concise code.

The trend of the advances above fits a generative theory, where there is an increasing resonance (i.e. fitness) between the denotational semantics and the expression required by the program's semantics, by eliminating boundaries between the meaning of the terms in the denotational semantics, i.e. unification of concepts. The physics of work explains why an increase in degrees-of-freedom, and thus efficiency of work, results from higher-level unification.

+ Physics of Work
DEFINITION The well established physics equations for work, can be correlated to the software development process to understand that efficiency to obtain programs with the best fitness to the desired semantics, is (exponentially) proportional to the degrees-of-freedom present in the compositional expression of higher-level semantics.
+ Fitness
DEFINITION Fitness is how well a particular configuration of a system fits the desired solution, e.g. how well a particular program fits the desired semantics.

For example, there would be gaps (i.e. errors in fitness) between a bicycle chain and a curved shape it is wrapped around, because the chain can only freely bend (i.e. without permanent bending) at the hinges where the links are joined. Each hinge is a degree-of-freedom, and the reciprocal of the distance between hinges is the degrees-of-freedom per unit length. Employing instead a solid, but flexible metal bar, the metal would remain fit to the curve only with a sustained force. The resisting force is a reduced degrees-of-freedom and an error in fitness. Permanent bending to eliminate the resisting force, reduces the degrees-of-freedom for future straightening some of the bend for wrapping to larger curves or straight shapes.

Higher-level semantics are analogous to adding more hinges. Cases in the higher-level semantics which don't compose, i.e. aren't unified, or where the high-level semantics don't fully express the desired semantics, are analogous to permanent bending.

+ Efficiency of Work
DEFINITION Efficiency of work is the ratio of the work output (i.e. performed) divided by the work input, i.e. the efficiency is 100% minus the work lost to friction.

The lower the friction, then less power is required to do the same work in a given period of time. For example, pushing a cart on wheels, requires much less power than to push it without wheels, or to push it uphill on wheels. The ground rubbing against the bottom of the cart, or gravity, are both forms of friction. The rubbing is analogous to the permanent bending of the metal bar in the Fitness section, because the top of the ground and the bottom of cart are permanently altered. The gravity is a form of friction known as potential energy.

Given the friction is constant, then the input power (and thus input energy) determines the rate at which work can be completed. If the type of friction is potential energy, then the more work that is performed, the greater the potential energy available to undo the work. This type of potential energy is due to the resistance forces encountered during the work to produce a particular configuration of the subject matter. For example, a compressed spring wants to push back and undo the work performed to compress it.

Since the goal is to get more configurations (i.e. programs) in the software development system with less work, then these resistance forces must be reduced, i.e. increase the degrees-of-freedom so that fitness is closer to 100%. Visualize an object held in the center of a large sphere with springs attached to the object in numerous directions to the inside wall of the sphere. These springs oppose movement of the object in numerous directions, and must be removed in order to lower the friction and increase the degrees-of-freedom. With increased degrees-of-freedom, less work is required to produce a diversity of configurations, thus less power to produce them faster. And the configuration of the subject matter which results from the work, thus decays (i.e. becomes unfit slower), because the resistance forces are smaller. Requiring less power (and work), to produce more of what is needed and faster, with a greater longevity, is thus more powerful (efficient) and explains the wisdom of a forgotten proverb.

Google can't even find a wise proverb, “if you use your power, then you lose it”. The equivalent form, “absolute power corrupts absolutely”, is probably popular because it is a common misperception that power is useful (i.e. to be idoled). The utility of power exists only because there are resistance forces, i.e. friction and inefficiency. This generative fact applies to everything. For example, the power vacuum (i.e. the need for more power to get work done) that gives rise to central authority is due to the (resisting forces caused by) vested interests, bickering, selfishness, and complacency (laziness) of the governed.

+ Knowledge
DEFINITION Knowledge is correlated to the degrees-of-freedom, because in every definition of knowledge one can think of, an increase in knowledge is an increase in degrees-of-freedom and vice versa.

Software is unique among the engineering disciplines in that it is applicable to all of them. Software is the process of increasing knowledge. Thus the most essential characteristic of software is that it does not want to be static, and that the larger the components, thus the fewer the degrees-of-freedom, and the less powerful (i.e. efficient) the software development process.

Communication redundance (i.e. amplitude) is a form of power, because its utility exists due to the friction of resistance to comprehension, i.e. due to noise mixed with the signal. The signal-to-noise ratio (SNR) depends on the degrees-of-freedom of both the sender and the receiver, because it determines the fitness (resonance) to mutual comprehension.

The difference between signal and noise, is the mutual comprehension (i.e. resonance) between the sender and the receiver, i.e. noise can become a signal or vice versa, depending on the fitness of the coupling. In physics, resonance is the lack of resistance to the change in a particular configuration of the subject matter, i.e. each resonator is a degree-of-freedom. Degrees-of-freedom is the number of potential orthogonal (independent) configurations, i.e. the ability to obtain a configuration without impacting the ability to obtain another configuration. In short, degrees-of-freedom are the configurations that don't have dependencies on each other.

Thus increasing the number of independent configurations in any system, makes the system more powerful, requiring less work (and energy and power since speed is important), to obtain diversity within the system. The second law of thermodynamics says that the universe is trending to maximum entropy (a/k/a disorder), i.e. the maximum independent configurations. Entropy (disorder) is a measure of the relative number of independent possibilities, and not some negative image of violence or mayhem.

This universal trend towards maximum independent possibilities (i.e. degrees-of-freedom, independent individuals, and maximum free market) is why Coase's theorem holds that any cost barrier (i.e. resisting force or inefficiency) that obstructs the optimum fitness will eventually fail. This is why decentralized small phenomena grow faster, because they have less dependencies and can adapt faster with less energy. Whereas, large phenomena reduce the number of independent configurations and thus require exponentially more power to grow, and eventually stagnate, rot, collapse, die, and disappear. Centralized systems have the weakness that they try to fulfill many different objectives, thus they move monolithically and can fulfill none of the objectives, e.g. a divisive political bickering with a least common denominator of spend more and more debt[16].

Thus in terms of the future, small independent phenomena are exponentially more powerful than those which are large. Saplings grow fast into trees, but trees don't grow to moon (nor to end of the universe). The bell curve and power law distributions exist because the minority is exponentially more efficient (i.e. more degrees-of-freedom and knowledge), because a perfectly equal distribution would require infinite degrees-of-freedom, the end of the universe's trend to maximum disorder, and thus a finite universe with finite knowledge. It is the mathematical antithesis of seeking knowledge to have socialism (equalitarian) desires for absolute equality, absolute truth, or perfection in any field.

The organization of matter and natural systems (e.g. even political organization) follows the exponential probabilistic relationship of entropy and the Second Law of Thermodynamics, because a linear relationship would require the existence of perfection. If the same work was required to transition from 99% to 100% (perfection) as to transition from 98% to 99%, perfection would be possible. Perfection is never possible, thus each step closer to 100% gets asymptotically more costly, so that perfection can never be reached. This is also stated in the Shannon-Nyquist sampling theorem, wherein the true reality is never known until infinite samples have been performed (and this has nothing to do with a pre-filter!). The nonexistence of perfection is another way of stating that the universe is finite in order, and infinite in disorder, i.e. breaking those larger down to infinitely smaller independent phenomena.

Copute attempts to apply these concepts to software, by increasing the independence (orthogonality) of software components, so they can be composed into diverse configurations with less work. Independent programmers can leverage independent components with less communication and refactoring work, thus Copute is about increasing cooperation.

+ Formal Models
DEFINITION Formal models (a/k/a formal methods) is a broad discipline of various kinds of semantic models that are usually (at least partially) verified and enforced at compile-time.

The aforementioned generative theory enables the ranking of formal models by the level of the terms in the concepts they model. The following are sorted by highest-level first.

+ Denotational Semantics
DEFINITION Denotational semantics are mathematical models that operate on categories, i.e. domains. They are expressed in the lower-level denotational semantics of the host language, i.e. they are hierarchy of semantics where each level transforms the lower-level syntax and semantics into a higher-level language (of the models).
+ Category Theory

Typically each of these domains is a partial-order, which defines a (inductive or coinductive respectively) type based on a initial (i.e. least) fixedpoint or final (i.e. greatest) object. Inductive types are defined by an initial algebra constructor that recurses from the initial fixpoint as its first input object, e.g. the abstract type, NaturalNumber = Zero + Succeeding NaturalNumber, is the inductive initial algebra for natural numbers and Zero is the initial fixedpoint. Coinductive types are defined by a final coalgreba constructor that recurses to the final output object, e.g. see Comonad below (see also).

+ Categorical Duality

Inductive and coinductive types are categorical duals (if they produce the same objects in reversed partial-order), because inductive and coinductive category morphism functions have reversed directions[8]. The initial fixedpoint must be the least in the partial-order, thus inductive types have objects which are the output of a unique morphism function (i.e. the algebra recursively) that inputs the initial fixedpoint. Dually, the final object must be the greatest in the partial-order, thus the coinductive types have objects which are the side-effect "output" of a unique morphism function (i.e. the coalgebra recursively), which terminates with the final object when an object of the type is destructed.

Since Monad and Comonad are categorical duals, they compose on outputs or inputs respectively.

For example, the aforementioned constructor of the natural numbers inputs an object of the natural numbers and outputs a succeeding object of the natural numbers, where Zero is the initial natural number fixpoint. The recursive algebra and coalgebra of the Monad and Comonad are the join and cojoin methods respectively. The join flattens a recursive hierarchy of Monad[Monad[T]] to a Monad[T], and the initial fixpoint is the output of constructor, T -> Monad[T] (if input is None, then the constructor is equivalent to Monoid.identity, see below), all of which is known at compile-time. Dually, the cojoin lifts the recursive hierarchy of Comonad[T] to a Comonad[Comonad[T]], and the final object (how deep the lifted hierarchy) is determined only at runtime when the object of a Comonad is destructed.

Thus inductive types model semantics with generative structure known (i.e. that can be recursively constructed) at compile-time, and coinductive types model semantics with structure that can be observed (i.e. recursively constructed) only at run-time. In layman's lingo, inductive types are things we know how to define and build from a known starting point, and coinductive types are things we know how to observe, but we don't know their internal structure.

Objects of types Monad[T] and Comonad[T], that belong to the partial-order for Monad and Comonad, are not objects of type T. The partial-order algebra and coalgebra are join and cojoin respectively, which both have nothing to do with objects of type T. Any additional partial-orders that may be present in a subtype of Monad or Comonad are due to multiple inheritance.

For example, if a subtype, Sub[T], of Monad[T] is also a subtype of the inductive type Monoid[Sub], then the subtype has an orthogonal recursive algebra Sub.Monoid.append and initial fixedpoint Sub.Monoid.identity, e.g. for subtype List[T], identity is the empty list and append is concatenation of two lists. It is important to note that the very abstract Monad inductive supertype is semantically orthogonal to other inductive supertype(s) of the subtype. The multiple inheritance of supertypes enables adding more meaning to the subtype, without conflating these meanings. This is an example of increased degrees-of-freedom.

An example of the power of this orthogonality is a subtype of the coinductive type Comonad[T], could also inherit from an orthogonal inductive type Monad[T], where the past observations could be stored. In this case, the Monad constructor, T -> Monad[T], allows None as input (e.g. constructs an empty list). Thus the subtype would be both a coinductive and an inductive type. This is possible because the method of the Comonad which constructs an observation, Comonad[T] -> T, inputs a Comonad and is impure (so it could modify its store of observations). The reason it must be impure is because even though we don't know the internal structure of the Comonad, if it doesn't always return the same observation, then it must have a side-effect in its input Comonad.

Denotation semantics can unify concepts, thus increase degrees-of-freedom for composition. For example, the category theory models outlined (here) unified Expression Problem-like extension in the subtyping, function overloading, and parametric typing domains. A goal of denotation semantics is compositionality, i.e. for the higher-level semantic terms to be orthogonal functions of their subterms. This goal is challenged to the degree that the subterms are dependent on the context where they appear in a program. Due to the Halting Problem, where the lower-level semantics is Turing complete, the subterms will never be 100% context-free. However, the context-freedom of the subterms is increased when the higher-level model unifies the meaning of lower-level subterms, i.e. increases the mutual degrees-of-freedom of the subterms.

Higher-level models require theorems that must be checked. For example, the category theory models outlined, which are programmer-defined types, each enforce their model on subtypes, via their interface (a/k/a signature) which proves some of theorems for free.

+ Axiomatic Semantics
DEFINITION Axiomatic semantics are assertions on the expected logical state of the program. The assertions are higher-level than the computational state. For example, each method that can modify an instance of a class, can be annotated with preconditions, postconditions, and invariants. The assertions can be an abstract model that is orthogonal to the state of computation, e.g. Typestate, or a sampling of the computational state, e.g. Spec#.

Thus axiomatic semantics does not unify higher-level concepts of the type system. Sampling the computational state will suffer from aliasing error. A semantic model which is orthogonal to the state of computation, is not orthogonal to the higher-level concepts expressed in the type system, because for example Typestate assertions can be replaced with types (see also).

It is the antithesis of unification and thus degrees-of-freedom, to have two simultaneous semantics (types and Typestate) for expressing the same model, which can't be unified with each other for composition. For example, Expression Problem-like extension was unified in the subtyping, function overloading, and parametric typing domains (see also).

+ Operational Semantics
DEFINITION Operational semantics are a calculus of the lowest-level (i.e. computational) state transitions of the language, e.g. the lambda calculus models the pure function logical state transitions.

Thus operational semantics is a determinant of the necessary fundamentals, e.g. whether an expression is declarative or imperative, but it does not model at a high enough level to unify higher-level concepts.

+ State-of-the-Art Languages

Some languages are sufficiently advanced to address some or most of the requirements outlined. Copute aims to improve on the weaknesses of these exceptional languages.

+ Haskell 98

Haskell 98 had implemented most of the necessary flexibility before the turn of the century. Haskell will probably however remain the most concise language in many cases, due to its global type inference, ad-hoc polymorphism (see also)[4], and other factors. Yet Haskell has some critical weaknesses, which Copute aims to correct.

  • Incomplete: No diamond multiple inheritance[4].
  • Indeterminate: Lazy by default languages, even with optional eagerness, have worse performance for space and latency, and at least for conjunctive functional composition, their sequential time complexity is not faster than referentially transparent eager with optional lazyness (see also).
  • Brittle: Possible to conflate interface and implemenation, which reduces compositional expressive degrees-of-freedom, and can also obfuscate and ambiguate semantics (see also)[4].
  • Convoluted: Higher-kinded parameter polymorphism is implicit in the relationships between declarations[4], i.e. not explicit in the parameter list of the inheritance of the supertype, which can make it difficult to understand code. For example, try to understand the use of types Id a and Accy o a as instances of the Applicative interface[12]. Compare to Copute's implementation. On the other hand, Haskell's global inferencing can reduce verbosity in complex "typing gymnastics", at least as compared to Scala and use of Scala's implicits to emulate Haskell's typeclass. It remains to be evaluated to what extent Copute can improve upon Scala's verbosity, by replacing Scala's implicits (see also)[4][5][7].
  • Obtuse: Obtuse syntax for C, C++, Java programmers.
  • Ad-hoc: Type parameter bounds for subtypes must match those of the supertype interface, i.e. the type parameter of the subtype may not have a more restrictive bound (see also)[6].
  • No contravariant substitution (see also).

+ Scala

Scala is a significant advance from other subtyped OOP languages. Scala's type system is capable of the necessary flexibility. Yet Scala has some critical weaknesses, which Copute aims to correct.

  • Indeterminate: Allows exceptions, and standard libraries have some exceptions.
  • Spaghetti: No (even optional) enforced purity for programmer defined functions, i.e. has mutable varand immutable val data but no purity type for functions (see also).
  • Brittle: Possible to conflate interface and implemenation, which reduces compositional expressive degrees-of-freedom, and can also obfuscate and ambiguate semantics (see also)[4].
  • Scattered: No SPOT (see also)[4] for declaring Haskell-like typeclasses using implicit conversion functions (see also)[4][5][7].
  • Unreadable: Easier to write than read Scala's multifarious syntax. For example, there are 6 ways to declare a function instead of one way in Copute, and Scala's inferred closures can cause bewilderment for those unfamiliar with an overloaded (DSL) syntax. Since these are bijective transformations, it is possible for an IDE to have an option to toggle renderings of Scala code between a preferred syntax style and the one encoded (in the file). This is planned for Copute. Ditto for Python/Haskell-like indented blocks versus verbose curly braces, where braces always going in the file to eliminate possible layout translation errors.
  • Unnecessary synchronization overhead for the lazy value initialization thunk when the initialization expression is pure (because Scala has no enforced purity) and its evaluation cost is relatively small relative to the synchronization overhead; unnecessary because the expression has no side effects and thus is safe in the relatively infrequent case where another thread repeats the initialization before the first initialization has completed.

+ ML Languages

ML languages had implemented some of the necessary functionality before Haskell. There are many variants, e.g. Standard ML, Caml, OCaml, etc.. Yet at least Standard ML 97 has some critical weaknesses, which Copute aims to correct.

  • Indeterminate: Allows exceptions.
  • Spaghetti: No (even optional) enforced purity for programmer defined functions, i.e. has mutable ref with immutable val by-default data but no purity type for functions (see also), only standard datatypes whose methods are pure (e.g. vector).
  • Brittle: Possible to conflate interface and implemenation, which reduces compositional expressive degrees-of-freedom, and can also obfuscate and ambiguate semantics (see also).
  • Obtuse: Obtuse syntax for C, C++, Java programmers.
  • No language support for optional laziness (see also), although it can be accomplished more verbosely with a Lazy type.

The obtuse dependently-type, total languages Agda, Epigram, and Coq are not mentioned, because they require the programmer to construct axiomatic proofs of the correctness, including but not limited to termination. Such proofs are generally not cost effective, except in mission critical applications where a single bug could be catastropic. And where there are such proofs, the language is not Turing complete, and thus not capable of expressing every possible program. The complexity of the type systems in these languages is beyond the capacity of vast majority of programmers, if not all of them who are not mathematicians. Also it is not clear that these languages implement the more fundamental separation-of-concerns (see also) that is the big advance for Copute.

+ Copute Tutorial

+ Human vs. Computer Language

A computer language contains an unlimited vocabulary of words (specifically names), and grammar rules for how these words can be combined. Computer language sentences have meaning that is understood by the computer.

Compare with a human language, that has an unlimited vocabulary of written and spoken words, which occupy a finite number of categories, e.g. noun, verb, adjective, adverb, etc.. The grammar rules specify how these categories of words can be combined to convey meaning. A computer language grammar also has a finite number of categories with an unlimited vocabulary, e.g. Copute's grammar categories are function, type, value, interface, class, and mixin.

With an understanding of the grammar rules and the categories, then an infinite number of sentence permutations can be written. In a computer language, these grammatical permutations can interact with each other, generating a complex, amalgamated meaning, i.e. a program. The interaction of these grammatical permutations in a computer language is fundamentally different from human language.

In a human language, the sentences are combined into paragraphs, which are combined into documents, but these sentences for the most part only weakly reference each other, and especially if they are far apart in the document. Whereas, in a computer language, any grammatical permutation can be a grammatical permutation of grammatical permutations (even of itself, a/k/a recursion), thus every permutation can be referenced by any other permutation regardless of mutual proximity in the program document. The analogy would be if instead of only grammatical permutations of categories of words (i.e. nouns, adverbs, etc), a human language allowed sentences to be permutations of sentences, as follows.

(Cat ate the dog) ate (Fox jumped over the moon) the (Zebra has ((Cat ate the dog) ate (Fox jumped over the moon) the (Zebra has (Cat ate the dog) ate (Fox jumped over the moon) the (Zebra has stripes)))).
Computer languages require this recursion of grammatical permutation, a/k/a Turing-completeness or computational universality, so they can compute any possible calculation or logic.

Copute's grammar categories are function, type, value, interface, class, and mixin.

+ Function

DEFINITION The fundamental grammatical unit in a computer language is the function. A function contains two grammatical categories-- its inputs and its body. The function body is a calculation (or logic) operating on those inputs, and the result is the function output.
+ Function call
DEFINITION A function call is an occurrence within a function body (the caller) of a reference to another function body (the callee), and the callee's input values are provided by the caller.

Turing-completeness requires function recursion, i.e. that a function may call itself.

+ Type

DEFINITION A type is the declaration of a function's inputs and output, and the types of those inputs and outputs. A type does not include the function body. Thus, a type is a grammatical permutation that contains grammatical permutations of type.

In Copute, the type for a function with N inputs is written:

InputType1 -> InputType2 -> ... -> InputTypeN -> OutputType

In Copute, a type can contain only (), ->, or the name of an interface.

DEFINITION An interface is a set of function types.
DEFINITION Unlike most languages, class and mixin are not types, instead are only a set of function bodies that match some interface.

This separation of interface (set of function types) and implementation (set of function bodies) is a critical advantage for Copute (see also).

+ Value
DEFINITION A type matches infinite potential function bodies, i.e. all those function bodies which have the matching inputs and output types. A value is a combination of a single type and a single occurrence of a function body which matches that type.

In Copute, the value constructor for a function with N inputs is written:

val name : InputType1 -> InputType2 -> ... -> InputTypeN -> OutputType
   = p1 p2 ... pN -> expression

  • name is the identifier that the function type on the right-side of :, and the function body on the right-side of =, is assigned to
  • p1 p2 ... pN are the names of the callee's input parameter values which correspond to the input types, InputType1-> InputType2 -> ... -> InputTypeN
  • expression is a grammatical permutation of value constructors and nested function calls, and the outer (or last) function call has an output type that matches the name's OutputType

Thus a function call is an output value, and in Copute this is written with a reference to the name of a function value, name(a1, a2, ..., aN), where a1, a2, ..., aN are the names of the caller's input argument values.

Note the names of values are the unlimited vocabulary of a computer language.

Note computer languages have syntactical "sugar" to hide that specialized types of values contain a function body. For example, a literal number, e.g. 123, is short-hand for a value that is really a function of type, () -> Int, with a body of () -> 123. Another example are the operators, e.g. the plus operator +, which is short-hand for call to a function, e.g. of type Int -> Int -> Int when used on numbers, e.g. 1 + 2.

Note due to local type inference, the type of a name's value can often be omitted and it will be inferred from the output type of the function body being assigned on the right-side of =.

+ Meaning of Computer Language

Declarative computer languages such as Copute, offer the critical advantage that values are immutable, i.e. a value's name can not be assigned to a different value. The function body of a value is always constant, but variable output values can be constructed with a function call on a value which has the function type, e.g. name(a1, a2, ..., aN), with the caller supplying different input argument values to obtain different output values. The profound implication is that function outputs depend only on the function inputs. Whereas, in an imperative computer language, the value's name can be assigned a different value, thus a function body can memorize changes and the output value does not depend only on the inputs but also the history of memorized changes.

Side note: The call of a function body in the set of functions for value which has an interface type, is written name.name2(a1, a2, ..., aN), where name2 refers to the function type to call. A value of the interface type, contains the set of the function types and a corresponding set of the function bodies.

In both declarative and imperative languages, values contain a function body, which contains an expression that can contain other values or function calls, which contain other function bodies, etc.. as a recursive spaghetti that twines out into the program. In imperative languages, the output of each function body is thus dependent on this memorized spaghetti, which is changing as the program runs. In declarative languages, the output always depends only on the caller's declared inputs at each occurrence of a function call, which is known before the program runs.

Thus in declarative languages, the meaning of each function body is completely visible (i.e. transparent) and does not change as the program runs (see also). In imperative languages, the meaning of each function body mutates as the program runs and the memorized values change, and thus the program's meaning is undecidable and is never known. In other words, a program may run forever and never execute every possible calculation of all the program's grammatical permutations (a/k/a Halting theorem). Thus imperative languages require runtime testing to sample some of the possible meanings (and errors from the intended meaning) of the program, whereas declarative languages have a visible, immutable meaning. For example, HTML is a declarative language, because each of its vocabulary acts the same, no matter where in the running of the page it appears. An img tag always displays the image data it is given, regardless of the duration that the page is displayed or reloaded.

But without JavaScript, HTML is not a general purpose programming language, i.e. it is not Turing-complete. That no mainstream general purpose computer languages are declarative is a historical boondoggle of epic proportions, which Copute aims to correct. HTML can not do general calculations, because like a human language, its grammatical permutations (i.e. "sentences") can only contain tag names (i.e. "words") and can't recurse its grammatical permutations. XHTML adds user-defined tag names, i.e. an unlimited vocabulary, but only if used with XSLT or XQuery does it have recursion and thus Turing-completeness.

The accident of history is that programmers became familiar with expressing calculations and logic in imperative languages, which involves the mutation of values as it core methodology. Whereas, the declarative model strictly separates the calculation (or logic) of mutation, from the actual values, because the caller's input argument values are independently declared from the callee's (the function body's) input parameter values.

caller : () -> CallersOutputType = ()
   callee( argument1, argument2, ... argumentN )

callee : CalleeInputType1 -> CalleeInputType2 ... -> CalleeInputTypeN -> CallersOutputType
   = parameter1 parameter2 ... parameterN
   ...function body expression here...

An imperative language can't make the function input arguments independent from the parameters, because values can be mutated, thus a callee can change a caller's argument values and a caller can change a callee's parameter values. Even if the callee was only allowed to change the values of names for the parameter names (ditto the caller for argument names), remember a value has a function body which contains an expression that can reference values by their names. An imperative language can assign different values to those nested names, which appear in both the callee's and caller's copy of the value. Thus the only way to avoid this dilemma is to make copy of all the input values on every function call, which would make programs much slower, or to require the names are immutable, i.e. make the language declarative.

In summary, function calls in a declarative language are all independent from each other. Whereas, in a imperative language, they are all dependent through a maze of memorized values that are changing as the program runs. The implications of this boondoggle and a proposed fix, are explained in detail at the Copute homepage.

The imperative programmer typically does not know that a loop for repeating some calculation using mutating values, is equivalent to using recursion of function calls that do not mutate values. And tail-recursion is equivalent in performance.

Imperative language function with a while loop to add the set of numbers from 0 to 99, which mutates the values sum and i.

function Sum0To99() : Int {
   var sum = 0
   var i = 0
   while( (i = i + 1) < 100 ) sum = sum + i
   return sum

Equivalent Copute declarative recursion.

val Sum0To99 : () -> Int {
   val recurse : Int -> Int -> Int = sum i
      val x = sum + i
      if( i < 100 ) recurse(x, i+1) else x
      // above recurse call is tail-recursive,
      // because the output of the recurse call
      // is the output of this recurse function body,
      // i.e. not an input to another function call
   recurse(0, 1)

+ Copute's Syntax

The previous sections of this tutorial introduced Copute's grammar categories: function, type, value, interface, class, and mixin, and provided the definitions of function, type, value and examples of their syntax, i.e. how to write them.

A previous section briefly defined what interface, class, and mixin are composed of, but didn't give examples of how to write and use them.

+ Interface

DEFINITION An interface has a name and is a collection of named function types without function bodies. An interface is a type, but can not construct a value, since it is lacking function bodies.
interface Name inherits OtherInterface, AnotherInterface
   name1 : InputType1_1 -> InputType1_2 -> ... -> InputType1_N -> OutputType1
   name2 : InputType2_1 -> InputType2_2 -> ... -> InputType2_N -> OutputType2
   nameN : InputTypeN_1 -> InputTypeN_2 -> ... -> InputTypeN_N -> OutputTypeN

+ Class

DEFINITION A class has a name and is a collection of named function values, optionally without function types, that correspond to named function types of some interface it inherits from. In Copute, a class is not a type, can't appear in a function type, a call to its constructor function value can appear on the right-side of = for constructing a value of the interface type it inherits from, and can be inherited by another class.

Copute's class is similar to Scala's, but with a less verbose, only one way to write a function (actually two with -> ... or { ... }), and immutable syntax as follows.

class Name inherits OtherClass, OtherMixin, AnotherMixin, OtherInterface, AnotherInterface
	: ConstructorInputType1 -> ConstructorInputType2 ... -> ConstructorInputTypeN = cp1 cp2 ... cpN
   name1 = p1_1 p1_2 ... p1_N -> expression1
   name2 = p2_1 p2_2 ... p2_N
   nameN = pN_1 pN_2 ... pN_N -> expressionN

Similar to Scala, the constructor parameters cp1 cp2 ... cpN are members of the class, but in Copute they are immutable and must match the names and output types of some interface the class inherits from. Thus their values can be accessed with function calls or pattern matching. Copute implicitly generates the implied interface for a class, if it is not explicitly listed in the inherits hierarchy.

+ Mixin

DEFINITION A mixin is a class without a constructor function value, and thus it can't appear on the right-side of = for constructing a value of the interface type it inherits from, but it can be inherited by another mixin or class. Unlike a class, multiple mixin can be inherited simultaneously, as both mixin and interface do not suffer from a diamond multiple inheritance problem, due to their lack of a constructor function.

Copute's mixin is conceptually similar to Scala's, except Scala's trait is separated into an interface and a mixin, in order to enforce the separation of interface (set of function types) and implementation (set of function bodies) (see also).

+ Advanced Topics

Copute's interface, class, and mixin can be parametrized with nearly the same syntax as Scala's parametrized types and parametrized functions. Except since Scala is an imperative language with mutable values and Copute is declarative with immutable values, thus Copute's default variance annotation is covariant. Parametrized typing is made even more powerful with the nearly the same syntax as Scala's higher-kinds. Copute replaces the with with + in Scala's compound types, and does not support refinements because structural subtyping breaks the separation of interface (set of function types) and implementation (set of function bodies) (see also).

Copute has the same syntax as Scala for tuples, except a tuple can never appear on the left-side of =. To destruct tuple, instead use pattern matching. Also Copute uses the empty tuple () to indicate both the absense of function input parameters on the right-side of a value construction, and also in the function type to indicate no inputs. In Copute the tuple containing one element is the grouping operator, except the tuple of the empty tuple (()) is the undefined type which is the supertype of all types, and which is given the name Any.

The expression in a function body can contain interface pattern matching. This is essentially the same as Scala's pattern matching, except it only works on interfaces, there is no match { ... } keyword, the case keyword is replaced with a much less verbose vertical bar |, and the => is replaced with ->. The extractor functions are automatically generated. Scala's object Name { apply... } and Scala's object Name { unapply... } are interface Name { static apply... } and interface Name { static unapply... } respectively.

+ References

  1. Verifiable Functional Purity in Java, section 2.3 Untrusted code execution.
  2. No Silver Bullet – Essence and Accident in Software Engineering, Promising Attacks on the Conceptual Essence.
  3. Are Scala's case classes a failed experiment?, see the comments.
  4. OOP in Haskell.
  5. Programming in Scala, Odersky, Spoon, Venners, chapter 21 Implicit Conversions and Parameters.
  6. Generics of a Higher Kind, Moors, Piessens, Odersky, section 7.3 Exceeding type classes.
  7. Generics of a Higher Kind, Moors, Piessens, Odersky, section 7 Leveraging Scala's implicits.
  8. Declarative Continuations and Categorical Duality, Filinski, section 1.3.1 Basic definitions.
  9. Declarative Continuations and Categorical Duality, Filinski, sections 2.2.1 Products and coproducts, 2.2.2 Terminal and initial objects, 2.5.2 CBV with lazy products, and 2.5.3 CBN with eager coproducts.
  10. Declarative Continuations and Categorical Duality, Filinski, sections 2.5.4 A comparison of CBV and CBN, and 3.6.1 CBV and CBN in the SCL.
  11. Cheap Deforestation for Non-strict Functional Languages, Gill, section 1.1 Functional languages.
  12. Applicative programming with effects, McBride, Paterson, sections 3 Traversing data structures, and 4 Monoids are phantom Applicative functors.
  13. Homesteading the Noosphere, Eric Raymond, The Hacker Milieu as Gift Culture.
  14. The Magic Cauldron, Eric Raymond, The Inverse Commons.
  15. The Magic Cauldron, Eric Raymond, The Manufacturing Delusion.
  16. Some Iron Laws of Political Economics, Eric Raymond blog.v
  17. Fighting Bit Rot with Types (Experience Report: Scala Collections), Odersky, Moors