Static Typing And Interoperability

Static typing inhibits interoperability. It does so between class libraries and their clients, and also between programming languages.

The reason is simply that static typing violates encapsulation in general, and in particular propagates constraints that aren’t justifiable as architectural, design or implementation requirements. It does so either by binding too early, or by binding (requiring) more than it should.

Static typing violates encapsulation because it prevents the object itself from being the sole arbiter of its own behavior, and of the semantics of any operations that may be applied to it. It forces the programmer to pierce the veil of encapsulation by naming the object’s class or type as the static type constraint of every variable that will be permitted to contain such objects, when that information should remain hidden behind the wall of encapsulation that objects are supposed to provide. The compiler then uses that information to determine the behavior of objects, and the semantics of the operations applied to them, from the static type constraint of the variables that reference them, which violates encapsulation even more profoundly.

As usually implemented by widely-used programming languages, and therefore as typically used by program developers, static typing confuses the distinction between a class and the type that the class implements. And even in an ideal implementation in some language that almost no one will ever actually use in anger, it violates encapsulation by confusing the variables or parameters that reference a value with the class of the referenced value, the type of the referenced value, or (usually) both.

A class is not a type. It’s an implementation of a type.  Therefore, any system built on that false assumption is intrinsically broken at a very deep level. If you write code based on the idea that classes are types, you need to transform your thinking.

And a variable is not the value it references. Therefore, any system built on that false assumption is intrinsically broken at a very deep level. The semantics of a variable is the architectural role (or roles) it plays in the algorithm that uses it. It is not the class of the value that the variable happens to reference at any particular time, nor the type that that class implements at that time. (The type that a class implements is not necessarily a static property of the class.)

As a concrete example, consider an array in a statically-typed language that must hold objects of different types (or different classes; it makes no difference in this case), and that therefore must use Object as the static type constraint for its elements (assuming there is no less-general common superclass). Putting elements into such an array is easy, and raises no issues. Fetching them is equally easy, but using the retrieved values for anything beyond the most trivial purposes typically requires the programmer to over-constrain them: before a value can be used in application-specific or domain-specific ways, it must be constrained to be an instance of some predefined type (which often also means an instance of some predefined class). The programmer imposes that constraint by casting the value to a specific type or class, and the cast will fail if the programmer misjudges the type of the object. Yet there would have been no failure had the programmer been allowed to apply the needed operations without the cast, because those operations would be valid for all the elements of the array, regardless of their type or class.
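
The array scenario above can be sketched in Python, with the caveat that Python has no compile-time casts: the `isinstance` check below only stands in for one. The `Invoice` and `Receipt` classes and both function names are hypothetical, invented for this illustration.

```python
class Invoice:
    def describe(self):
        return "invoice"


class Receipt:
    def describe(self):
        return "receipt"


def describe_with_cast(items):
    """Stand-in for a cast: insists that every element is an Invoice."""
    results = []
    for item in items:
        if not isinstance(item, Invoice):
            raise TypeError(f"cannot treat {type(item).__name__} as Invoice")
        results.append(item.describe())
    return results


def describe_by_message(items):
    """Sends only the one message that is actually needed."""
    return [item.describe() for item in items]


mixed = [Invoice(), Receipt()]

by_message = describe_by_message(mixed)  # ['invoice', 'receipt']

try:
    describe_with_cast(mixed)
    cast_failed = False
except TypeError:
    cast_failed = True  # the Receipt is rejected although it understands describe()
```

Both elements understand `describe`, so the only thing that fails here is the constraint itself.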

Typically, one only needs to send one or two messages to values retrieved from a variable or from a collection. One usually does not need to send every message (or apply every operation) defined by a particular type or class. Therefore, requiring that a value must be an instance of a particular type or class is almost always requiring more than is necessary, and usually far more. Imposing such unnecessary constraints inhibits interoperability because it reduces the generality of the code, reduces the generality of the entities the code defines, and reduces the generality of the entities that the code can make use of.

And the violation of encapsulation bubbles up to higher-level entities: Classes, modules, packages, assemblies, class libraries and entire frameworks–which is how it inhibits interoperability, not only between class libraries or application frameworks and their clients, but even between different programming languages.

Static typing dramatically increases the coupling between components, because it not only exposes the specific entities that define and/or implement all the behavior of a value, it also requires that clients use precisely and only those specific defining or implementing entities, and typically prohibits the use of functionally/semantically equivalent alternatives that would have worked as desired. This is true even when interfaces are pervasively used as static type constraints (which is rarely the case in practice, and so should in all fairness be dismissed as a moot point), because clients must in any case use only the specific interfaces used and/or defined by the component providing the service they’d like to use.

Pervasively using interfaces as static type constraints definitely helps, but it remains a problem a) because interfaces typically require behavior that just is not needed every time one of their instances is used (which is almost always the case), b) because interfaces defined by third parties, and the classes and methods that use them, generally cannot be changed in any way by their clients, and c) because it is possible (and even likely) that the various providers of “reusable” (ahem) components will separately define what are conceptually “the same” interfaces for the same purposes, which nevertheless won’t be type-compatible even if they have exactly the same names and require precisely the same operations (messages, operators, etc.).

Why? Because “reusable” components from independent sources that use what are conceptually “the same” interfaces nevertheless cannot easily interoperate with each other, since two (or more!) interfaces that aren’t defined in the same namespace will be classified by the static type system as different, incompatible types, no matter how identical they are otherwise. That means that, for example, the function provided by class library A from source Alpha that transforms a video stream by applying a user-supplied function to the color of each pixel cannot be used with the function that performs the desired color transformation provided by class library B from source Beta, because the two class libraries each use their own “Color” interface, which is not type compatible in spite of having the exact same name and defining the exact same set of required operations with the exact same semantics.
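
Here is a minimal sketch of the two-“Color” situation in Python. Python does not enforce interfaces statically, so an explicit `isinstance` check stands in for a nominal type constraint. Every name here (`make_color_interface`, `AlphaColor`, `BetaColor`, `AlphaPixel`, `rgb`, and the functions) is an assumption made up for the illustration, not part of any real library.

```python
from abc import ABC, abstractmethod


def make_color_interface():
    """Build a fresh interface named Color, as one library would define it."""
    class Color(ABC):
        @abstractmethod
        def rgb(self):
            """Return an (r, g, b) tuple of ints in 0..255."""
    return Color


AlphaColor = make_color_interface()  # hypothetical library A's Color
BetaColor = make_color_interface()   # hypothetical library B's Color


class AlphaPixel(AlphaColor):
    """A pixel color as produced by library A's video stream."""
    def __init__(self, r, g, b):
        self._rgb = (r, g, b)

    def rgb(self):
        return self._rgb


def alpha_map_pixels(pixels, transform):
    """Library A: apply a user-supplied function to each pixel's color."""
    return [transform(p) for p in pixels]


def beta_invert_nominal(color):
    """Library B: insists on its own Color, as a nominal type system would."""
    if not isinstance(color, BetaColor):
        raise TypeError("expected Beta's Color")
    r, g, b = color.rgb()
    return (255 - r, 255 - g, 255 - b)


def beta_invert_open(color):
    """Library B without the constraint: only the rgb message is required."""
    r, g, b = color.rgb()
    return (255 - r, 255 - g, 255 - b)


frame = [AlphaPixel(255, 0, 0), AlphaPixel(0, 10, 20)]
same_name = AlphaColor.__name__ == BetaColor.__name__  # both are "Color"
inverted = alpha_map_pixels(frame, beta_invert_open)

try:
    alpha_map_pixels(frame, beta_invert_nominal)
    rejected = False
except TypeError:
    rejected = True  # same name, same operations, still incompatible
```

The two interfaces are identical in name and in required operations, yet only the unconstrained version of library B's function can be combined with library A.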

And that’s a short (and incomplete) example of why and how static typing substantially reduces a programmer’s degrees of freedom in combining and reusing code components from different, independent sources, relative to what they could be if programming languages used the open-world assumption instead of the closed-world assumption regarding which operations are valid. Operations should be assumed valid until that assumption is proven false, for reasons that are analogous to those that justify the assumption that a person accused of a crime is innocent until proven guilty, or those that motivate the assumption that the null hypothesis is true until proven false by actual experimental observation.

Most of human progress over the past few centuries is directly attributable to discarding the prior obsession with being able to present absolute proofs, and replacing that paradigm with what we now know and love as the scientific method, which is based on falsifiability instead of being based on provability. Our modern technology only exists because we gave up the idea that our models of the world must be provably correct, and instead started using the paradigm of falsifiability, as embodied by the scientific method.

The programming world is in severe need of a Copernican Revolution so that we can discard the ever-more complicated epicycles and contortions that the static typists go through to try to match the generality, reusability and productivity of dynamic programming languages.


One response to “Static Typing And Interoperability”

  1. In statically typed systems the variables carry the type information, not the objects. (Even if the objects carry their own type information, it doesn’t prevail; the variable’s type information is what controls.)

    In a dynamic typed system the objects carry the type information.

    This is a profound difference.

    One effect is that the variables in a statically typed language know ahead of time, at compile time, what the “type” of the object will be, which violates the object’s encapsulation of information. All operations on a particular variable are thus limited to the particular type of the variable.

    The violation of encapsulation occurs because you have to take an aspect of the object contained by a variable and hoist it out of the object to stick it permanently upon the variable.

    In a dynamic language the type information is carried with the object itself and variables can have any type (or more accurately any class). The method will operate fine without errors as long as whatever object is in the variable understands the message(s) sent to it. Methods tend to be shorter as a result, as there are no “type” specifications upon the variables, parameters, and return value.

    Which “classes” of objects can be in a variable, parameter or return value is easily controlled by data verification 101 code. Type checking, or more accurately class checking, is a simplistic kind of data verification 101, and most often focuses on the wrong aspect of an object in a variable or parameter.

    Statically typed systems are like going to a car wash with your vehicle and finding hundreds of lanes, one for each make and model of car. A Ford pickup must go in this lane due to its type; a GMC Safari couldn’t possibly use that same lane because its type is different. The list goes on with a combinatorial explosion of car wash lanes, one for each “type” of vehicle, or “methods” each written for one type of vehicle. The key point is that what should be the encapsulated “type” of each vehicle is forced to be de-encapsulated, extracted and attached to its own lane (variable), causing the combinatorial explosion of specific methods (or case statements).

    A dynamic car wash, on the other hand, isn’t concerned with the “type” of your car so much as the dimensions: width, height, length, wheelbase, closed windows, roof up on convertibles, nothing that will catch on the washing equipment, etc. The dynamic car wash doesn’t care what type the vehicle is as long as it meets these practical criteria (e.g. the object responds to a particular subset of messages, aka a set of untyped method signatures, aka a certain protocol).

    For example, a list object is passed in as a parameter, and everything will be fine if it understands the “at:put:” and “at:” (and possibly variations) messages that the method sends to set and get elements of the list.

    This means that you can send in a simple list and access the elements using “at:” and set new elements using “at:put:”. It doesn’t matter if the collection is a simple fixed size Array, an OrderedCollection of variable size, a Dictionary with keys and values, or even a SortedCollection where the elements are sorted in some manner. As long as the messages sent to the object are understood the method can take any type of object that implements this protocol.
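
As a rough Python analogue of the “at:” and “at:put:” protocol described above, item access with `collection[key]` and `collection[key] = value` plays the same role. The function name `swap_items` is made up for this sketch; the point is only that the function never asks what class the collection is.

```python
def swap_items(collection, first_key, second_key):
    """Swap two elements using only item get (at:) and item set (at:put:)."""
    first = collection[first_key]
    collection[first_key] = collection[second_key]
    collection[second_key] = first
    return collection


# The same function works on unrelated classes that share the protocol.
swapped_list = swap_items([1, 2, 3], 0, 2)
swapped_dict = swap_items({"a": 1, "b": 2}, "a", "b")
swapped_bytes = swap_items(bytearray(b"abc"), 0, 2)
```

A list, a dictionary and a byte array have no useful common superclass for this purpose, yet all three satisfy the two messages the function actually sends.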

    Yes, it’s possible that a given message will have a different semantic meaning on the various objects that receive it. That, however, is also true for statically typed languages, and is a general problem across all computer languages that use words or symbols to describe semantic operations.

    The combinatorial explosion is supposed to be dealt with by generics, and in some cases that might limit having to write lots of the variations by hand. However, using generics still requires violating encapsulation, as the encapsulated “type” (really class) information must be extracted out of the target object and placed upon the variables of a method, much as a straitjacket is placed upon a person.

    The cost of this kind of alleged “type safety” is too great to the architecture of a program. Sure, people make programs in such languages (and I have, and on occasion still do), but it’s highly undesirable, just as wearing a straitjacket is undesirable.

    In summary:

    Static type (actually class) constraints on variable names and method input or return parameters are a very simplistic kind of constraint, and are usually the wrong focus, as illustrated by the data verification tests that are applied to variables in the allegedly type safe languages.

    This simplistic “type” restriction on variables and parameters inherently forces the violation of the object oriented principle of encapsulation. That breaks the object oriented model, and spreads the object’s type (actually class) information all across the program, wherever such objects need to travel.

    Static type restrictions extract the “type” (actually class) information and create dedicated “lanes” or “rail tracks” that can only take a certain “type” (actually class) of objects. It’s like each “type” (or class) of object is a train car and has a different width between the wheels, requiring a different track width unique to its type (actually class). This results in an explosion of specific code to deal with all these combinations. Generics just cover up this growing mess of different “type” (class) specific tracks going all over the place that the runtime has to deal with.

    Without encapsulation you don’t really have the potential for standalone objects.

    The main point I want to add to the conversation is that forced static type specifications (really, in most cases, class restriction specifications) are usually the wrong thing for a method to focus on, because in real world code the type or class of an object isn’t as important as the object’s other attributes.

    For example, you want a value between zero and one hundred to represent a percentage. In a statically typed language you must specify the “type” of the variable to constrain it to a certain type of number, say an unsigned 32 bit integer. You then have program statements, typically if statements or case statements or the like, that test the value of the variable to ensure that it’s equal to or greater than zero and equal to or less than one hundred. However, you don’t really care that it’s an unsigned integer. All it really needs to be is any kind of number that has enough precision to handle the range, so the real “type” constraint is simply any object that is a number that can represent zero to one hundred. Its “type” is best described as “number” rather than as an unsigned 32 bit integer, float or double. Also note that the second half of the constraint on the variable, its numeric range, isn’t handled at all by the “type safety” system of static languages such as C#.
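
The percentage check could look something like this in Python. It is a sketch under one stated assumption: “number” is taken to mean “anything that can be ordered against 0 and 100”, and the function name `check_percentage` is invented for the example.

```python
from decimal import Decimal
from fractions import Fraction


def check_percentage(value):
    """Accept any value that can be ordered against 0 and 100 and lies in range."""
    try:
        in_range = 0 <= value <= 100
    except TypeError:
        # The value does not understand comparison with numbers at all.
        raise TypeError(f"{value!r} cannot be compared with numbers") from None
    if not in_range:
        raise ValueError(f"{value!r} is outside 0..100")
    return value


# Integers, floats, exact fractions and decimals all pass the same check.
accepted = [check_percentage(v)
            for v in (42, 12.5, Fraction(1, 3), Decimal("99.5"))]


def rejection_of(value):
    """Return the name of the exception raised for value, or None if accepted."""
    try:
        check_percentage(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None
```

The range test, which is the part that matters to the application, is the whole check; no particular numeric class is ever named.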

    Statically typed languages:

    (a) inherently break encapsulation by forcing the “type” (actually class) information to be revealed in methods external to the object;

    (b) they force objects to travel along fixed type (class) specific “tracks” (typed method signatures and variables);

    (c) they are simplistic focusing merely on the “type” (actually the class) rather than the wider range of object data validation constraints that real world programs face;

    (d) they promote combinatorial explosions of specific versions of methods based upon the types (classes) even if it’s under the covers using “generics”;

    (e) they allege “type safety” when in fact it’s a false promise as they don’t provide full type safety due to the other data validation constraints that methods have (testing for NULL is required or a runtime exception occurs, requirement of virtual methods to avoid semantic confusion (see discussion of “foo” in one of the earlier comments), and others);

    (f) their alleged type safety benefits are nullified by permitting the NULL pointer value (some type safe languages let you rule out null by forcing initialization of a variable), which shows that more than one type can be in a statically typed variable, in violation of the variable’s type constraint; thus the type safe system is actually broken from the get go (in languages that permit this);

    (g) static type specifications typically dramatically overstate, and thus over-constrain, what the type of the object in the variable actually needs to be (see the “list object” example above, where the incoming list object only needs to implement the “at:”, “at:put:” and “number of elements” protocol); usually only a small subset of methods actually needs to be required of an object in a variable in a method.

    Yes in dynamic languages you have to use some defensive programming but it’s basic data verification 101 that also has to be done in the statically typed languages, protection against null values and other data verification checks. Yes, in dynamic languages it’s often prudent to check that a variable has an object that “responds to” a message or set of messages that you’re about to send it and that is just part of the data verification in those cases. You can even check that a variable contains an object of a certain class or “kind of class” or is one of a certain set of classes if you wish. It’s quite flexible and allows for covering all cases of data verification not just the “type” (or class) of an object. Or you can let a message be sent to an object and allow a “does not understand” message to be generated if the object in the variable doesn’t implement a message you sent it. Exception handling isn’t avoidable in static typed languages either.
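
Both styles mentioned above, checking that an object “responds to” a message before sending it, and simply sending it and handling “does not understand”, can be sketched in Python. `AttributeError` is Python's nearest equivalent of “does not understand”. The helper `responds_to` and the classes are hypothetical names for this illustration.

```python
def responds_to(obj, message):
    """True if obj has a callable attribute with the given name."""
    return callable(getattr(obj, message, None))


class Duck:
    def speak(self):
        return "Quack"


class Rock:
    pass


def speak_checked(obj):
    """Verify first: only send the message if the object responds to it."""
    if not responds_to(obj, "speak"):
        return None
    return obj.speak()


def speak_unchecked(obj):
    """Send first: let the failure surface and handle it afterwards."""
    try:
        return obj.speak()
    except AttributeError:
        # Caveat: this also catches an AttributeError raised inside speak().
        return "does not understand: speak"


checked_duck = speak_checked(Duck())      # 'Quack'
checked_rock = speak_checked(Rock())      # None
unchecked_rock = speak_unchecked(Rock())  # 'does not understand: speak'
```

Which style to use is a judgment call: the up-front check reads as data verification, while the second keeps the common path free of checks.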

    Much of the alleged type safety benefit of statically typed languages can be offset by a powerful development time Integrated Development Environment with innovations such as “dynamic runtime type inference”, where the classes of objects that each variable actually ends up containing can be dynamically collected and analyzed, and the information provided to the user and to the compiler to enhance dynamic language performance (see the Self language), not to mention unit testing and regression testing, which are required for statically typed languages too.

    As a result of the key forced violation of encapsulation one seriously wonders if statically typed languages really are object oriented at all. Violation of encapsulation really is an egregious violation indeed and at a questionable cost of static rigidity and combinatorial explosion of the source code. The worst hidden cost though is that the very nature of the programming results in programs which typically have radically different architectures; depending on the application domain this can have profound consequences to the ability of the users to get their work done and how rigid a program ends up being, as well as the cost to maintain and grow it.
