Here's to Dynamic Freedom!
written by Peter William Lount
August 11th, 2004
Version 1, 4:00am PDT
Version 2, 10:40am PDT
Version 3, 12:22pm PDT

Defining the types of variables using "static" variable typing is one of the most limited forms of data validation in common use. As has been pointed out, in languages like C and Java typed variables are mandatory, and they force you to think about them a lot rather than paying attention to solving your problem. It's a matter of focus. It's a matter of where you spend your time. It's a matter of how valuable your time is. Typing variables takes the same kind of mental focus and concentration that manual memory management requires. One would assume that type inference, the automatic determination of a variable's type, might help, but it also limits the "dynamism" or "range of motion" of what is permitted in the language. Typed variable languages have a certain "style" of thinking that aims the programmer in a certain direction; a direction that's hard to see when one is in it, a direction that is hard to change even when one is aware of it, a direction that leads to the overuse of type validations.

Let's expand a little on that. A type specification in C or Java lets you specify a single type for a variable. Sure, in Java and other languages like Objective-C there is inheritance, so one can specify a type and its subclasses for a variable. Typed languages introduce additional syntax for their type specifications, which adds to the complexity of the language. These simple variable type validations are usually applied at compile time rather than at program run time. In this way they are also "frozen in time" and restricted to the type that was specified in the past. Dynamic type validations applied at run time can pick the type or types of the variable at run time, an option that static type checking doesn't have. This alone provides a powerful capacity for dynamism, as a program can adapt to the needs of the moment, while a statically typed (or even a typed) program can't adapt without a rewrite, recompile, relink, reload, rerun to the exact moment...

"Static typing prevents certain kinds of failures. Unfortunately, it also prevents certain kinds of successes."
- Ned Batchelder

Typed variables can be thought of as railroads with a different width between the tracks - one track width for each type. In Europe, railways once had their own track widths, which meant that trains and train cars from one railway couldn't travel on the rails of a different railway company. The solution at first was to provide various kinds of adapters that enabled trains with one axle width to travel on tracks of a different width. Of course this caused delays when the trains needed to change to the other track width. More often the cargo had to be unloaded from one train and reloaded onto another so that it could continue on its way. In many regards this is also similar to the "relational-object" mismatch problem.

In a program the objects traveling through the program statements are like the train cars and the program variables are like the rails. If the variables are locked down with a specific type they become separate grooves that only allow that particular kind of object to travel along those pathways. Some think that this is a good thing, which they call "type safety", because it's a simple way of organizing the information flows, much like a relational database is a simple view of the data that can be pigeonholed into restricted and limited "types".

Imagine traveling a road network that had different width tracks or grooves in the road for each brand of car, truck, motorcycle, bicycle and pedestrian. It would be completely unworkable. That's exactly what a program with typed variables is like, especially as it gets larger.

In Smalltalk variables are not assigned a type, since the objects carry their own type information with them wherever they travel throughout a program. All variables may thus take on (or be assigned) any type of object at any time. You have at your disposal the full power of the Smalltalk programming language to implement any data and type validations that might be needed. In addition the language syntax is simpler, since you don't have variable type restriction definitions all over the place cluttering up the program and your precious mind space.

In many languages such as C you are required to give each variable a type. This locks it down. The designers of some languages, again such as C, didn't like this restriction, so they provided a means to adapt or "cast" a value to a different type. Casts can lead to all sorts of problems that violate the so-called "type safety" of a program, and they are generally considered bad programming style.

In C a variable is defined by specifying the type before it:
        int aCounter;

In Smalltalk there are a number of kinds of variables: instance variables, which are part of an object and are defined by the object's class definition; temporary variables within methods, which are defined near the top of the method after the method name definition and before any program statements (although some versions of Smalltalk allow them within the program statements as well, for better scoping locality); and method parameter variables, which are defined as part of the method name definition.

The Smalltalk equivalent of the above C definition inside a method would be
        | aCounter |

The difference is that the variable "aCounter" in Smalltalk can hold any object no matter what its kind or type. In C it's restricted by the type definition to be an integer - and an integer of a set bit size.

In Smalltalk the aCounter variable could be assigned an integer, a fraction, a very large integer with hundreds of digits, or even a more complex object that knows about counting in your specific application domain.
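
A minimal workspace sketch of that freedom follows. It only assumes the standard number classes found in common Smalltalk dialects; the particular values are arbitrary.

```smalltalk
| aCounter |
aCounter := 41.                    "a small integer"
aCounter := aCounter + 1.          "42, still an integer"
aCounter := aCounter + (1/2).      "85/2, now a Fraction"
aCounter := 100 factorial.         "an integer with 158 digits"
aCounter := aCounter + 1.          "arithmetic carries on; no fixed bit size to overflow"
```

The same variable holds each of these objects in turn without any declaration changing.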

For example, I wrote a segmental bridge engineering program, known as MetamericBridge(tm), that calculates an instance of a segmental bridge down to the millimeter given the specific design templates and parameters for the bridge. Engineers used to use actual chains to measure distance and they've kept the terminology for distances along a roadway. Distances of the type "Chainage" were used, where each chain is 1,000 meters. Chainage is entered and displayed as whole kilometers (1,000 meters) plus a number of meters. Chainages can be added to and subtracted from each other and converted to other numeric types such as centimeters as needed for calculations.
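
To make that concrete, here is a rough sketch of what such a domain object might look like. It is not the actual MetamericBridge code: only the name Chainage comes from the description above, while the selectors kilometers:meters:, setMeters:, meters and asCentimeters, the package name, and the km+m print format are assumptions made for illustration. The notation is Pharo-style, where "Class >> selector" names the method that follows; other Smalltalk dialects define classes a little differently.

```smalltalk
"Hypothetical sketch; older dialects use category: instead of package:."
Object subclass: #Chainage
    instanceVariableNames: 'meters'
    classVariableNames: ''
    package: 'Sketch-Chainage'.

Chainage class >> kilometers: km meters: m
    "Answer a chainage of km whole kilometers plus m meters."
    ^ self new setMeters: (km * 1000) + m

Chainage >> setMeters: aNumber
    "Private. Store the total distance in meters."
    meters := aNumber

Chainage >> meters
    ^ meters

Chainage >> + aChainage
    ^ self class new setMeters: meters + aChainage meters

Chainage >> - aChainage
    ^ self class new setMeters: meters - aChainage meters

Chainage >> asCentimeters
    ^ meters * 100

Chainage >> printOn: aStream
    "Display as whole kilometers plus meters (assumed format, no zero padding)."
    aStream
        print: meters // 1000;
        nextPut: $+;
        print: meters \\ 1000
```

Used from a workspace, an ordinary untyped variable holds it like any other object:

```smalltalk
| aCounter |
aCounter := (Chainage kilometers: 12 meters: 345)
    + (Chainage kilometers: 0 meters: 800).    "prints as 13+145"
aCounter asCentimeters.                        "1314500"
```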

In a dynamic program without typed variables the road network is like what we have in a modern city: roads that all kinds of vehicles can travel on. There is no need for special rails or grooves on the main road networks. In general it works amazingly well and it provides freedom of motion for the cars, or objects in the program. It also does something that you can't do in a locked-down typed variable program: the number of pathways that the objects can travel upon increases exponentially, as the vast majority of the network is available. When needed there are one-way streets and special restrictions such as buses only or the like; these are like the dynamic run time data validations that can be applied with flexibility in the cases where limitations are actually needed.

In the minds of many, typed variables solve a problem in that they impose a certain kind of order upon a program, an order that in limited ways can be proven "safe". It is an illusion of safety, however. Take a look at very large systems like Microsoft's operating systems, which are built upon C, C++, and now C#. Few would consider these systems "safe". In the real world of actual software systems it seems that large systems don't benefit from "typed variable type safety" as much as the proponents would wish.

In the wider scope of data validations the types of variables play a small role, one that is often better handled at run time and written in the language itself rather than in a very limited syntax for specifying a single type upon a variable. The advantage of writing type validations using the general language syntax is that you have the full expressive power of the language at your fingertips. To truly take advantage of this validation power, however, the language must support a wide range of reflection capabilities that enable writing methods that can access the types of the objects in the system. Reflection itself must be a first class capability in the language, otherwise special syntaxes and compiler support are required. This also keeps the language design clean and much simpler.
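
As a rough sketch of what such a run time validation can look like, the block below checks its argument with the ordinary reflective message isKindOf: and reports a failure through the standard exception mechanism. The name validateStep and the rule it enforces (a positive number) are made up for the example.

```smalltalk
| validateStep result |

"A validation written in plain Smalltalk; the rule itself is hypothetical."
validateStep := [:step |
    ((step isKindOf: Number) and: [step > 0])
        ifFalse: [Error new signal: 'step must be a positive number'].
    step].

validateStep value: 5.            "answers 5"
validateStep value: (1/2).        "answers 1/2; fractions are numbers too"

"An object of the wrong kind is rejected at run time."
result := [validateStep value: 'five']
    on: Error
    do: [:ex | ex messageText].   "result is 'step must be a positive number'"
```

Because the check is just code, it can test a range, a relationship between several objects, or anything else the application needs, not merely a single type.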

It's interesting and curious that simply applying a type restriction constraint to a variable can alter the style of programming, languages, tools and environments by so much. It's also very interesting that there is so much freedom to get creative, innovative and necessary work done when one isn't hampered by typed variables. In this way it's similar to the benefits we gain in our society from the principle of freedom of speech versus state control of speech.

I use automatic garbage collection because it solves a mundane problem. I use dynamic untyped languages since they don't force me to think in terms of variable types; I've got enough to do!

As for type inference systems that automate the process of typing variables, they still place the type constraints upon the variables! So while they seem to be better they are a lark since you are still restricted (you just don't have to type - as in fingers on the keyboard - the variable specification).

As for languages that allow both, I'm still thinking on that point of view. It seems that it just isn't pure, as anytime you restrict one part of a program the whole program can suffer. This is especially so when a component that has been typed - and thus constrained - gets used over and over again and then can't be used because its "predefined typed variables" choke on someone's new usage scenario with objects of an unplanned type that could have worked due to polymorphism. Ick.
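
The kind of reuse at stake can be shown with a tiny sketch. The block name twice is made up; it was written with integers in mind, yet it works with any object that understands +, including kinds its author never planned for.

```smalltalk
| twice |
twice := [:x | x + x].

twice value: 3.          "6"
twice value: (1/3).      "2/3"
twice value: (3@4).      "the Point 6@8, since Points understand + as well"
```

Had the parameter been declared as an integer, the last two uses would have been ruled out before the program ever ran.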

If I've learned anything during my twenty-five-plus years of programming, it is that programs rarely take shape as planned, especially if you expect to gain any serious amount of reuse out of them! I much prefer the power and flexibility of run time type validations and the full range of data validations that can be applied at run time in the cases where they are truly necessary.

Here's to dynamic freedom!

Oh ya, thank you Alan Kay, Dan Ingalls, and the original Smalltalk team for sparing us from the straitjackets of typed variables and the limited style of programming that they impose!


Copyright 1999-2010 by Smalltalk.org, All Rights Reserved.