Thinking in Types
- 12/3/2017
- #development
- #typescript
Types are an unavoidable detail of computer programming. Whether they’re set explicitly or not, data types represent the assumptions we’ve made about a program–its inputs and outputs, of course, but also the structure and behavior of internal state. Even when a legacy code base or institutional challenges prevent us from fully embracing them, thinking in terms of types will yield software that’s clearer and easier to maintain.
First, some background.
Four types
Programming languages approach typing along two axes. One axis ranges from “weak” to “strong” typing and distinguishes type systems that will not allow unexpected datatypes (strong typing) from those that will make every possible accommodation before throwing a type error (weak typing).
The other axis separates “static” and “dynamic” type-checking, where the static variety happens at compile time and dynamic somewhat later. Dynamic languages still have type errors. They just aren’t raised until they happen.
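To see a dynamic type error arrive "somewhat later," here is a minimal sketch. The lengthOf helper is a hypothetical name invented for this illustration, and its parameter is typed any to mimic an unchecked, dynamic call:

```typescript
// Hypothetical helper: `any` opts out of static checking, as a dynamic language would.
function lengthOf(value: any): number {
  return value.length;
}

let caught: unknown = undefined;
try {
  lengthOf(null); // nothing complains until this line actually runs
} catch (err) {
  caught = err;
}

console.log(caught instanceof TypeError); // true
```

The mistake was there all along; it only surfaces once the offending call executes.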
When we’re thinking in types, we’re usually thinking in terms of the strong, static sort. Many of the languages that fall into this quadrant are sometimes criticized by dynamic-language aficionados for the paperwork they add, but–as we’ll see shortly–the same rules of the type system still exist whether we set them intentionally or not.
Weak types
Imagine you’re the interpreter of a generic, weakly-typed language. You’re laying low in memory, just minding your own business, when along comes a developer and does what developers do. “Hey! Interpreter! Run me this thing.”
function double (a) {
  return 2 * a;
}
In our hypothetical language (which just happens to look rather like JavaScript) “this thing” is double, a function that will multiply its lone argument by two. We’ll gamely run double for any type of input we can think of, though the result may not always be what the developer expects:
double(2) // 4
double('2') // 4
double(['2']) // 4
double('two') // NaN
Duck, duck, NaN–a little mystery, but utterly self-inflicted. In a language with static typing (that just happens to look rather like TypeScript), the type signature of double would look something like this:
declare function double (a: any): any;
In other words, the function definition gives us negligible information about the input or return type of double. When arbitrary data is passed in, it’s all an interpreter can do to make an educated guess (within certain pre-existing rules) about how it should be handled.
What we do know, however, is that double should return the product of a Number (2) and the input a. This provides a useful clue. JavaScript represents * in infix notation, but consider how multiplication would appear if written as a function:
declare function * (a: Number, b: Number): Number;
Since the * operation will always return a Number, the interpreter can now fill in the missing return type in the original declaration.
declare function double (a: any): Number;
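TypeScript's own inference follows much the same reasoning. In this sketch (an illustration only, with fromString and fromWord as made-up names) the return annotation is left off, yet both results can be assigned to number because * always yields one:

```typescript
// No return annotation: `2 * a` has type number, so the return type is inferred as number.
function double(a: any) {
  return 2 * a;
}

const fromString: number = double('2'); // coerced to 2, then doubled
const fromWord: number = double('two'); // NaN, which is still a number

console.log(typeof fromString, typeof fromWord); // number number
```

Note that the inferred return type says nothing about whether the value is useful; NaN satisfies it just as well as 4.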
At this point, all that’s left is the small matter of deciphering a. Here, too, we have a clue: just like its return value, both operands of * must be numbers. If we provide an a that is not a number, the interpreter will attempt to satisfy this constraint by “fixing” our mistake.
2 * 2 // (a: Number, b: Number)
2 * '2' // b: coercion (String => Number, `b = 2`)
2 * ['2'] // b: coercion (Array => String => Number, `b = 2`)
2 * 'two' // b: coercion fails (String => Number, `b => NaN`)
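The coercions above can be made visible by calling Number() ourselves before multiplying. This is only a rough sketch: the operands, coerced, and products names are invented for illustration, and Number() stands in for the conversion that * performs implicitly on these particular inputs.

```typescript
// The same four right-hand operands as in the listing above.
const operands: unknown[] = [2, '2', ['2'], 'two'];

// Step 1: convert each operand to a number explicitly.
const coerced: number[] = operands.map((b) => Number(b));

// Step 2: multiply. By now every operand really is a number.
const products: number[] = coerced.map((b) => 2 * b);

console.log(products); // [ 4, 4, 4, NaN ]
```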
All of this is to say that a deep enough dive under the hood will eventually reveal an operation where we have to know a type. Processors know nothing of String, Array, or Number; to add we need two numbers of fixed size. Explicit or not, types are necessary details. Weak typing simply leaves it to the interpreter to deduce (or guess) how the puzzle-pieces ought to fit together. If we want double to take numbers, we have to give it numbers. If we don’t know what to give double, we need to be prepared for whatever mysteries fall out. Even when they’re implicit, clear types make for easier reasoning and fewer errors.
Strengthening weak types
But maybe we want to guarantee types anyway. In JavaScript, for example, developers will sometimes deploy type guards to bound the range of data types they need to worry about. If we only want to handle numbers inside double, for instance, there’s an easy way to enforce it:
function double (a) {
  // Reject anything that is not already a number instead of coercing it
  if (typeof a !== 'number') {
    throw new TypeError('a is not a number!');
  }
  return 2 * a;
}
In effect, we’ve lifted one tiny section of weakly-typed code into something resembling strong typing. Heavy-handed, perhaps, but we can now trust a to be numeric.
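If the same guard is needed in more than one place, it can be pulled into a helper. The following is a sketch under assumptions rather than a recommended pattern: isNumber is a hypothetical name, and the input is typed unknown to stand in for data whose type we have not yet established.

```typescript
// Hypothetical reusable guard: returns true only for values whose typeof is 'number'.
function isNumber(value: unknown): value is number {
  return typeof value === 'number';
}

function double(a: unknown): number {
  if (!isNumber(a)) {
    throw new TypeError('a is not a number!');
  }
  return 2 * a; // past the guard, `a` is treated as a number
}

console.log(double(2)); // 4
```

Calling double('2') with this version throws the TypeError instead of quietly returning 4. Like the original guard, it still lets NaN through, since typeof NaN is 'number'.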
While it’s possible to shore up weak typing, doing it ourselves throughout a non-trivial codebase tends to be non-performant, exhausting, and error-prone. “Don’t fight the language,” the old axiom goes, and user-defined type checking comes perilously close.
If we want types and the context allows, maybe there’s an easier way.
Types
Contrast the mix of guesswork and defensive coding we employed in our weakly-typed language with the type declaration we would give up front in a strong, statically-typed variation:
declare function double (a: number): number;
We’ll no longer be able to double('2') without explicitly casting the string, of course, but with this definition we can now trust that double will only receive numeric arguments. As an added bonus, TypeScript is statically-typed, meaning that all types must check out at compile time (JavaScript, a dynamic language, will perform its sporadic checks only at runtime).
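A minimal sketch of what that looks like in practice, assuming the obvious implementation behind the declaration (the fromInput name is made up for illustration):

```typescript
function double(a: number): number {
  return 2 * a;
}

// double('2'); // rejected by the compiler: a string is not a number

// The conversion now has to be spelled out at the call site.
const fromInput: number = double(Number('2'));

console.log(fromInput); // 4
```

The coercion has not gone away. It has simply moved out of the interpreter's hands and into code that a reader can see.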
While weak types are extremely flexible (we can double string arguments without casting them!), even in a toy example they can lead us to unpleasant surprises. With static typing we’re forced to declare intent, rather than burying it away for others to stumble across.
Type systems taken to their logical extreme can add considerable complexity with only marginal return. But applying them pragmatically–or at least conceding their necessity–can lessen guesswork and make software easier to maintain. That doesn’t mean aggressively guarding the types in every. single. path of a dynamic language, but it does mean characterizing overloaded functions, frequent existence checks, and type coercion beneath the input layer as smells to be addressed. Variables should be consistent; functions should be clear; and data inside the program should rarely concern itself with null or undefined.
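One way to read that advice is to do all the checking and coercing once, at the input layer, and let everything beneath it assume clean data. The sketch below is only an illustration of the idea: parseQuantity and its error messages are hypothetical, and a real program would pick its own rules for what counts as valid input.

```typescript
// Hypothetical input-layer function: the only place that deals with
// missing values and string-to-number conversion.
function parseQuantity(raw: string | null | undefined): number {
  if (raw == null || raw.trim() === '') {
    throw new TypeError('quantity is required');
  }
  const parsed = Number(raw);
  if (Number.isNaN(parsed)) {
    throw new TypeError(`quantity is not numeric: ${raw}`);
  }
  return parsed;
}

// Beneath the input layer: no existence checks, no coercion, one clear type.
function double(a: number): number {
  return 2 * a;
}

console.log(double(parseQuantity('21'))); // 42
```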
We’ll never save ourselves from bad assumptions, of course, but we can always strive to produce simpler, safer, more reliable code. Even in dynamic, weakly-typed languages, thinking in types is a good place to start!