Thursday, May 23, 2013

Structural Typing

Scala tries very hard to feel like a dynamic language despite the fact that it is rooted in the statically-typed Java Virtual Machine (JVM). One of the ways in which it does so is through a language feature called structural typing. If you have ever used Python, Ruby, JavaScript, or a number of other languages, you are probably familiar with duck typing; that is, "[if it] walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck." Because these languages are dynamically-typed, you can write code that calls any method on an object, and at run-time the code can potentially crash if it turns out that the object does not have that method (i.e. it is not a duck). The beauty of the statically-typed world is that this is (nearly) impossible because the compiler will ensure that everything checks out before allowing you to continue. The drawback, then, is that even if you know an object has a particular method, unless its type at compile-time has that method you cannot call it. Structural typing is a way of alleviating this restriction.

Let's look at a quick example with two Java classes: PrintStream and PrintWriter. As it turns out, they both have the methods println(String s) and flush(). Unfortunately, they share no common superclass or interface that has these methods, so if we wanted to write a method that relies only on an object's ability to print and flush, we would need two separate such methods, one for PrintStream and one for PrintWriter. The pure Java solution is to add an interface with  println(String s) and flush() and make both PrintWriter and PrintStream implement them; that would work if you owned both classes or could modify the core Java libraries, but for most people that is not an option. Structural typing is essentially a way to emulate that behavior without actually modifying the classes. Let's see how this looks:

So we define a new "type" called Printer, which has exactly those two methods on it. Then we can write the printAndFlush() method that takes in a Printer and knows at compile-time that both the print and flush methods must exist on any object that is passed in. At the end we call printAndFlush() once with a PrintStream and once with a PrintWriter, achieving our goal. Again, the important thing to realize is that type safety is still guaranteed at compile-time, i.e. that any type passed in does indeed have those two methods, so it impossible for printAndFlush() to throw an exception saying that one of the methods does not exist.

An interesting question is how Scala manages to make structural types work in the JVM, which is based on a purely nominal type system (like Java itself). The method signature must have a named type for its parameter where no such type exists, so what do you do? The answer is a little underwhelming: it turns out that the Scala compiler throws away the structural type information after verifying type safety and the method signature in the bytecode takes in an Object. That means that the actual method call is dispatched via reflection and thus has a number of problems ranging from performance to reliability, which is quite disappointing. Fortunately, Java 7 has a new feature known as invokedynamic which is designed to help with dynamic method invocations much like those introduced by structural types, so we can hope that in the future reflection will no longer be needed to support this feature. Scala and other JVM-based languages really push the limits of what the JVM can do, and it's great to see that it is evolving to meet these challenges; one can imagine that someday a language which is not Java becomes the primary interface to the JVM, leveraging the decades of hard work that have gone into it.

No comments:

Post a Comment