Sunday, June 2, 2013

Homonyms by Any Other Name: Types

In college I started getting really interested in programming languages.  Somehow, I discovered Lua at approximately the same time I was going through a compilers course.  Once I got my first job, I decided that it was important to understand as many different programming languages as possible.  

Not entirely sure why I did that.  Maybe I had a vague understanding that I derived a great satisfaction from figuring out how things work.  But whatever the reason I spent a lot of time and effort figuring out as much as I could about programming languages.  

I started with huge lists.  Features that I found in different languages.  I wanted to bunch them all together.  Be able to use all of them proficiently.  But as time went by, I noticed that my lists were shrinking.  I had begun to understand the mechanisms behind different features, and I could categorize massive sections of a single language behind a single word.

Types were kind of difficult to figure out though.  What exactly were these things.  People seemed to categorize them into different categories.  Dynamic types, static types, strong types, weak types.  And people had wildly different thoughts as to which type of type was best.  And then more people complained that dynamic, static, strong, and weak were all bad categories and they didn't really correctly classify types.  It was almost like the word "type" meant different things depending on the context.  And I think that is actually exactly what is going on.

Normally, people would say that types prevent you from making mistakes.  But that's not really what happens in the C programming language.  In C you can cast anything to anything and the compiler will just assume you know what's up.  In C, types mean what kinds of assembly opcodes and offsets the compiler will generate.

Well, maybe that's just C ... in C# types prevent you from making mistakes.  Although ...

void fluAdd( IList<int> list ) { 
    list.Add( 5 );
}

fluAdd has an interface.  This allow us to know that no matter what happens, we will always be passed an object that has an Add method on it.  This protects us from failures.  Until something like this happens:

fluAdd( new List<int>().AsReadOnly() );

So what does happen if you try to add something to a readonly collection ... well you get a NotSupported exception.  Which sounds kind of familiar.  Hmm.

>> 5.add( 5 )
NoMethodError: undefined method `add' for 5:Fixnum
from (irb):1

Aahh!  Ruby's NoMethodError is basically the same thing as C#'s NotSupported exception. So C#'s types don't protect you from making mistakes.  Why they're no better than what Ruby gives us.  Well, don't go quite so far.  Notice that Ruby is letting me try to add a list entry of '5' to the number '5'.  C# won't let you do something that far off the rails.  So while it is true that C#'s types won't protect you from making any mistakes, it isn't true that you might as well have dynamic types.  C#'s types lower the number of possible mistakes you can make.

Okay, so what about Haskell.  People are always saying that Haskell has a really powerful type system and comes from real, hardcore type theory.  Well this is partially true.  Yes, Haskell uses real type theory in its types.  Type theory has this great concept called Soundness.  Basically, it boils down to being able to prove that the type of a function doesn't change while you execute it *and* every given step of the execution yields a sensible result that can continue to be executed.  From this comes the saying, "well typed programs don't get stuck".  So the issue with Ruby's NoMethodError or C#'s NotSupported exception shouldn't happen in a programming language with Sound types.  

But I mentioned "partial" truth didn't I?  The first is that Haskell's type system isn't really all that powerful, when compared to other possible candidates from type theory.  Maybe you want to type your types ... there's a type system for that.  Or maybe you want to totally specify every aspect of your program that is mathematically possible ... there's also a type system for that.  Haskell resides in a middle ground that balances out what you can make safe with types and what is practical to make safe with types.  But there are extensions to Haskell and different languages which improve on what you can make safe by using types.

The second is that Haskell also has Type Classes.  Type Classes don't really have much to do with type theory at all.  But they do allow you to specify functionality based on what a type happens to be.  Additionally, Recursive types (totally a type theory thing) are expressed with Algebraic Data Types (not necessarily a type theory thing[1]).  Then comes in Generalized Algebraic Data Types, existential types, universal quantification, polymorphic kinds, type families, functional dependencies, the list goes on.  Even if you want to ground a programming language completely in type theory, you still have to decide how to implement that and how the programmer will be interfacing with the types.  This means that if all you know is type theory, you will still need to understand a type system of a particular programming language in order to truly understand that types are.

Of course all this really only focuses on static type systems.  Even in the dynamically typed world, the semantics of type isn't this one monolithic thing that is the same in all contexts.  Consider Ruby:

def ikky( a )
    a
end

ikky( 1, 2 )

=> ArgumentError: wrong number of arguments (2 for 1)

So, there aren't any compile time checks to make sure ikky is containing the correct types, but there are runtime checks to make sure that at least the correct number of arguments are being passed to ikky.

Now consider Lua:

function ikky( a )
    return a
end

ikky( 1, 2, 3, 4 )

=> 1

In Lua, you can pass as many arguments as you want to a function, and it will totally execute the function regardless of how the argument count matches up with the function definition.  You can even pass no arguments to a function that takes many arguments and everything will work fine[2].  

So just because you decide to move to the dynamically typed world, doesn't mean you can ignore what "Type" actually means.  The meaning of this word still depends significantly on where you are.

Whether you want to learn more about types, win the next language "discussion" you take part of, or write good software that works, make sure that you remember that word "type" depends a lot on the context.

[1] - Okay, so yeah, ADTs are totally a theoretical thing which has mathematical properties, etc.  But the fact of the matter is that you can have a recursively typed, system F lambda calculus that covers all of the things you will see with ADTs, but still not actually know how to use ADTs.  ADTs are a language level interface with the type checker that may have language specific concerns.

[2] - This is of course for varying definitions of the word 'fine'.  If a parameter doesn't have an argument, then it gets set to nil.  Which may or may not be a problem depending on how the function is written.