Friday, December 27, 2013

Figuring out how Category Theory works

People occasionally ask me what I do.  Kind of a weird question really.  Like, I'm pretty big on breathing.  I try to keep on doing it *all* the time.  So statistically speaking the best answer should be something about that.  But that's not what they mean when they ask me what I do.  They want to know about my job.  Again, this seems kind of like a strange question to me, but perhaps it's just a safe question.  I imagine most people have something to say about their job, so people have learned that if you have to engage in conversation with a stranger that you might try asking them what they do.

When I answer this question, I tell a story about developing desktop applications that communicate and process data from a wide range of devices, typically medical devices.  This provides a satisfactory direction for the conversation to travel in.  Or at least it seems to.

But this isn't really representative of what I do.  It answers the intended question, but ignores the literal question.  What I do, is figure out how things work.

I've spent quite a bit of time figuring out how programming languages work.  And I'm not just talking about compilers.  I've also been very interested in understanding the meaning behind different programming language constructs.  A great example is the various flavors of macros from various flavors of Lisp.  It's not entirely obvious what a macro is, and at the same time there is no shortage of talk concerning "macros" when you start to look into them.  It takes awhile to sort everything out when your upbringing involves a bunch of plain C with a helping of "don't use macros; they're bad for you."

Anyway, eventually I bumped into Haskell and at this point monads and category theory reared it's … well I wouldn't say "ugly", but definitely mysterious and mostly incomprehensible head.  If you've been present in the online, software engineering, y-combinator-ish [1] crowd then by now you've figured out what kind of blog post this is.  This is a "monads are like …" article.  Don't panic though, I suspect things will be a little bit different here.

Monads are like abstract mathematical objects with some properties

Okay, so the heading here isn't quite accurate.  Monads aren't *like* abstract mathematical objects with some properties; monads *are* abstract mathematical objects with some properties.

Here's the thing.  I noticed at some point that mathematicians don't seem to be interested in the same things that software engineers are interested in.  This of course makes sense because if it wasn't true then we wouldn't need different words for mathematicians and software engineers; they would be the same profession.  

Mathematicians seem very interested in taking abstract objects, describing their properties, and then proving different things that you can do with these objects.  Software engineers are interested in generating code that solves problems.

With this in mind, consider where Category theory comes from.  It starts with abstract algebra[2].  Abstract algebra is more or less the properties you see on number systems, but you break things down to be more simple or you generalize to handle more cases.  The result is a bunch of abstract objects that have certain properties.  Move forward by proving new statements.  Kind of an exploration of things that don't concretely exist.  Group theory and ring theory both seem to be rather big.  The extended study of two different kinds of abstract objects.

Next check out algebraic topology[3].  I'm not entirely sure how to introduce this field.  It's kind of like the algebraic study of shapes, but the shape part doesn't matter so much … at least how muggles normally think of the concept of shapes.  Anyway, there's a bunch of fields within algebraic topology.  Knot theory, homotopy, low dimensional algebraic topology, etc.  There's plenty to study.

This is all the origin of category theory[4].  If you don't really want to investigate category theory any further, then perhaps you don't need to think much about group theory or knot theory.  But if you do check out category theory in more depth (like maybe by reading this book [5]), then you're going to find a *lot* of references to these fields (and other's not mentioned here).  

Category theory deals with abstracting patterns found in other mathematical fields.  You'll see a lot of concepts concerning groups and rings and sets, but ultimately I believe the motivation was to examine different concepts in algebraic topology.  And in the end category theory starts examining itself as an object and things get kind of interesting.

Which brings us back nearly to where we started.  Mathematicians have their own job and it's not the same job as programmers or software engineers or computer scientists.  So what does category theory (and by extension monads) have to do with writing software?  Well for the most part it doesn't.

At least until you take a look at type theory.

Types are Categories

It turns out that if you have a type system like Hindly-Milner[6] or perhaps more formally System F [7] (although I think anything on the Lambda Cube works out [8]).  The type system fits the definition of a category in Category Theory.  

This means that theorems that you can work out in Category theory hold true for certain programming constructs when the proper type system is attached.  

So technically you don't need Category theory.  There's more than one way to prove that something is true.  Most computer programs don't need the full generality that Category theory provides.  You really only need to prove that your function works correctly for a relatively small number of possible inputs (at least most of the time).  But if you do have a proof in Category theory and your type system qualifies, then why not use the results.

Monads themselves are useful not because they are *like* any particular thing, but because there's a bunch of proofs associated with them.  So if you follow all of the axioms associated with a monad, then you get all of the assurances of one for free.  And for a pure functional language like Haskell[9], you can use monads to represent effects (like IO or changing state) and your type system doesn't go crazy.  More specifically, I think you're able to use results from Kleisli Categories [10] (which you can use to get the corresponding monad) to ensure that functions that go from A to B to C still type check when they instead go from IO A to IO B to IO C.  And in this instance IO <whatever> just means a function that has an IO effect embedded inside of it somewhere and it returns something of type <whatever>.

Is that all?

Okay, so you can use Category theory to make sure your Haskell type system works correctly.  If that's it, then can we just ignore it?  LIke assuming we stay away from Haskell.

Well, for the most part, sure why not.  But unfortunately, you have to watch out for two things.  The first one is relatively simple.  You have to be able to refute people who try to sell you snake oil because it runs on the power of monads and categories and complicated math stuff.  The second is a little more complicated.

It turns out people are doing crazy things with category theory.  Like, they're getting Haskell to go faster than C [11].  This is kind of a big deal because common sense tells you that high level language means it's easier to program, but slower than low level languages.  

I *think* the reason that it works in Haskell is because someone figured out that you can get some surprisingly powerful computations to fit into a sub-turing complete system by modeling your recursive data structures non-recursively using F-Algebras [12].  Then you feed the whole thing to your processor's SSE [13] instructions.

Again, you could probably make this work without the Category theory proofs, but it is a bit suspicious that the guys using Category theory got there first.

The downside is that you can't just assume everyone using Category theory to advertise their product is a crank.  The upside is that you don't have to think of monads as shoeboxes anymore.  

[5] - Categories for the Working Mathematician; ISBN 978-1-4419-3123-8

Exploiting Vector Instructions with Generalized Stream Fusion