Sunday, July 15, 2012

Lambda calculus preference

So, why do I like lambda calculus?  Actually, that's probably kind of a hard question to answer, and if I went to the trouble I'm not sure that the process would really be beneficial for anyone.  I like it because I do?  Whatever.

Why might lambda calculus be useful?

First things first.  In lambda calculus nothing comes from nowhere.  This is actually true in other logics as well, although the effect isn't as pronounced.  For example, in natural deduction you have axioms, which are propositions that are just assumed to be true.  They're stated with no proofs.  They seem to be coming from nowhere.  However, that's not really what a natural deduction proof means when you encounter an axiom.  What it actually means is that if you happen to be loitering around in the world and you encounter a proposition A and an inference rule A -> B, then you can assume the proposition B.  The axioms aren't things that come from nowhere; it's more like a pattern matching transformer.  You see this stuff, you can get that stuff.

As far as I can tell, with most logics you don't care about the internal structures of your axioms.  So it's easier to pretend that they come from nowhere and get on with the proof.  However, with lambda calculus you do have to care somewhat about the structure of the parameters of your functions.  If you misuse a parameter, your computation could yield a totally worthless value or it might not even be possible to finish your computation.

This should mean that lambda calculus is a good medium to use in order to think about and understand causality and causal machines.  So, for example, computers are causal machines.  Does someone want you to do something with a computer?  Well does it seem like it would be possible with lambda calculus?  Then it should be possible with a computer.

More pragmatically, if you really like lambda calculus, then you won't use singletons unless it actually happens to be a good idea for your situation.

Secondly, lambda calculus only has immutable data structures.  This is similar to the previous point.  If you're in a language that allows mutable data structures and references (i.e. most of them, and all industry languages), then a mutable data structure can be used as a hidden communication channel.  And maybe there are some situations where you want to be able to enact change in another part of your program without actually going over there and enacting change.  But most of the time this leads to code that is difficult or impossible to understand, and to bugs.  If you really like lambda calculus then you won't tend to use hidden channels of communication.

Of course you have to be careful with how you try to apply lambda calculus to programming.  Lambda calculus is Turing complete (in fact it was Turing complete before it was cool).  This means that you can compute any function; it does not mean you can implement any algorithm.  For example, lambda calculus doesn't have IO, which means you can never communicate with anything.  If you have a problem which requires you to monitor the state of some external device, lambda calculus is probably not the best tool for the job.  If your environment doesn't have garbage collection, maybe you should pass on lambda calculus.

Overall though, today's boring industry jobs are favoring more and more environments that are garbage collected and provide first class anonymous functions.  Best practices are starting to converge on immutable data structures, pass by value semantics, and some type of dependency injection instead of singletons.  I think this meshes closely enough with lambda calculus to make it a useful tool in the hands of developers.

Lambda calculus

Lambda calculus is pretty simple.

λ x . x


(λ x . x) x

x x

(λ x . x x) x

That's more or less all of the types of things that you are likely to see in lambda calculus.  

Functions are the only thing in lambda calculus.  You can create functions with λ and you can call functions.  Abstraction and application.

λ x . x x 
(λ x . x x) x

Function application is left associative.  So "x x" is the function 'x' taking itself as a parameter.  "x y z" can be rewritten "(x y) z", meaning that 'x' takes 'y' as a parameter and returns another function which takes 'z' as a parameter.  If you want things to evaluate from the right you have to use parentheses.  "x (y z)", the function 'y' takes the function 'z' as a parameter and then the result is passed to the function 'x'.
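Since most working languages require explicit parentheses for application anyway, the grouping rules can be checked directly.  Here's a small sketch in Python (the curried add and neg functions are my own toy examples, not standard notation):

```python
# Curried addition: the lambda calculus "add 2 3" groups as "(add 2) 3",
# which in Python is add(2)(3).
add = lambda a: lambda b: a + b
print(add(2)(3))        # 5

# Parenthesizing from the right changes what gets applied first:
# "add 2 (neg 3)" applies neg to 3 before handing the result to add.
neg = lambda a: -a
print(add(2)(neg(3)))   # -1
```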

Functions in lambda calculus only have one parameter, but multiple parameters can be simulated by returning multiple functions from the original one (see currying for more information about this type of thing).

(λ x . λ y . x y)

It's not a big deal to just create functions that take multiple parameters and then just assume that the real deal uses currying to achieve the same effect.

(λ x y z . x z ( y z ) )
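In Python, which only has single-parameter lambdas for our purposes here, currying is just nesting.  A sketch of the two examples above (the names apply_, s, and k are my own labels):

```python
# (λ x . λ y . x y): a curried two-parameter "apply" function.
apply_ = lambda x: lambda y: x(y)
print(apply_(lambda n: 2 * n)(10))   # 20

# (λ x y z . x z ( y z )) desugared to λ x . λ y . λ z . x z (y z).
s = lambda x: lambda y: lambda z: x(z)(y(z))
k = lambda x: lambda y: x            # (λ x y . x)
print(s(k)(k)(42))                   # 42 -- S K K acts as the identity
```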

To calculate function application you just replace the bound variable (the one introduced with λ) with the argument, everywhere the variable appears in the function's body.

(λ x . λ y . x y) (λ z . z) ( λ w . w )

= ( λ y . ( λ z . z ) y ) ( λ w . w )
= ( λ z . z ) ( λ w . w )
= ( λ w . w )
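The same reduction can be checked mechanically in Python, where calling a lambda performs the substitution for us:

```python
# (λ x . λ y . x y) (λ z . z) (λ w . w), written with Python lambdas.
result = (lambda x: lambda y: x(y))(lambda z: z)(lambda w: w)

# The trace ends at (λ w . w), the identity, so the result should
# behave like the identity function.
print(result(42))   # 42
```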

What happens if one of the variable names of a function you are inserting is already used in your function?

( λ x y . x y ) ( λ y . y ) ( λ w . w )
= ( λ y . ( λ y . y ) y ) ( λ w . w ) ... um ...

This doesn't seem right.  In this particular example we get lucky, because the function being inserted doesn't reference anything outside of itself.  But if it contained a free 'y', that 'y' would get captured by the outer function's 'y', and you would get a different answer than if the inserted function had variable names that were different than 'y'.  This is solved with alpha conversion, which basically says that you can rename the names of function variables so that they don't conflict with another function's variables.

( λ x y . x y ) ( λ y . y ) ( λ w . w )
= ( λ y . ( λ y . y ) y ) ( λ w . w ) ... quick alpha conversion
= ( λ y . ( λ z . z ) y ) ( λ w . w )

There's also some other rules that involve Greek letters.  Beta reduction is function application: rewriting the bound variable with the argument's value.  This is just the function application that we covered before.

There's also eta conversion, which says that a function whose body just applies some other function to the parameter is equivalent to that other function (as long as the parameter doesn't appear free inside it).  This comes in handy if you want to lazily or conditionally evaluate an equation.

( λ x . z x ) = z
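A quick Python sanity check of that equivalence (z here is an arbitrary stand-in function I picked):

```python
z = lambda n: n + 1       # any function will do
wrapped = lambda x: z(x)  # (λ x . z x)

# Eta conversion says the wrapper behaves exactly like z itself...
print(wrapped(4) == z(4))   # True

# ...but the wrapper's body isn't evaluated until it's called, which
# is the lazy/conditional evaluation trick mentioned above.
```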

Also of note is the Y combinator.  So far I haven't mentioned any way to globally name a function, and that's because there is no way to globally name a function.  This is a problem: you can't have recursive functions, because a function has no global name with which to call itself.  Lambda calculus doesn't have any other type of iteration, so things suddenly become kind of worthless unless we can find a way to pass a function into itself, so that it can call itself.

Y = λ f . ( λ x . f (x x) ) ( λ x . f (x x) )

The Y combinator doesn't really work in languages that evaluate all of the parameters of a function before evaluating the function itself (however it works fine with lazy evaluation or if you are manually rewriting the function on paper).

Z = λ f . ( λ x . f ( λ y . x x y ) ) ( λ x . f ( λ y . x x y ) )

We just use the eta conversion trick we learned earlier to rewrite the Y combinator into an equivalent function that delays the self-application until 'f' actually uses its argument.  I've seen this called the Z combinator in a couple of places.
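Python evaluates parameters eagerly, so it's a good place to see this in action: the Y combinator as written would recurse forever there, while the Z combinator terminates.  A sketch, with factorial as my own choice of 'f':

```python
# Z = λ f . ( λ x . f ( λ y . x x y ) ) ( λ x . f ( λ y . x x y ) )
Z = lambda f: (lambda x: f(lambda y: x(x)(y)))(lambda x: f(lambda y: x(x)(y)))

# 'f' receives "itself" as a parameter and uses it to recurse.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
print(fact(5))   # 120
```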

Numbers can be represented with functions, called Church numerals (after Alonzo Church, who invented lambda calculus).

Zero = λ f . λ v . v
One = λ f . λ v . f v
Two = λ f . λ v . f ( f v )
Ten = λ f . λ v . f ( f ( f ( f ( f ( f ( f ( f ( f ( f v ) ) ) ) ) ) ) ) )

This is the same idea behind the Peano axioms.  However, instead of having a number which is just a blob of data, these numbers represent calling a function on a value some number of times.  In other words, the number ten is like a loop which runs ten times.
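Church numerals drop straight into Python.  A sketch (to_int is a helper I'm adding to count the applications; it isn't part of lambda calculus):

```python
zero = lambda f: lambda v: v
one  = lambda f: lambda v: f(v)
two  = lambda f: lambda v: f(f(v))

# Count how many times f gets applied by using "+1" as f.
to_int = lambda n: n(lambda x: x + 1)(0)
print(to_int(two))   # 2

# "The number ten is like a loop which runs ten times": a numeral
# really does repeat an action that many times.
print(two(lambda s: s + "!")("hi"))   # hi!!
```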

Here's an increment function:

Inc = λ number . λ f . λ v . f ( number f v )

The intuition is pretty straightforward here.  Take a number, and then take the f and v that the number would have taken.  Run the number with the original f and v.  Finally, run the result one more time through f.  Conceptually it's like you've added one more f to the series of fs that originally created the number.

After that, addition isn't too difficult.

Add = λ number1 . λ number2 . number1 Inc number2

Here I'm just inserting the 'Inc' function's name instead of writing the whole thing out.  This isn't legal lambda calculus, but it is a good shortcut to make the equation recognizable.  I could just replace 'Inc' with the equation I wrote above to make my Add equation legal.

But anyway, what's going on with Add?  Well, I'm running the Inc function number1 times on the value number2.  In other words, I'm adding one to the second number a number of times equal to the first number.  Let's try it with normal numbers to get the intuition.

5 + 4
= 4 + 1 + 1 + 1 + 1 + 1
= 5 + 1 + 1 + 1 + 1
= 6 + 1 + 1 + 1
= 7 + 1 + 1 
= 8 + 1 
= 9

Okay, looks good.
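The same arithmetic can be machine-checked in Python (zero and my to_int helper are repeated here so the snippet stands on its own):

```python
zero = lambda f: lambda v: v
to_int = lambda n: n(lambda x: x + 1)(0)

# Inc = λ number . λ f . λ v . f ( number f v )
inc = lambda number: lambda f: lambda v: f(number(f)(v))

# Add = λ number1 . λ number2 . number1 Inc number2
add = lambda n1: lambda n2: n1(inc)(n2)

# Build 5 and 4 by incrementing zero, then add them.
five = inc(inc(inc(inc(inc(zero)))))
four = inc(inc(inc(inc(zero))))
print(to_int(add(five)(four)))   # 9
```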

The decrement function is pretty crazy.  I had to look it up in order to figure it out, and even then I had to think about it for a very long time to convince myself about what it was doing and that it made sense.  There's a couple of different ways to do this; I'm picking the one I first got the hang of.

Dec = λ number . λ f . λ v . number ( λ a . λ b . b ( a f ) ) ( λ u . v ) ( λ x . x )

The best way to figure it out is probably just to run through it... Which I'm not going to do right now because it's going to take forever.  I'll try to write a simple lambda calculus machine sometime that works via parameter rewriting and saves all of its steps.  That way I can just implement decrement and then copy and paste the output.  
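In lieu of a full paper trace, here's the next best thing: transcribing Dec into Python and letting the interpreter do the rewriting.  Note the grouping: number gets three separate arguments.  (zero, inc, and my to_int helper are repeated so this runs on its own.)

```python
zero = lambda f: lambda v: v
inc = lambda number: lambda f: lambda v: f(number(f)(v))
to_int = lambda n: n(lambda x: x + 1)(0)

# Dec = λ number . λ f . λ v . number (λ a . λ b . b (a f)) (λ u . v) (λ x . x)
dec = lambda number: lambda f: lambda v: (
    number(lambda a: lambda b: b(a(f)))(lambda u: v)(lambda x: x)
)

three = inc(inc(inc(zero)))
print(to_int(dec(three)))   # 2
print(to_int(dec(zero)))    # 0 -- decrementing zero stays at zero
```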

There's also a bunch of other stuff that goes on with lambda calculus.  However, I don't appear to be that interested in it.  I like to know how the machine works, but I don't like to know the endless commentary on weird patterns that show up when the machine is actually working.  Sometimes that sort of thing is good to know because it will save you from spending your time figuring out things other people have already figured out (some of that stuff takes hundreds of years for people to figure out ... well not for lambda calculus because it hasn't been around that long, but things like Fermat's last theorem come to mind).  However, most of the time it seems like reading things other people have already figured out is boring because it's like reading about how someone else had a very fun time solving a sudoku puzzle.  I want to be the one solving the sudoku puzzle, not reading about how someone else noticed that 3 was already used in this box so the correct number had to be 9.

Wednesday, July 4, 2012

Haskell ADT

ADT stands for Algebraic Data Type.  The name's not that important; here's what it is.

data Device = USB | Bluetooth | Serial

A couple of things to pay attention to. 

"data" is Haskell's keyword for creating a new ADT.  

"Device" is the name of the new ADT.  This also shows up in type signatures, but we can tackle that later.  Also note that "Device" is capitalized.  This is a Haskell thing.  Haskell wants type names capitalized.  This convention is enforced to remove some ambiguity from Haskell's syntax, which results in programs that are less verbose.

"USB", "Bluetooth", and "Serial" are all data constructors (you'll also see them called value constructors).  These things are basically functions that are specifically created in order to bring new objects into existence.  Normally, functions are supposed to be lower case, but data constructors need to be capitalized.  Same reasoning as with the type names.

The "|" is a bit of syntax from BNFs.  It basically means choice or alternative.  The English grammar equivalent is basically "or".

Take all of the above together and it means that we are creating a new ADT that has a type name of "Device" and has three possible members (ie USB, Bluetooth, and Serial).

ADTs aren't really that useful without pattern matching, so let's take a quick look at that.

In Haskell pattern matching refers to function dispatch based off of the characteristics of the input parameters.  Typically the structure of the inputs (which constructor built them), but also the values of the inputs.  So ...

funcName 4 = 5
funcName 5 = 4
funcName x = x

So, this function returns 4 when it is called with the value 5.  5 when called with the value 4.  And finally it returns whatever value it is given when it is not 4 or 5.  You can use pattern matching on arbitrarily many parameters of your function.

funcName x _ = x + 1

The "_" means, "I don't care what this is and I'm not planning on using it".  You can use underscore for multiple parameters (reusing it isn't a problem).  

There are also different ways to indicate that you want to pattern match in your function.

funcName n
    | n == 4 = 5
    | n == 5 = 4
    | otherwise = n

So it's a return of the "|" from the ADT.  These are called guards, and they mean basically the same sort of thing.  This or that or that.

Okay, let's do something with the Device ADT.

funcName USB = ...
funcName Bluetooth = ...
funcName Serial = ...

This is very similar to a switch statement over an enum from the C programming language.

However, ADTs can be used for a lot more than as enums.

We can also create ADTs that take parameters.

data Circle = Circle Double

Notice that the type name and the data constructor have the same name.  Haskell allows this.

We can now do things like:

Circle 1.0

And then we can write functions like:

area (Circle r) = pi * r ^ 2

Notice that we're using pattern matching again, but this time we are capturing the value that is used to construct a Circle.  Underscore can also be used in this type of situation.

funcName (Circle _) = print "It's a circle"

We can also use the "type" keyword to make things a bit more readable.

type Radius = Double
data Circle = Circle Radius

It's important to keep in mind that the "type" keyword doesn't create a new type or add any extra type checking.  It's just telling the type checker that whenever the user uses the name "Radius" they really mean "Double".  The idea is that this makes it easier on people to understand what's going on.

Finally, let's also use the "newtype" keyword.

type Radius = Double
newtype Circle = Circle Radius

I'm actually not super clear on what the deal is with newtype.  Most documentation doesn't really talk about it.  However, newtype is only usable when you have an ADT with a single data constructor that takes a single parameter (so for example you would not be able to use newtype with the Device ADT that we were looking at earlier).  newtype type checks parameters just like data (so it's not an alias like type is).  The benefit over the data keyword is that newtype does not affect runtime at all.  Which sort of makes sense.  With data, the runtime environment has to keep constructor information around so that it knows how to choose the correct branch when someone tries to pattern match a value.  There's not an explicit if statement floating around, but one has to exist under the covers.  On the other hand, with newtype, the type checker will make sure that the user is using variables correctly, but since there's only a single choice there's no reason to carry around any constructor information, so the wrapper can be erased at compile time.

You might want to do something like this:

type Radius = Double
type Side = Double

newtype Circle = Circle Radius
newtype Square = Square Side

area (Circle r) = pi * r ^ 2
area (Square s) = s ^ 2

However, this will not work in Haskell.  The problem is that both equations have to belong to a single area function with a single well formed type, and the type system that Haskell uses does not support merging Circle and Square into one type like this.  However, there is a way to handle this sort of thing using type classes (which I don't want to talk about yet) or existential types or a combination of both type classes and existential types.

Okay, so we can make ADTs that take parameters which means we can do something like this:

data IntList = Cons Int IntList | Nil

Lots of new stuff here.  The "Cons" data constructor is taking two parameters, and the second parameter is of type IntList, which means this thing is recursive.

Cons 5 (Cons 4 Nil)   yields a list like (5, 4, Nil)

However, the current example means that we will need a new list type for each type of thing that we want to put into a list.  Right now we can only handle Ints.  And because we saw earlier that Haskell only allows us to have well typed functions, we'll also need different versions of functions like length and append, even though those only depend on the fact that the structure is a list and not on the internal type.


There's a logic called System F that was invented by Jean-Yves Girard.  The ideas from this logic were eventually adopted into programming languages as generics.  We are able to use System F style polymorphism with ADTs in Haskell.

We can now write our list like:

data List a = Cons a (List a) | Nil

And we can write functions that are typed "List a" and they will work on any type of list that we create.  This is called parametric polymorphism.

Everything in our polymorphic List looks the same except the "a".  This is a type parameter.  Normally in books you will see these represented as Greek letters, but it's almost always easier to type actual letters.

You can have more than one type parameter.

data Either a b = Left a | Right b

One more thing.  Haskell also has support for something that is very similar to structs from C.

data Frame a = Frame { header :: [Int]
, body :: [Int]
, crc :: [Int]
, misc :: a }

"header", "body", "crc", and "misc" are all basically fields.  Really there's no difference between this ADT and one that looks like:

data Frame a = Frame [Int] [Int] [Int] a

Except that the first one has named fields, which makes it more readable (and Haskell generates accessor functions like "header" for free).

All of these things can be combined.  And of course you can create one ADT and use it in other ADTs.  While some people say that type checking is all about having the compiler make sure you do not make mistakes, I like to think that types are about describing your problem (the type checking is just a bonus).  With all the ADT features here, there's quite a few problems that you'll be able to describe.  Add in the power of pattern matching and you will be able to expertly craft solutions.

Tuesday, July 3, 2012


I think I initially started this blog because I wanted to communicate with people better.  At the time I was feeling very low and part of the problem seemed to be that I found difficulty in expressing myself to others.  I had spent the previous eight years intently focused on the technical details of communication.  That is, what exactly are things.  The first half of that was college.  The whole point of taking classes is to get better at describing things exactly.  After all, examinations possess two tools: direct inquiries as to what certain things are, and puzzles which are easier to solve when you have a proper understanding of what a certain thing is.  The second half was the first four years of my professional career.  Honestly, that probably should have been more focused on expressing myself and communicating with my employers and coworkers, but that's not what I made it about.  I made it about understanding programming languages.  And then later, when I started to run out of new distinguishing characteristics of programming languages, I made it about understanding mathematics.  It's time for me to switch the purpose of this blog from practicing expression to using expression.

I don't particularly have anything specific I want to express.  However, I have spent a lot of time thinking about different things that are important to me.  Some of that might be useful, and I want to try and express some of it to you.

So, what's important to me?

I'm a Christian.  Unfortunately, that's quite an ambiguous term.  I believe that Jesus Christ is the son of God, that he died for our sins, and that whoever believes in him shall have everlasting life.  There's more, but this is the most important part.  I can talk about it later.

I like understanding how things work.  I like tearing things apart and comprehending their exact nature.  I've noticed that there's a couple of techniques that help me to understand how things work.

Realize that normally the way you think something works is incorrect.

Lambda calculus.  I actually kind of like to think of this as the logic of implementation.

Logic.  Logic is actually kinda big.  And it's sort of ambiguous where the boundaries are.  Classical, intuitionistic, substructural, BNF, sequent calculus, natural deduction, etc.

Typing.  Typing could actually be grouped with logic, however to me logic is simply how you ensure that typing is correctly enforced.  I think the important part is actually the ability to describe different aspects of things exactly.

Of course there's also algebras, category theory, discrete and continuous mathematics, but I'm not convinced that stuff is that useful for the way I like to understand things.  Maybe they'll show up from time to time as I figure out why I would want to use them.

I've been educated as a computer scientist, so some of that will probably show up.  And I work as a software engineer.  Also, philosophy, certain types of fiction, videogames, and programming languages.