Monday, November 19, 2012


Check out the x86-64 ABI (or I guess analogous documents, assuming they exist, for other architectures) for all sorts of pertinent information on assembly coding on x86-64 chips.  Additionally, using gcc -S blah.c will generate the assembly for your program instead of an executable.  I found this very insightful.

Here's some tips.

There are two major portions to an assembly file.  The assembly directives and the assembly opcodes.  The directives tell the assembler different things that need to be done and the opcodes are what is physically run by the CPU.  So directives will include things like .long to emit integer values into your executable and .data to indicate that the values you are emitting need to be in the data section of the executable.  The opcodes of course are more of what you expect when you think about assembly.  Move, add, call, etc.  What exactly do you want the CPU doing.

By the way, the assembler directives contain a lot of very complicated features.  It actually includes a bunch of interesting options that seem like they would be very useful if you were actually programming directly in it.  Honestly, I didn't expect to see an almost fully fledged language present when I started looking through what was available.

Beware of the .align directive.  As far as I can tell the behavior of this directive is very dependent on the set of circumstances that it is used in.  Not only is it system dependent, but it is also dependent on what type of executable you are generating.  There are two different options:  .balign and .p2align.  These directives each represent the possible behaviors that you will see with .align except they always have the same behavior.  However, supposedly they are not supported on every assembler.

For calling conventions check out the ABI document.  Reverse engineering this by hand isn't too hard (just gcc -S a function with a bunch of parameters), but the ABI made things nice and explicit.  Also, notice that variable parameter functions need to have the number of floating point parameters specified by placing that number into the %rax register.  

You might notice that some assembly code uses labels like this:

leaq    wocky(%rip), %rdi

And some assembly code looks like this:

leaq    wocky, %rdi

I think this is basically equivalent as far as what's actually going to happen when you run the code.  The difference is that one of them is position independent code and the other is position dependent code.  

Dereferencing an address held in a register is done like this:


If you need an offset from that address you can do something like:


If you need to load an address to a location, use the leaq assembly instruction (load effective address).

leaq    wocky(%rip), %rdi

This is how you could implement something like the & operator (address, not bitwise and) in the C programming language.

Calling or jumping to a function pointer is done by prefixing the value with a *.  Kind of like the following:

call    *%rax
jmp    *%rax

Or if the address is somewhere on your stack:

call    *-8(%rbp)

Normally your functions will look something like this:
.globl wocky
    pushq    %rbp
    movq     %rsp, %rbp
    subq       $N, %rsp
    ;;; code

"subq    $N, %rsp", is making room on the stack for the local variables inside of the wocky function.  Obviously, you don't need to do this if you don't have any local variables, but it is also unnecessary if you are optimizing your assembly to have tail calls (and you are in a function that can be optimized to be a tail call ... and actually *is* being optimized to be a tail call) or if your function doesn't call any other functions (no reason to save stack space if no one else is going to be stomping on it).

As for me, I'm probably going to play around with assembly and perhaps implement a toy language or two.  However, I'm not sure that I'm really in the mood to try to play catch up with the likes of Mike Pall (check out luajit ... it's kind of frightening what he's doing with it) or the pypy team or the JVM team.  A technology like LLVM actually sounds really attractive to me because I can get low level implementation details (they support tail calls!) on multiple platforms and the whole thing is supported by people smarter and more dedicated than me.  So I get can better, faster, more correct, without having to spend a *lot* of time focusing on hardware details.

Either way though, I feel that poking around in assembly is still useful even in our modern era of managed languages and ultra IDEs.  I think understanding how the universe in which we work functions helps us to avoid certain mistakes and fully utilise the capabilities available to us.

Thursday, November 15, 2012

Haskell Types and Type Classes (4): Misc

Here's a few extra extensions to consider.

The following are legal:

class Jabber a where
    wocky :: a -> a

instance Jabber Int where
    wocky = undefined

instance Jabber [a] where
    wocky = undefined

This one isn't:

instance Jabber [Int] where
    wocky = undefined

Unless you are using the following:

{-# LANGUAGE FlexibleInstances #-}

A similar extension is

{-# LANGUAGE TypeSynonymInstances #-}

which let's you do something like:

type Ball = [Int]

instance Jabber Ball where
    wocky = undefined

This is kind of important because String is just [Char]; you will need it if you want to specialize on String without making some sort of silly proxy type.

{-# LANGUAGE TypeSynonymInstances #-}

instance Jabber String where
    wocky = undefined

Okay, remember this guy?

data SEither a b = SLeft a | SRight b

class Jabber a where
    wocky :: a b -> a b

instance Jabber (SEither e) where
    wocky = undefined

Well, you're going to need FlexibleInstances in order to do something like the following:

instance Jabber (SEither Int) where
    wocky = undefined

and also for the more exotic form:

data SMaybe a = SJust a | SNothing

instance Jabber (SEither (SMaybe a)) where
    wocky = undefined

Finally, last time I mentioned functional dependencies.  There's an alternative to them called Type Families.  I'm not going to go into very much detail about them because I haven't fully wrapped my head around them yet.  However, if you looked at functional dependencies and thought, "yeah that's the functionality that I want, but I really don't agree with the philosophy", then check out type families.  They seem similar to generalized algebraic data types (GADTs) at least philosophically, so if you like GADTs they might be the thing for you.

Wednesday, November 7, 2012

Haskell Types and Type Classes (3): Multiple Parameters and Dependencies

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}

Again this stuff isn't completely based off of the Haskell spec.  It's some stuff that was thrown into GHC and other Haskell implementations.

Multiple parameter type classes *look* similar to higher kinded type classes, but they are actually quite different.

class Jabber a where    -- a has the kind * -> *
    wocky :: a b -> a c

class Jabber a b where   -- a and b both have the kind *
    wocky :: a -> b

They kind of look the same ... they both have a bunch of letters that show up all over the place.  But once you have a good feeling for either parametric polymorphism and kinds or Haskell's type class system, you'll notice that they are doing different things.

And of course you can go crazy with the kinds of the variables in a multi param type class.

class Jabber a b where  -- a and b have kind * -> *
    wocky :: a c -> b c 

class Jabber a b where -- a has kind * -> * and b has kind *
    wocky :: a b -> a c

class Jabber a b c where  -- three type variables all with kind *
    wocky :: a -> b -> c

This is basically the power of function overloading that you see in languages like C++, C#, etc.  But of course there's still a lot of weird thing that you can't quite do, even though they make sense, *AND* then there's a bunch of compiler flags that allow you to do those things that made sense but you couldn't do.  I'm not even sure I really want to go into all of it.  I've browsed through the options before ... and some of them seem pretty straight forward, but on the other hand I can't make sense out of the others.  I suspect that if you just blaze forward with reckless disregard, you'll be able to pick up what you need to know as you find out you really need to know it.  As for me, I suspect I'll be back to figure it out once I run out of other things to figure out ... but for now I'm not sure I need to know the difference between overlapping, incoherent, and undecidable instances (although right after saying something like that is normally when you find out how important that stuff you rationalized as irrelevant actually turns out to be).

One additional compiler extension that I found myself spending the time to understand was functional dependencies.  They look kind of like this:

class Jabber a b | a -> b where
    wocky :: a -> b

So what is that horrible "|" all about and why would anyone care about it?  Well, once you can have multiple type parameters in a type class you get some funny possible outcomes.  Namely something like this can happen:

class Jabber a b where
    wocky :: a -> b

data Ack = Ack
    deriving Show
data Ball = Ball 
    deriving Show

instance Jabber Ack Ball where
    wocky a = Ball 

instance Jabber Ack Ack where
    wocky a = Ack 

It doesn't look so bad.  And in fact it can compile.  But if you run some code like:

wocky Ack

You will get a runtime error.  Which is kind of weird to think about Haskell doing, but it is a possibility in some cases.  The problem is that Haskell is unable to decide which instance of Jabber to use.  Notice that there is actually return type function overloading going on here (something that you do not see in languages like C++ or C# ... at least not yet).  Which means that if the compiler can tell what the result of "wocky Ack" is being stored in it would be able to select the correct function to call.

(wocky Ack) :: Ball

Ah, that's better.  No more runtime error.  However, perhaps you want to be able to prevent this kind of thing from happening at compile time.  After all, maybe the two instances are created in completely different files and a completely innocent accident is about cause your deployed *Haskell* code to produce a runtime error!!!

Well that's functional dependencies.  The following code refuses to compile.

class Jabber a b | a -> b where
    wocky :: a -> b

instance Jabber Ack Ball where
    wocky = undefined

instance Jabber Ack Ack where
    wocky = undefined

So the functional dependencies is basically a constraint on what instances are allowed to be defined for a given type class.  More or less it's preventing conflicting or redundant instances.

By the way, this shows up in some of the monads out there.  That doesn't mean you need to learn about it now, but keep it in mind for when you're looking at a monad definition and you see a bunch of "|"s popping up in disturbing places.

Saturday, October 27, 2012

Haskell Types and Type Classes (2.5)

In category theory a category is composed of objects and morphisms between the objects.  A functor is a mapping from one category to another category.  There are two parts to every functor a mapping for the objects in a category and a mapping for the morphisms.

Category theory works in Haskell because Haskell is composed of pure functions.  The Hask category is composed of types (the objects) and functions (the morphisms between the types).  The Functor type class in Haskell isn't exactly a true category theory functor.  The Functor type class only maps functions to functions (leaving out the types).  Additionally, it really only applies to one kind of functor ... like a functor from Hask to some subset of Hask.  So compared to Category theory functor it's kind of lame, but compared to the alternative of not having it it's kind of awesome.

class KFunctor f where
    kfmap :: (a -> b) -> f a -> f b

data KEither a b = KLeft a | KRight b
    deriving Show

data KMaybe a = KJust a | KNothing
    deriving Show

instance KFunctor KMaybe where
    kfmap _ KNothing = KNothing
    kfmap f (KJust a) = KJust (f a) 

instance KFunctor (KEither e) where
    kfmap _ (KLeft e) = KLeft e
    kfmap f (KRight r) = KRight (f r) 

jabber x = x + 5

kfmap jabber $ KNothing -> KNothing
kfmap jabber $ KJust 5 -> KJust 10
kfmap jabber $ KLeft "message" -> KLeft "message"
kfmap jabber $ KRight 5 -> KRight 10

The KMaybe KFunctor is relatively straightforward (at least as far as these things go).  The (KEither e) KFunctor on the other hand isn't so obvious.  Writing out the code for (KEither e) KFunctor helped me a lot in figuring out what was actually going on.  So again remember that KFunctor can also be defined like so:

class KFunctor (f :: * -> *) where

And that KEither is going to have a kind of * -> * -> *, but if we curry it, then we will get the desired kind for the KFunctor ... ie (KEither e) :: * -> *.  Now it personally bothered me that I had no idea about what the "e" was in the (KEither e).  But really the "e" doesn't actually matter because it doesn't show up in the type signature of kfmap ( ie (a -> b ) -> f a - > f b ).  The consequence is that the function referenced the kfmap ( ie (a -> b) ) will never have any way of dealing with the "e".  Which is totally fine depending on what usage you want to get out of it.  In functor's case we are more or less using it in the same way as the KMaybe is being used ... the only difference is that instead of KNothing you can actually have some sort of error message.  For a more complicated usage of KEither, KFunctor won't do the trick and you'll have to use something else.

Thursday, October 25, 2012

Haskell Types and Type Classes (2) ... Kinda

What would the following type mean?

data Ikky a b = Ikky a b

Well that would be a type Ikky that can be parameterized with two different sub types.  So ...

Ikky Int Int
Ikky [Int] Float
Ikky (Ikky Int) [Int]

These are all valid instances of Ikky.  Now what about this:

data Ikky a b = Ikky (a b)

Today's topic is about Kinds ... the Type of a type.

The appropriate GHC compiler directive is:

{-# LANGUAGE KindSignatures #-}

With it we can rewrite "data Ikky a b = Ikky (a b)" into:

data Ikky (a :: * -> *) b = Ikky (a b)

Let's talk about Kinds in a bit more detail.

So most types we are aware of have a Kind of *.  * means that it is a normal type, so ...

Maybe Int

Are all of kind *.  However, how might we specify something like ...


Well, we can sort of think about it as a function that takes a type as a parameter and returns another type as the result.  So * -> *.  This can be expanded into * -> * -> * and so on and so forth.  Also there's (* -> *) -> * type of kinds which are similar conceptually to functions like map ... they take a type "function" that goes from one type to another and they apply it to something, and then yield a new type.  And of course there's also * -> (* -> *), but whatever, it's more or less the same as normal function types except with types as the parameters.

Although it's not exactly the same.  In normal function types we see things like a -> a.  Here we just have a single *.  Doesn't really leave much room for parametric polymorphism on the type level.  And that's because it doesn't allow parametric polymorphism on the type level.  Why not?  Well there's not mathematical reason why you can't do it.  However for whatever reason they decided not to include polymorphic kinds or even user defined kinds in Haskell (actually I think recent GHC releases have introduced polymorphic kinds and user defined kinds ... but so far I haven't gone through the work of understanding their syntax, so let's just focus on normal kinds for now).  

Okay, so data Ikky (a :: * -> *) b = Ikky (a b) would be use like ...

x = Ikky (Just 5)

And "x" would have a type like ...

x :: Ikky (Maybe Int)

I'm honestly not really sure why anyone would ever want to do that sort of thing, but I've probably just not encountered any problems that are best structured in such a manner.  There are some type classes that make use of this sort of thing, so it's probably more interesting to talk about how kinds are used with type classes, but first consider the following:

x = Ikky (Left 5)

Now Left belongs to the Either type.

data Either a b = Left a | Right b

Just specifying "Left 5" let's us know that we're dealing with "Either Int b", but it doesn't tell us what the "b" is.  However, that's exactly what the Ikky type is all about.

So ...

x = Ikky (Left 5)
x :: Ikky (Either Int) b

And you might say, "but what can we do with the 'b'".  And the answer is nothing.  We don't know what concrete type is it, but we can put it into a container, other type constructors, or return it.  If that doesn't make much sense don't worry about it too much.  Just keep it in mind if you ever take a look at the Functor type class and see it applied to "Either e" or something like that.

Here's some type classes.

class Jabber1 (a :: *) where
    wocky1 :: a -> a

class Jabber2 (j :: * -> *) where
    wocky2 :: j a -> j b

class Jabber3 (j :: * -> * -> *) where
    wocky3 :: j a b -> j c d

data Ikky (a :: * -> *) = Ikky (a Int) 
data Ikky2 b (a :: * -> *) = Ikky2 (a b)

data Neep a = Neep a

class Jabber4 (j :: (* -> *) -> *) where
    wocky4 :: j a -> j a 

instance Jabber4 Ikky where
    wocky4 = undefined

x (Ikky Nothing) = undefined
x (Ikky (Just x)) = undefined 

instance Jabber4 (Ikky2 x) where
    wocky4 = undefined

All of that compiles.  The important thing to do in order to get the hang of this is to sit down until you're confident that you get the hang of what's going on.

However, there's the gist. 

class Jabber a where
    wocky :: a -> a

This means that you need to hand the Jabber type class a nice first order type and the wocky function will map from that type to that type.

data Neep = A | B | C
data Blah a = Blah a

instance Jabber Neep where
    wocky a = undefined

instance Jabber (Blah Int) where
    wocky a = undefined

Now, then there are type classes like this:

class Jabber j where
    wocky :: j a -> j b

which can be more clearly written like ...

class Jabber (j :: * -> *) where
    wocky :: j a -> j b

This type class demands a higher order type.  So the "data Neep = A | B | C" isn't valid any more, but the "data Blah a = Blah a" is valid as long as you leave off what type the "a" is.

instance Jabber Blah where
    wocky a = undefined

This is a fundamentally different function from the earlier wocky.  The first wocky was moving some type to the same type.  This new higher order typed wocky is moving one type to another type, but within some other container type.  And of course you can make those kind signatures really complex, so the story only get's more complicated.

So the first question might be, "why do all of this?"  And the answer is probably along the lines of, "this is the type of things that mathematicians like to dream up."  However, "why do this in Haskell ... it seems really complex and abstract", seems like a much more pragmatic question, and the answer is unfortunately, "when you want to do complex and abstract things."  Originally, you probably could have gotten away with not thinking about this and using Haskell ... but at some point someone squinted their eyes and said, "hey Haskell sort of looks like a category", and the next thing you know there's a mountain of category theory concepts that are a core part of Haskell.  There are certain Monads and Functors which I could not make sense of at all until I went through and figured out what was going on at the Kind level.

Now personally, I like the idea of kinds a lot more than sub-typing.  Sub-typing is of course the thing that most programmers are familiar with ... or at least most programmers are familiar with sub-classing and basically they sometimes use it to handle situations where you want sub-typing.  Unfortunately, few languages have any sort of useful kind stuff going on, so I haven't really been able to experiment with it very much.  However, GHC has some sort of polymorphic kind support now, and I've gotten a renewed interest in assembly.  So I can stare at what Haskell is doing with kinds to see what I think about it.  Or, maybe ... I can get some sort of rudimentary programming language up and running and experiment with type systems on my own.

Thursday, October 4, 2012

Haskell Types and Type Classes (1)

Let's pretend we have a simply typed language.  We can begin by populating this language with some primitive types.


This is a good start, but we might want to allow the user to create some types.

data ? = ?' <something>* (| ?'' <something>* )*

Okay, I'm abusing several notations pretty badly ... let me explain what we have here.

data is a keyword that tells the compiler that the user is defining a new type.

? is a place holder for whatever the new type name is going to be ... the user needs to fill this in for it to compile.

?' and ?'' (and even ?''' etc) are different constructors for this type.  We could just construct the type with a function by the same name as the type, but that's less interesting than letting the user provide many different possible constructor names.  Specifically, because some of the constructor names might take different parameters.

<something> is a preexisting type.  So we might see something like:

data Circle = Circle Float
data VennDiagram = Venn Circle Circle

The * on <something> indicates that we might have multiple somethings (or maybe zero) *and* they might be different somethings.  In the above example we have a constructor "Venn" that takes two circles as parameters.  

The ( ... )* indicates that we might have multiple constructors after the first one or we might have zero more constructs; each one following the same rules as the initial one.

Now we have primitive types and user defined types ... what about the types of functions?

<something> -> <something>

Okay, maybe an example that looks a little bit less pointy.

Int -> Int

Okay, cool.  This is a function that takes an Int and returns another Int.  So maybe negate or absolute value or something.  However we can make things a little bit more complex.

Int -> Int
Int -> Int -> Int
Int -> (Int -> Int) -> Int

There's a mathematical process known as currying which basically states that a function that takes multiple parameters is equivalent to a function that takes a parameter and then returns a function that takes the next parameter and so on.

λ x y z . x y z <=> λ x . λ y . λ z . x y z

Additionally, the arrow (->) is right associative which means that

Int -> Int -> Int <=> Int -> (Int -> Int)

And the upshot of that is that if you want to write a function type that indicates you take a function as a parameter you need parens ... so

Int -> (Int -> Int) -> Int 

That function type indicates a function that takes an integer, then a function that takes an integer and returns an integer, and finally returns an integer.

The simply typed language works, but it leaves something to be desired.  If you want an array or a list or some other container, then you have to make it by hand ... once for each data type you want to use it with.  Hardly ideal.

So how about parametric polymorphism.  That means that instead of writing new data types that take specific existing data types, we get to write new data types that take arbitrary data types.

data List a = Cons a (List a) | Nil

The 'a' can be any data type ... this allows us to write functions that can operate over a List of any type.

List a -> Int

So that's probably a function that returns the number of items in the list.  And we can also do things like this:

List Int -> Int

This one is probably a function that returns the sum or product of all the Ints in a List of Ints.  This can also be generalized some.

List a -> ( a -> b ) -> List b

This function takes a List of as and a function that can convert an a into a b ... err and a function that can convert an "a" into a "b" and then returns a list of "b"s.  This is called map or select in many different languages. 

What if we do something like this:

Pair a b -> (c -> c) -> Pair a b

This function takes a Pair type which contains one item of type 'a' and one item of type 'b' and then it takes a function from 'c' to 'c' and finally returns another Pair (after running the function over both the items in the Pair).  It seems about right, but it won't work in Haskell because the type checker will assume that 'c' is equal to 'a' (or actually 'b' if the first thing you do in the function is try to convert the second item in the Pair).  The type checker will then signal an error because you try to feed a function that takes an 'a' an item that has type 'b'.

To get around this you have to do something like this:

forall c . Pair a b -> ( c -> c ) -> Pair a b

This invokes Universal Quantification, which allows the type checker to realize that even though 'c' is going to be equal to 'a' it can also be equal to 'b'.  The whole point of going to parametric polymorphism was to allow functions that take arbitrary types after all, but the type checking algorithm has to be different if you're going to allow this type of higher order function type checking.  In the Glasgow Haskell Compiler (GHC), this option has to be activated using the following pragma:

{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE RankNTypes #-}

Alternatively, you can also activate this compiler extension by using a command line flag.  

Other Haskell compilers might not support this option ... I don't really keep up with what all of them support, so you'll have to check in on it if you're looking into a compiler other than the Glasgow one.

So, parametric polymorphism is pretty nice, but sometimes we just want to do something specific when we have a specific type.  This is where type classes come in.

class X a where
    methodName :: a -> a

instance X Circle where
    methodName a = a

instance X Square where
    methodName a = a

(X a) => a -> a

The class introduces a type class.

The instance introduces a specific instance of this type class.

The (X a) => ... indicates that you have a function that can take a parameter of any 'a' as long as that 'a' also is an instance of the type class 'X'.

Okay, so I'm basically going to leave off here because things start to get really complicated.  Type classes bring up the issue of kinds (the type of your types).  And they also bring up the issue of "what if I want to do ..." to which the answer is "you can't ... unless  you activate this compiler extension".  As far as I can tell the Haskell type classes in the standard are actually pretty simple (but allow powerful things to happen).  I haven't actually read the standard, but I'm just kind of inferring based off of all the compiler extensions GHC has.

The story that happens again and again with type systems is along the lines of, "why doesn't this work?"  And the answer is usually, "yes that makes perfect sense, but we didn't write the algorithm that way because it's hard or impossible or we wanted it to be easy for other implementations to match our specification or etc."  With Haskell it looks like they wanted to make the type system interesting and powerful, but also relatively easy to implement so that there would be lots of different implementations.  With GHC it looks like they took the liberty to implement a bunch of other interesting but difficult features on top of what already exists within Haskell.

I hope to cover a bunch of these extensions later (starting with what's available with type classes).  And yes this means that I'm going to be bumping into monads eventually ... although I'm going to try to avoid yet another "monads are really like ..." post.  The internet already has enough of those and I actually wanted to start a series like that before I realized what I had done and pulled the plug.  The gist, btw, is that monads are an abstract algebraic concept that is sort of related to Haskell because Haskell is only pure functions.  If you want to learn what a monad is, learn category theory.  If you want to learn how to use monads in Haskell, read a bunch of blog posts.

Wednesday, October 3, 2012

Dynamic vs static ... vs system f vs prototype inheritance vs macros vs eval vs ...

It's general knowledge in the programming world that there are two major types of programming languages.  Dynamically typed programming languages and statically typed programming languages.

It used to seem like there was an ongoing argument about which type of language was better.  However, recently it seems to me that no one really talks about this anymore.  People tend to use dynamic languages for some tasks and static languages for other tasks.  

A dynamically typed language will tend to look something like this:

function add( a, b )
    return a + b

class A
    function +( o )
        return o + _value

class B
    function +( o )
        return 5 + _value

add(, )

While a statically typed language will tend to look something like this:

int add( IAddable a, IAddable b )
    return a.add( b );

class A : IAddable

class B : IAddable

add( new A(), new B() );

They basically look the same, but the major differentiator is that a statically typed language will make you classify your constructs ahead of time, preventing your program from compiling if the classifications do not match up correctly, whereas a dynamically typed language will allow your program to run without checking your classifications, shutting down the program if a construct is asked to do an operation that it is unable to perform.

There are of course a bunch of different variations that make things even more complex.  The picture I have painted in the code above tells a story about writing a compiler with the machine in mind.  Classes are memory layout specifications, objects are pointers to memory slots, and functions are basically dereferences.  In this story there really isn't any difference between static and dynamic languages.  The static language is more verbose, but it protects you from simple errors and it gives compiler mandated hints to programers as to what is going on.  The dynamic language is more permissive, but it does nothing to prevent trivial failures and is a maze symbols without meaning.

I think things get much more interesting when you look at the variations.  Smalltalk, Self, and Lua all tell a story about ideal objects and messages that get delivered to them.  There's really no such thing as an operation on an object ... it's more like asking an entity "leche" and seeing if it understands what to do.  Or seeing what it will do.  On the other hand, ML and Haskell tell a story about mathematical rigor and functions.  You are no longer performing operations on anything really ... you're evaluating functions much in the same way as if you were performing it with pen and paper instead of silicon and electrons.

Finally, you add in "other features" and see how people begin to become weary of the process.  The familiar slots and memory addresses melt away and much more complex questions arise.  Metaobjects allow objects to decide in a turing complete way how they will react to your program itself.  You thought you knew what subclassing meant ... let's see how else we might define that.  Macros and syntax extensions change the very meaning of programming language at compile time ... what is the best language to solve *this* problem.  What if we define data types not by their memory slots, but model them after algebra?  What happens if we try to generalize that process?  If all our functions are pure, doesn't that mean our language forms a category?  I wonder if that allows us to do anything useful.  Maybe we want to find a loop hole in the halting problem, well there's a solution to that pesky mathematical proof depending on how hard you are willing to work.  Or maybe you want to go the other way and evaluate arbitrary code after your program is already finished.  

When it comes to most of the programers I have interacted with, they seem to have a pretty good grounding with memory slots and operations upon them.  And, for the most part, this seems sufficient to create software that can be sold.  However, I feel a bit weary of trying to model more and more complex problems by trying to think about how to fit it in a computer.  I would much rather be thinking about how to get the computer to fit my problem.

Sunday, July 15, 2012

Lambda calculus preference

So, why do I like lambda calculus?  Actually, that's probably kind of a hard question to answer, and if I went through the trouble I'm not sure that the process would really be beneficial for anyone.  I like it because I do?  Whatever.

Why might lambda calculus be useful?

First thing first.  In lambda calculus nothing comes from nowhere.  This is actually true in other logics as well, although the effect isn't as pronounced.  For example, in natural deduction you have axioms, which are propositions which are just assumed to be true.  They're stated with no proofs.  They seem to be coming from nowhere.  However, that's not really what a natural deduction proof means when you encounter an axiom.  What it actually means is that if you happen to be loitering around in the world and you encounter a proposition A and an inference rule A -> B, then you can assume the proposition B.  The axioms aren't things that come from nowhere, it's more like a pattern matching transformer.  You see this stuff you can get that stuff.

As far as I can tell, with most logics you don't care about the internal structures of your axioms.  So it's easier to pretend that they come from nowhere and get on with the proof.  However, with lambda calculus you do have to care somewhat about the structure of the parameters of your functions.  If you misuse a parameter, your computation could yield a totally worthless value or it might not even be possible to finish your computation.

This should mean that lambda calculus is a good medium to use in order to think about and understand causality and causal machines.  So, for example, computers are causal machines.  Does someone want you to do something with a computer?  Well does it seem like it would be possible with lambda calculus?  Then it should be possible with a computer.

More pragmatically, if you really like lambda calculus, then you won't use singletons unless it's actually happens to be a good idea for your situation.  

Secondly, lambda calculus can only have immutable data structures.  This is similar to the previous point.  If you're in a language that allows mutable data structures and references ( ie most of them and all industry languages) then a mutable data structure can be used as a hidden communication channel.  And maybe there's some situations where you want to be able to enact change to another part of your program without actually going over there and enacting change.  But most of the time this leads to difficult or impossible to understand code and bugs.  If you really like lambda calculus then you won't tend to use hidden channels of communication.

Of course you have to be careful with how you try to apply lambda calculus to programming.  Lambda calculus is turing complete (in fact it was turing complete before it was cool).  This means that you can compute any function, it does not mean you can implement any algorithm.  For example, lambda calculus doesn't have IO.  Which means you can never communicate with anything.  If you have a problem which requires you to monitor the state of some external device, lambda calculus is probably not the best tool for the job.  If your environment doesn't have garbage collection, maybe you should pass on lambda calculus.  

Overall though, today's boring industry jobs are favoring more and more environments that are garbage collected and provide first class anonymous functions.  Best practices are starting to converge on immutable data structures, pass by value semantics, and favors some type of dependency injection over some types of singletons.  I think this meshes close enough to lambda calculus to make it a useful tool at the hands of developers.

Lambda calculus

Lambda calculus is pretty simple.

λ x . x


(λ x . x) x

x x

(λ x . x x) x

That's more or less all of the types of things that you are likely to see in lambda calculus.  

Functions are the only thing in lambda calculus.  You can create functions with λ and you can call functions.  Abstraction and application.

λ x . x x 
(λ x . x x) x

Function application is left associative.  So "x x" is the function 'x' taking itself as a parameter.  "x y z" can be rewritten "(x y) z", meaning that 'x' takes 'y' as a parameter and returns another function which takes 'z' as a parameter.  If you want things to evaluate from the right you have to use parentheses.  "x (y z)", the function 'y' takes the function 'z' as a parameter and then the result is passed to the function 'x'.

Functions in lambda calculus only have one parameter, but multiple parameters can be simulated by returning multiple functions from the original one (see currying for more information about this type of thing).

(λ x . λ y . x y)

It's not a big deal to just create functions that take multiple parameter and then just assume that the real deal uses currying to achieve the same effect.

(λ x y z . x z ( y z ) )

To calculate function application you just replace the variable with the function that was defined earlier with λ.

(λ x . λ y . x y) (λ z . z) ( λ w . w )

= ( λ y . ( λ z . z ) y ) ( λ w . w )
= ( λ z . z ) ( λ w . w )
= ( λ w . w )

What happens if some of the variable names of one of the functions you are inserting is already used in your function?

( λ x y . x y ) ( λ y . y ) ( λ w . w )
= ( λ y . ( λ y . y ) y ) ... um ...

This doesn't seem right because you will get a different answer to this equation than if the second function had variable names inside of it that were different than 'y'.  This is solved with alpha conversion, which basically says that you can rename the names of function variables so that they don't conflict with another function's variables.

( λ x y . x y ) ( λ y . y ) ( λ w . w )
= ( λ y . ( λ y . y ) y ) ... quick alpha conversion 
= ( λ y . ( λ z . z ) y ) ( λ w . w )

There's also some other rules that involve greek letters.  Beta reduction is function application by rewriting a variable with its parameter value.  This is just function application that we covered before.

There's also eta conversion, which says that two equations are equivalent if their forms reduce to the same value.  This comes in handy if you want to lazily or conditionally evaluate an equation.

( λ x . z x ) = z

Also of note is the y combinator.  So far I haven't mentioned any way to globally name a function, and that's because there is no way to globally name a function.  This is a problem because it means that you can't have recursive functions because the function cannot call itself because there is no global name to call.  Lambda calculus doesn't have any other type of iteration, so things suddenly become kind of worthless unless we can find a way to pass a function into itself, so that it can call itself.

Y = λ f . ( λ x . f (x x) ) ( λ x . f (x x) )

The y combinator doesn't really work in languages that evaluate all of the parameters of a function before evaluating the function itself (however it works fine in lazy evaluated or if you are manually rewriting the function on paper).

Z = λ f . ( λ x . f ( λ y . x x y ) ) ( λ x . f ( λ y . x x y ) )

We just use the eta conversion trick we learned earlier to rewrite the y combinator to be an equivalent function, but lazily evaluate parameters fed to 'f'.  I've seen this called the z combinator in a couple of places.

Numbers can be represented with functions, called Church Numerals (Alonzo Church invented lambda calculus).

Zero = λ f . λ v . v
One = λ f . λ v . f v
Two = λ f . λ v . f ( f v )
Ten = λ f . λ v . f ( f ( f ( f ( f ( f ( f ( f ( f ( f v ) ) ) ) ) ) ) ) )

This is the same idea behind the peano axioms.  However, instead of having a number which is just a blob of data, these numbers represent calling a function on a value some number of times.  In other words the number ten is like a loop which runs ten times.

Here's an increment function:

Inc = λ number . λ f . λ v . f ( number f v )

The intuition is pretty straight forward here.  Take a number and then take the function and value that the number would have taken.  Then run the function with the original f and v.  Finally, run the result one more time through f.  Conceptually it's like you've added one more f to the series of fs that originally created the number.

After that addition isn't too difficult.

Add = λ number1 . λ number2 . number1 Inc number2

Here I'm just inserting the 'Inc' function's name instead of writing the whole thing out.  This isn't legal lambda calculus, but it is a good short cut to make the equation recognizable.  I could just replace 'Inc' with the equation I wrote above to make my Add equation legal.

But anyway, what's going on with Add?  Well, I'm running the Inc function number1 times on the value number2.  In other words, I'm adding one to the second number a number of times equal to the first number.  Let's try it with normal numbers to get the intuition.

5 + 4
= 4 + 1 + 1 + 1 + 1 + 1
= 5 + 1 + 1 + 1 + 1
= 6 + 1 + 1 + 1
= 7 + 1 + 1 
= 8 + 1 
= 9

Okay, looks good.

The decrement function is pretty crazy.  I had to look it up in order to figure it out, and even then I had to think about it for a very long time to convince myself about what it was doing and that it made sense.  There's a couple of different ways to do this, I'm picking the one I first got the hang of.

Dec = λ number . λ f . λ v . number ( (λ a . λ b . b ( a f )) (λ u . v) (λ x . x) )

The best way to figure it out is probably just to run through it... Which I'm not going to do right now because it's going to take forever.  I'll try to write a simple lambda calculus machine sometime that works via parameter rewriting and saves all of its steps.  That way I can just implement decrement and then copy and paste the output.  

There's also a bunch of other stuff that goes on with lambda calculus.  However, I don't appear to be that interested in it.  At least I like to know how the machine works, but I don't like to know the endless commentary on weird patterns that show up when the machine is actually working.  Sometimes that sort of thing is good to know because it will save you from spending your time figuring out things other people have already figured out (some of that stuff takes hundred of years for people to figure out ... well not for lambda calculus because it hasn't been around that long, but things like Fermat's last theorem come to mind).  However, most of the time it seems like reading things other people have already figured out is boring because it's like reading about how someone else had a very fun time of solving a sudoku puzzle.  I want to be the one solving the sudoku puzzle, not reading about how someone else noticed that 3 was already used in this box so the correct number had to be 9.