Monday, May 6, 2013

Industry Languages VS Super Languages (part 2)

If you want some insight into my motivation, mood, and personality when it comes to the difference between the industry standard languages like Java and the esoteric languages that commonly get passed up in corporate environments like Lisp or Haskell, then this post has a part 1.  Go check that out.  However, if you're strapped for time or just want to get to the point and pass judgment, feel free to start here.

I already mentioned the classic objective measure for programming languages.  Is it turing complete or is it sub-turing complete?  With a turing complete language, we know how to create functions that can perform every calculation that we know how to perform.  With sub-turing complete languages, we can still create functions that can perform calculations, but we can't create all of the functions that can be created in a fully turing complete language.  
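To make the sub-turing idea a little more concrete, here is a minimal sketch of a toy language that is deliberately not turing complete. Everything in it (the `run` function, the tuple-based program format, the operation names) is made up for illustration. The only loop it offers repeats a fixed number of times that is written into the program itself, so every program is guaranteed to finish. The price is that you cannot write "keep going until you find an answer."

```python
# A toy sub-turing-complete language (hypothetical, for illustration only).
# Programs are nested tuples. The only loop form repeats a constant number
# of times, so every program is guaranteed to terminate.

def run(program, x):
    """Evaluate `program` against the starting value `x`."""
    op = program[0]
    if op == "add":            # ("add", n): add a constant
        return x + program[1]
    if op == "mul":            # ("mul", n): multiply by a constant
        return x * program[1]
    if op == "seq":            # ("seq", p1, p2, ...): run steps in order
        for step in program[1:]:
            x = run(step, x)
        return x
    if op == "repeat":         # ("repeat", count, body): bounded loop
        for _ in range(program[1]):
            x = run(program[2], x)
        return x
    raise ValueError(f"unknown operation: {op!r}")


# Double the input three times, then add one.
double_three_times_plus_one = ("seq", ("repeat", 3, ("mul", 2)), ("add", 1))
print(run(double_three_times_plus_one, 5))  # prints 41
```

There is no way to express an unbounded search in this language, which is exactly the kind of function a turing complete language can create and this one cannot.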

I want to add one more idea.  Fixed functionality.  Think about it this way.  If a turing complete language can create a light, it can create lights that can do anything that is doable.  Red lights, blinking lights, moving lights, whatever.  Well, sub-turing languages can create lights as well.  They can't do as much as the turing complete language can, but perhaps they can make red lights and blinking lights.  The fixed functionality language can't even be called a language.  It's either there or it's not there.  If you use it you get a light, but you have no control over the light.  It's a red light that's always on.  If you don't use it, it's not there.

Now, fixed functionality isn't necessarily bad.  The downside is pretty obvious.  If you want a green blinking swaying light, then you are out of luck.  You only get red lights with fixed functionality.  But on the other hand, if you happen to need a red light and you don't want to think about it very hard, then you get yourself a red light.  You don't have to tell the language to make a light, and then make it red, and then make it stationary, and then make it constant.  You just say LIGHT.  And you get exactly what you wanted.  So there's a sort of cognitive simplicity provided.
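The trade-off can be sketched in a few lines. The `Light` type and both function names below are hypothetical, invented only to mirror the light analogy: `light()` is the fixed version where you just say LIGHT, and `custom_light(...)` is the version where every decision is yours to make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Light:
    color: str
    blinking: bool
    moving: bool


def light():
    """Fixed functionality: say LIGHT, get a steady red light, nothing to decide."""
    return Light(color="red", blinking=False, moving=False)


def custom_light(color, blinking, moving):
    """Configurable version: more control, but every choice must be spelled out."""
    return Light(color=color, blinking=blinking, moving=moving)


print(light())
print(custom_light("green", blinking=True, moving=True))
```

The fixed version can never give you the green blinking swaying light, but it also never asks you a question.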

So now we can qualify languages as turing complete, sub-turing complete, and fixed.

Before we go on to differentiating languages, let's talk about some of the features in languages.  Well, first of all, every language is parsed.  That is, you provide some sort of textual input to the computer that humans can understand, and the computer turns the text into a bunch of bytes that the CPU can understand.  Also there are keywords.  These are specific special words in the language that have a single meaning to the computer.  And finally, let's say function dispatch.  So normally, if you call a function, then you get a very specific set of calculations.  Sometimes, though, you can get one of several different sets of calculations depending on the application of some rules.
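As a rough illustration of function dispatch, here is a small sketch using made-up `Shape` classes. The call `shape.area()` looks the same every time, but which set of calculations actually runs depends on a rule (here, the class of the object).

```python
class Shape:
    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def describe(shape):
    # One call site, several possible calculations: the object's class
    # decides which `area` runs.
    return f"{type(shape).__name__} with area {shape.area()}"


print(describe(Square(2)))
print(describe(Rectangle(2, 3)))
```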

Okay, so as far as a language as a whole goes, it doesn't make a lot of sense to have a fixed language.  That's more of a light switch than something that you would program with.  But what about those features I just mentioned?  Well, in Java the parsing is fixed, the keywords are fixed, and the function dispatch is sub-turing complete (interfaces and inheritance allow a non-fixed decision as to which function actually gets called).  The C programming language has a sub-turing complete parsing process (macros), fixed keywords, and mostly fixed function dispatch (although there are function pointers).  Now consider Common Lisp.  Turing complete parsing (reader macros can call turing complete functions), turing complete keywords (macros can call turing complete functions), and … well I'm not sure if the Common Lisp Object System (CLOS) actually allows arbitrary function dispatch, but it does allow quite a bit, so at the very least we can call it sub-turing complete.

So nearly every language is turing complete, but the distinction I'm trying to raise here is where else it is turing complete.  The super languages tend to have a lot more turing completeness available.  Haskell has turing complete parsing (at least it has a quasiquoting extension that provides this), turing complete keywords (again via an extension, Template Haskell), and a turing complete type system.  Forth has a turing complete parsing process.  Etc etc.

The industry languages tend to have fewer turing complete features available.  With the canonical example of Java, nearly every aspect of the language is sub-turing complete.  And generally features that are turing complete are considered harmful.  Most people distrust macros in C and templates in C++.

This is why people talk about being able to do *more* with Lisp.  The final product is just as turing complete as everything else, but with Lisp there are so many more places where you can take advantage of turing completeness to help you do your job.

However, this isn't always a good thing.  And I think this is why we won't be able to say that the super languages are better than the industry languages for a long while.  Most people aren't entirely sure what to do with turing completeness.  On the one hand, if you gave them a sub-turing complete language, they wouldn't be able to figure out how to transform their problem into a representation that is actually solvable by their language.  On the other hand, when you give them a turing complete language they immediately begin messing things up.  After all, if you're able to calculate all calculations that are possible … you can write a bunch of terrible code (up to the limit of what humans can comprehend and beyond as long as you remain lucky) and there's not much objective evidence for claiming that it's bad … because it does sort of work in the end.  Then once you give them a language that has a lot more turing complete aspects and features, they mess things up in more places.  I think this is a natural consequence of computer programming being a relatively new discipline.  Most people don't seem to understand the right way to do things.  Those who do, don't understand how to teach it to others.  And finally it's not clear how much of a good technique is due to its objective goodness and how much of it is due to the way the person using it thinks.

Anyway, at the end of the day I don't think it's useful to give high praise to many of the esoteric super languages.  I think a much better categorization technique is to indicate which features are turing complete, sub-turing complete, or fixed.  Then continue to describe why you need what features to be what status in order to best solve the problem in your given domain.
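As a sketch of what that kind of categorization could look like written down, here is a small table built only from the classifications given earlier in this post. The `Power` enum, the `FEATURES` table, and the `languages_with` helper are hypothetical names, and the entries are my rough judgments from above rather than anything rigorous.

```python
from enum import Enum


class Power(Enum):
    FIXED = "fixed"
    SUB_TURING = "sub-turing complete"
    TURING = "turing complete"


# Classifications as described in this post, not authoritative.
FEATURES = {
    "Java": {
        "parsing": Power.FIXED,
        "keywords": Power.FIXED,
        "dispatch": Power.SUB_TURING,
    },
    "C": {
        "parsing": Power.SUB_TURING,   # macros
        "keywords": Power.FIXED,
        "dispatch": Power.FIXED,       # "mostly" fixed; function pointers exist
    },
    "Common Lisp": {
        "parsing": Power.TURING,       # reader macros
        "keywords": Power.TURING,      # macros
        "dispatch": Power.SUB_TURING,  # CLOS: at the very least sub-turing
    },
}


def languages_with(feature, power):
    """List the languages whose `feature` has the given status."""
    return sorted(
        name for name, feats in FEATURES.items() if feats[feature] is power
    )


print(languages_with("parsing", Power.TURING))
print(languages_with("keywords", Power.FIXED))
```

With a table like this, the conversation becomes "my domain needs turing complete parsing, so which languages offer it?" instead of "which language is best?"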

Industry Languages VS Super Languages (part 1)

I'm not sure that it is entirely accurate to describe myself as a programmer.  I have a very strong interest in understanding how things work.  This trait makes me an effective programmer and software engineer, but understanding how things work is more important to me than programming or engineering.

I started off using C at Purdue.  Java, Perl, and some other languages showed up.  But the vast majority of everything was C.  When I started working, I used a lot of C#.  These are the typical programming languages that people are used to.  It's what you're likely to see if you walk into an arbitrary software development company or division.  However, because I'm very interested in how things work, I spent a lot of time investigating more exotic programming languages.  

Almost immediately I encountered languages like OCaml, Forth, and Lisp.  These languages have many interesting features that are typically completely unrepresented or at the very least shunned in the industry standard languages.  I spent a bunch of time getting my head wrapped around complex features like macros and highly expressive type systems.  And I also spent a bunch of time trying to wrap my head around why these features aren't present in the industry standard languages.  Apparently, a bunch of people who really like the less popular languages have also spent a bunch of time trying to explain why their super languages aren't more popular.

I suspect that my perspective is different than most.  Even among the outliers, my focus has been different, so I believe that this has influenced my outlook to also be different.  I sought out different programming languages because I was unsatisfied with what the traditional languages offered, but most of my motivation was based off of pure curiosity.  As I moved from language to language, I never found an environment which made me feel comfortable and satisfied with what is possible.  I found that I was much more interested in learning and understanding the languages as opposed to actually using them for anything.  Most of the defenses of Lisp and Haskell, that I've seen, seem to be from the perspective of people who actually use them to do things.  My not quite a defense but more of a categorization of Lisp and Haskell (and other languages in a related position) is coming from a perspective of trying to figure out how these languages relate to the rest of reality.

There are two obvious categories you can place a language into.  Turing complete and non-turing complete.  This distinction isn't really that useful to people because it's not really obvious, even to people who know the definitions, what the difference is.  A turing machine is able to perform the most powerful calculations that we know how to calculate.  We can propose and describe calculations which would require a machine more powerful than a turing machine, but we don't know how to actually perform those calculations.  They are mathematical problems that we don't know how to solve.  A programming language being turing complete means that it can perform all of the same calculations that a turing machine is able to perform.  Or in other words it means that it is able to perform the most powerful calculations that we know how to calculate.  A non-turing complete language is then a language which is not able to perform all of the calculations that a turing machine can.  We could call it a sub-turing complete language because its abilities are below that of a turing complete language.

Now, in practice even people who get the gist of what turing complete means wouldn't actually be able to describe the difference between calculations that require a turing machine and calculations that can be done in a sub-turing complete language.  But the idea of turing complete languages is useful at the very least because it's an objective measure.  We can mathematically show that a given language is turing complete or not.  Unfortunately, we do not really have any other objective measures for programming languages.  Don't misunderstand.  There are other objective-ish things we can say about programming languages, but few people actually agree which of those objective things are better and which are worse.  And when the pedants and/or experts get involved the first thing they do is muddy the waters as to what commonly understood features actually mean (because at the end of the day, things are always more complicated than we would prefer).  At the very least with turing complete vs not, we are able to say that if a language is turing complete then it can do all of the calculations that we know how to do, and if it is not then it cannot do all of them.  So if you think about it, you probably want the turing complete language because that way if the problem is solvable, then it is solvable in your language.  (Of course, there are some interesting situations that arise where you might actually want a sub-turing complete language, but let's ignore that for now.)

Basically every language is turing complete.  There are a couple of outliers, but by and large all of them are turing complete.  So, Java is turing complete and so is Lisp.  This is a problem for people who really like Lisp (and/or the other misfit languages) because when they try to convince other people about the superiority of their language, the best objective measure (turing completeness) indicates that Lisp is in fact equivalent to the boring language that everyone else is using.  Now I think that part of the problem is that different people think different ways.  Some people will work better in Lisp and some people will work better in Java.  It's difficult to argue about superiority when what you are actually trying to express is very specific thought processes that are unique to yourself.  It seems that people usually resort to poetry.

There's the turing tarpit … where everything is possible but nothing is easy.  A language that doesn't change the way you think isn't worth learning.  You can write the solution to your problem in your language, or you can write the language that solves your problem.  And this is probably even applicable to language slogans.  Consider Perl's "there's more than one way to do it" and Python's "there's only one way to do it".  I didn't even have to look these up, they're just in my DNA because of how many times they show up when people start to talk about the differences between languages.  And while many of these sound nice, at the end of the day they're not much more than poetry.  It's people trying to tell a story or convey a personal experience.  And even if these people are correct, they are really only correct for themselves and people like them.  After all, the Perl slogan is diametrically opposed to the Python slogan.  When you express something personal, you run the risk that what you say is only true for yourself.  Of course that doesn't mean that you shouldn't still try.  After all, communicating with other people is vitally important to living.  And, like with any art, by expressing yourself you open the possibility of profoundly affecting and changing the lives of others.

I've said a lot here and I haven't quite gotten to what I actually started out to say.  I think if I continue, I'll end up making this post too long and too fragmented.  So next time: what does David say is the real difference between languages like Java and Lisp?