Wednesday, August 14, 2013

Cognitive Complexity Classes




In a Galaxy far far away

LANGUAGE WARS

When I started my professional career as a software engineer (six years ago yesterday actually), I really only knew a couple of languages.  I received my Bachelor's of Computer Science at Purdue University.  While the breadth of topics offered at Purdue is certainly impressive, the programming languages covered certainly leaves some room for improvement.  I had an introductory course in Java and then everything else was pretty much C or C++.  Granted there were some alternative possibilities that cropped up from time to time.  

For whatever reason, beginning a career in software engineering really got me interested in programming languages.  I did some research on my own previously, but actually programming for a living filled me with a sense of professional pride.  I ought to understand the tools that I'm using to create products.  Of course I've always been interested in understanding how things worked.  Maybe it was only natural that this would overlap with my work. 

I began trying to figure out the features of programming languages.  So dynamic typing, eval, meta objects, object oriented, lisp macros, pattern matching, automatic theorem proving, the works.  Type theory took awhile to figure out.  After all there's a bunch of math involved, and it's not really the kind that I had prior exposure to.  

Through all of this a couple of weird issues kept making an appearance.  Turing completeness and the power of more expressive languages.  Any turing complete language is able to compute all of the functions that any other turing complete language is able to compute.  Nearly all programming languages are turing complete (and none of them are "super" turing complete), so the question naturally arises … "why do we need different programming languages?"  Out come the vague, poetic answers.  You can work better or faster in such and such a language.  Or this language has features that prevents certain mistakes.  Programers who use language X tend to be better because they are more passionate about programming.  And there are many such ideas like this.  Some tied to conduct studies by giving students without prior exposure to programming different languages to complete a specific task.  But it's not really clear how this applies to experienced practitioners.  After all, shouldn't knowing both languages in a given study be better than just one.  And how do abstract, language agnostic, problem solving skills factor into any results?

It's Complexity, but not as we know it

Kind of a staple of a computer science education is algorithm complexity theory.  Big O notation and its ilk.  I'm probably being overly cynical, but algorithm complexity theory might be the only real difference between an educated computer scientist and a self taught programmer.  (And I know I'm being overly cynical when I say most computer scientists forget all about Big O after the final, but sometimes it's hard not to think it.)  Honestly though, the theory and concepts from Big O and its friends don't really show up all that often in practice.  But.  When you are able to leverage these disciplines your peers will look at you as if you're a dark wizard who wields powers beyond human comprehension.  Seriously, all I did was replace a list with an array.  Just because it changed the application from being broken and without hope of repair to working better than expected doesn't mean I did anything *that* impressive.  This power can be yours as well.

The basics are actually pretty simple.  Given some number of inputs to an algorithm, how fast does the memory requirements and time to compute grow with one more input.  So if an algorithm's time complexity is linear, then the time to complete a computation grows proportionally with the number of the inputs.  Or in other words:

input - time
1 - 1
2 - 2 
100 - 100

If an algorithm's time complexity is "N Squared", then we're looking at something like:

input - time
1 - 1
2 - 4
...
10 - 100
...
100 - 10000

And obviously there's a lot more that goes into all of this when you start to get more formal.  But I think you see the point.  For small input's it doesn't really matter what the complexity of your algorithm is.  But once you hit large input, it really starts to matter.  Maybe 10,000 time units doesn't sound like a big deal, but consider someplace like Google, Facebook, or the IRS.  An input of 300 million can suddenly turn into 9E16.  Or perhaps I should rewrite that as the more impressive, 90000000000000000.  That's kind of a problem.

However, while algorithmic complexity is clearly very important and has an important place in the design of software, it's not the type of complexity that I really want to talk about.  I've noticed a different type of complexity.  One which I feel answers questions about a lot of personal experiences that I've encountered when trying to implement certain types of algorithms, but which more importantly answers the question of "what's the real difference between programming languages" without resorting to a bunch of subjective statements about productivity. 

Instead of (or more likely "in addition to") classifying algorithms by their time and memory complexity, I want to classify algorithms by how hard they are to comprehend.  On its face my goal is violating my desire to avoid subjective statements.  After all who gets to decide the difficulty of comprehending an algorithm.  We don't even have a good definition of what comprehension itself *is* let alone how to prove that someone actually has a comprehension of a given subject.  Actually, if I could define and prove comprehension, I wouldn't even bother with my "new kind of complexity".  I would just set up shop as a recruiter who could mathematically guarantee hires who actually know what they're doing.  And I would retire soon after very wealthy indeed.  

So simply by the elusive nature of comprehension, I think my idea of complexity will be a bit loose around the joints.  However, while a completely formal definition may or may not be forthcoming or even possible, I believe that I can argue convincingly that there are different cognitive complexity classes for algorithms.  With perhaps some fuzzy overlap between the classes, to make room for different types of human cognitive talent.  After all there may be some people who really are better at solving a problem "the hard way" due to varying circumstances.  Anything from how their brain is wired to the way they were raised to solve problems.

It is very complex.  You are likely to be eaten by a grue.

Right now I've got four complexity classes in mind.  This almost definitely needs to be expanded and modified, but I think it's a good place to start.  The way that I'm classifying these classes is by describing the "shape" of the algorithms.  

The simplest class is the Scalar class.  Followed by the List class.  Then the Tree class.  And finally the most complex is the Graph class.  

Like I said, I don't think these categories are absolute, final, or complete.  I think it's just the starting point.  

An algorithm that has a Scalar complexity can be easily imagined as a function that takes a scalar as input and output.  So logical-and is a good starting place.  Take two booleans and return a third boolean.  The structure of the inputs is atomic.  There's not that much going on here.  It's also easy to see where things get fuzzy.  A function that takes an array and then reallocates the memory of that array in order to expand the total size is probably Scalar complex.  But a function that takes an array and then maps a function over each value in that array, creating a new array with new values, should probably be in the List complexity class.  

The Map (or Select depending on your preferred terminology) function that I just described in my previous paragraph may seem like a Scalar complex function.  After all you take a completely opaque function, apply it to every element in a mostly opaque array, and finally return a newly constructed and mostly opaque array.  The structure of the list doesn't really matter *that* much.  So part of this is the unfortunately fuzzy nature of comprehension.  The Map function seems pretty easy to someone who's had a lot of experience with functional languages.  However, there are two things to consider.  The first is that if you look at the "shape" (admittedly a vaguely defined term) of the algorithm, you'll see that it's pretty easy to convince yourself that it has a "Listy" shape … or at the very least it's more listy than a logical-and function.  The second is that incorporating Map into your code in a functional language is very easy to imagine.  It yields clean and composable constructs.  However, attaining Map like functionality in an imperative language like C is a very different story.  A major component of my motivation for this cognitive complexity theory is that it should help explain what the difference between programming languages actually are.  List complexity isn't a big deal in functional programming languages with higher order functions, but in C like languages you need to be more careful with it.  I'll talk more about language differences later.

So, I've already sort of covered List complexity.  But there is some more that needs to be dealt with.  Functionality like Map, Fold, and Unfold all seem like they belong to a class of functions that have a list like "shape".  However, things unfortunately get fuzzy again when you consider lists of lists of scalars.  In some situations you may consider a list of lists to be a tree structure and in other situations the lists could be considered to be atomic in nature.  It really depends on the type of algorithm and how comfortable you are with these types of operations.  Although, the more operations you need to perform on the inner lists the easier it is going to be to argue the shape of the algorithm as a tree.

Tree complexity.  The most obvious tree complex algorithm that I can think of is the Permutation function.  A function which returns all of the permutations of a list is very naturally described in a recursive manner.  Tracing out the function calls shows what is very obviously a tree.  O (n log( n ) ) sort methods also have a very tree like shape to them.  After all several of these types of sort methods specifically exploit the nature of trees in order to get their time complexity.  Again you can start to make arguments based on different non-recursive implementations of these algorithms, but I noticed something interesting recently concerning Levenshtein distance.  

The definition of Levenshtein distance[1] is about as recursive as you can get.  Each node of the call graph makes multiple calls to itself.  The resulting "structure" over time is a massive tree.  However, implementing the Levenshtein distance formula this way is overly wasteful.  The way that the function is normally implemented is by constructing a table and then finding the minimal path through this table (actually, until recently this was the only algorithm I knew about for Levenshtein distance).  And there's even an optimization which only requires two rows of the table.  Now, these alternative implementations don't necessarily seem tree shaped.  However, I would like to argue that starting with a recursive definition and then transforming it into a non-recursive definition is akin to converting a recurrence into a summation.  So my hope is that it can be argued that the recursive form of algorithms like Levenshtein distance is the "more true" form for purposes of classifying cognitive complexity.  At the very least I think it won't be too hard to argue that the non-recursive form of Levenshtein distance is more *complex* than your typical listy algorithm.

Finally, there is Graph complexity.  I haven't gotten much opportunity to play around with algorithms that seem squarely in the Graph class, so I'm not really going to say too much here.  Parsing certain types of grammars seem like they need some sort of graphical shape or structure in order to function correctly.  Additionally, Simon Peyton Jones published an interesting book back in 1987[2].  In it he describes a G machine (graphical reduction machine) that can be used to implement the lambda calculus base for a functional programming language.  My naive thought is that if graphically shaped algorithms are strong enough to implement the turing complete lambda calculus, then maybe we don't need anything more complex.  Perhaps there is a maximum amount of cognitive complexity we have to put up with before we can show that there has to be a simpler way.  I have a feeling that this idea will either lead the way for incredibly useful algorithm simplification tools or turn out to be hopelessly optimistic.

Fight a complexity G specimen without damage control, you're crazy.

A major goal I have here is to show that different programming language are different in more than subjective anecdotes.   I think that I have at least highlighted one possible route for a different type of complexity.  A complexity that doesn't concern itself with time or space, but instead with comprehensibility.  And I have already mentioned briefly how we might use such a definition of complexity to differentiate programming languages.  

Basic List complex algorithms don't seem like a very big deal when you have access to functions like Map and Fold.  Things begin to become more problematic when all you have access to are for and while loops.

Additionally, Map and Fold can be safely used in a dynamically typed language.  However, once you begin to embed lists inside of lists, parametric polymorphism found in languages like Haskell, Ocaml, and C# really start to pay off.

Finally, just because nearly all of the languages available to use today are turing complete, doesn't mean that this is beneficial when the complexity class of an algorithm begins to increase.  In a language like C where the main algorithm solution mechanisms of choice are loops and if statements, when Tree or Graph complex problems are encountered the solution will be some sort of state machine constructed in a while loop filled with continue statements.  We've seen with the Levenshtein distance algorithm that such solutions can be of high quality.  However, my fear is that many engineers will not even be aware of the existence of highly complex problems or their need to be carefully transformed in such a way that they can be correctly implemented.  Worse yet, if the only tool that these engineers have a mastery of are loops and if statements, the end result maybe an ad-hoc and fatally incomplete state machine, even in more typical languages like Java and C#.  Suddenly, turing complete shifts meaning from incredibly useful property to enough rope to hang yourself.

I believe that the goal of my efforts should be to identify and elaborate on different cognitive complexity classes.  Then I can identify which programming language constructs most lend themselves for solving these classes of problems.  This will provide guidance for future programming language advancements as well as educational efforts for the best way to use those advancements.  Developing this idea into a formal system would be icing on the cake, but I feel at the very least that convincing arguments can be made for the utility of cognitive complexity classes.



No comments:

Post a Comment