Wednesday, August 21, 2013

Communication of the Paradox Parsers

I Know, Right?

A difficult aspect of communication is that if you really understand a thing, then it is trivial to you.  So why can't all those other people get with the program?  On the other hand, if you don't really understand a thing, then maybe you're the one causing the difficulty by expressing ideas which in reality do not make any sense.

I think that I have a pretty good sense of when I don't fully understand something.  Although really, that statement needs about a billion caveats.  For starters, "I know what I don't know", or perhaps more accurately, "I know when I don't know", is really way too close of a cousin to, "I'll tell *you* about peanut butter!".  You really have to be careful when you start down the path of "I know that I'm right!", because as soon as you encounter a scenario where you're not right, it can easily lead to a disaster.  Reality doesn't care that you think you're right.

So, to be more exact: I really like to understand how things work.  And I really like to trace my understanding as deeply as I can manage.  And it bothers me on some sort of bizarre, semi-emotional level when I don't understand "sufficiently" why a given thing is operating the way that it is.

This bizarre discomfort that I feel about my lack of understanding is a great tool for noticing when I'm missing something potentially important.  So as long as I'm listening to my own internal discomfort-and-lack-of-understanding alarm, I have a pretty good sense of when I don't understand what's going on.  At least I suspect that's the case … when you don't know what you don't know, it's hard to know when you don't really know something.

If you want out then you'll have to develop those communication skills

So, I've got a skill that helps me to identify things that I don't fully understand.  This comes in handy with communication because I'm always on the lookout for situations where things just aren't making any sense.  So, I'm aware when there's a person who isn't following along in a conversation.  When I'm more of an observer to the conversation, I find it really easy to decide who it is that's missing the big picture.  I can then offer clarifying statements to the clueless or ask stupid questions to get those in the know to offer additional insight.  Of course doing this can jeopardize my observer status, so it does sometimes take a bit of finesse.

If I'm highly active in a conversation and it seems like sense has decided to go on a lunch break, then I have to start running experiments.  Either I'm the person who doesn't understand or one or more of the other active participants are the ones who don't understand.  Bonus points if there's more than one type of misunderstanding going around.

But I didn't really want to talk about communication, even though it is a very interesting problem to solve.  I was interested in talking about a couple of things that don't make any sense.

Parsing!

I've had a terrible time with parsing algorithms.  I find the basic idea of converting a stream of symbols into a data structure absolutely fascinating.  But often the algorithms for achieving this are hard for me to really get my head wrapped around.  

I think my blog entries about Cognitive Complexity Classes (CCC)[1] and Problem Shapes[2] actually shed some light here.

First of all, let's talk about the stuff that I don't understand.  Parsing is more complex than just converting a stream of symbols into a data structure.  Different streams of symbols have different implicit structures.  The main ones that people talk about are the different types of grammars that languages fall into, and this is really about formal languages (ie programming languages) as opposed to natural languages (ie French): context sensitive, context free, and regular grammars.  Context sensitive grammars are some of the hardest to parse, and they require some very powerful algorithms.  Context free grammars are easier, and regular grammars are easier still; each can be parsed with progressively simpler algorithms.  And this is only scratching the surface.  There are LL and LR grammars, that is, the sets of language grammars which can be parsed by LL algorithms and/or LR algorithms.  Then there are LL(1) grammars.  And increasingly sophisticated algorithms give rise to ever larger sets of language grammars that can be handled.  To make things even more complicated, these algorithms aren't characterized only by the power of the grammars they can parse; they are also characterized by their parsing efficiency in space and/or time.

So when you see a string of characters, there's actually a lot of extra hidden structure present that you don't really understand.  Or at the very least *I'm* not happy with my level of understanding.  I get the basic gist of Context Sensitive vs Context Free, but I don't get it well enough to look at some example strings and have a good intuition for which type of grammar the language falls into.  And things get more complicated from there.  If simple things like context baffle my intuition, then it's probably going to be a while before I figure out how to spot GLR grammars.
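To make the context free vs regular distinction a little more concrete, here's a minimal sketch (my own toy example, not anything from the references): balanced parentheses are the textbook language that a regular grammar can't capture, because recognizing them means tracking arbitrarily deep nesting, while a context free grammar and a hand-written recursive descent parser handle them directly.

```python
# The language of balanced parentheses needs a context-free grammar:
#
#   S -> ( S ) S | empty
#
# A recursive descent parser mirrors that grammar function-for-production.

def parse_balanced(text):
    """Return True if `text` consists solely of balanced parentheses."""
    pos = 0

    def parse_s():
        # S -> ( S ) S | empty
        nonlocal pos
        if pos < len(text) and text[pos] == '(':
            pos += 1          # consume '('
            parse_s()         # inner S
            if pos >= len(text) or text[pos] != ')':
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1          # consume ')'
            parse_s()         # trailing S
        # otherwise: the empty production, consume nothing

    try:
        parse_s()
    except ValueError:
        return False
    return pos == len(text)   # reject trailing garbage

print(parse_balanced("(()())"))   # True
print(parse_balanced("(()"))      # False
```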

The whole CCC and Problem Shapes thing becomes useful in describing why this is all a complicated problem.  

I'm not entirely sure what shape parsing happens to be.  Something tree or graph shaped with backtracking.  What shape is constraint propagation and/or unification?  Anyway, so let's assume that parsing (all parsing) has some sort of known shape.  Then add to the mix different power levels of grammars (regular, context free, etc).  Then add to the mix efficient time and space constraints.  So we begin with the already complex shape of parsing.  Then we add a bunch of very complex shapes to get different characteristics that make parsing practical.  And finally, we squash the whole thing into a table and implement it in a for loop.
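And here's a rough sketch of that last step, the "squash the whole thing into a table" part, using the same toy balanced-parentheses grammar (again my own example, not the output of any real parser generator).  The grammar becomes an LL(1) table indexed by nonterminal and lookahead token, and the parser collapses into one loop driving an explicit stack.

```python
# S -> ( S ) S | empty, flattened into an LL(1) parse table plus one loop.

PARSE_TABLE = {
    ('S', '('): ['(', 'S', ')', 'S'],   # S -> ( S ) S
    ('S', ')'): [],                     # S -> empty
    ('S', '$'): [],                     # S -> empty (end of input)
}

def ll1_parse(tokens):
    """Table-driven LL(1) recognizer for balanced parentheses."""
    tokens = list(tokens) + ['$']       # '$' marks end of input
    stack = ['$', 'S']                  # start symbol on top of end marker
    i = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[i]
        if top == lookahead:            # terminal: must match the input
            i += 1
        elif (top, lookahead) in PARSE_TABLE:
            # nonterminal: replace it with the chosen production, reversed
            stack.extend(reversed(PARSE_TABLE[(top, lookahead)]))
        else:
            return False                # no table entry: parse error
    return i == len(tokens)

print(ll1_parse("(())"))   # True
print(ll1_parse("())("))   # False
```

The grammar's shape hasn't changed; it has just been compiled down into data plus a loop, which is exactly why the result is so hard to read back out.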

So of course parsing is hard to *really* understand.

Although, I didn't really want to talk about parsing.  Well … not computer language parsing.  I actually wanted to talk about math.  And more specifically, why is it so hard to understand mathematical statements.

… math blah blah blah or how paradoxically paradox aversion produces paradoxes 

I was watching a bunch of group theory videos[3] yesterday.  And I noticed that some of the picture examples and the intuition behind the presentation were pretty simple and straightforward.  But the precise definitions were hard to wade through even with the intuition.

This got me thinking about parsing, and how I can't quite seem to feel comfortable with grammars and parsing algorithms even though I've put a lot of work into them.  The answer of course is that I still don't have a good feel for the "true shape" of parsing, let alone all of the extra complexities that go into making it practical.

But what I had missed until now about mathematics is that the English statements aren't really English.  And I'm not talking about jargon.  To be sure, there is a bunch of math jargon that takes a while to understand, but I don't think the hard part is the jargon.  The hard part is that, just like with the constrained parsing grammars and algorithms, mathematical statements are a constrained form of English.  With parsing we constrain the grammars in order to be able to write algorithms that halt, fit into memory, and run in a reasonable amount of time.  With mathematics there is a different goal in mind.

Charles H. Bennett wrote a pretty interesting paper titled "On Random and Hard-to-Describe Numbers"[4].  It contains two essays, and the first essay discusses the paradox of "the first number not namable in under ten words".  That phrase is of course only nine words, which means that the first number not namable in under ten words (1,101,121) is in fact namable in nine words (ie the above quotation).

Similarly, consider Russell's paradox[5], the set of all sets that do not contain themselves (also mentioned in Bennett's paper).

The point is that a full natural language contains facilities that allow you to make statements which do not actually make any sense.  Even mathematical languages, like naive set theory, allow for statements which don't make sense.

Which is why I think mathematical statements are so hard to make sense of: a great deal of work went into making sure that they actually make sense.








Thursday, August 15, 2013

Cognitive Complexity Classes: Shapes

In order to expound upon the different types of Cognitive Complexity Classes (CCC), further attention must be applied to the types of shapes which algorithms may be composed of.  This is unfortunate because my current concept of a "shape" of an algorithm is overly vague.  So, to be able to focus on what makes any given CCC different from any other given CCC (or for that matter what makes any two CCCs the same … or even what types of sameness are needed to support this system), I first need to flesh out the idea of algorithmic shape.

Shapes on the cave wall

So my initial description of an algorithm's shape invoked the imagery of the input and output of a function.  While that might have been a good way to break the ice, that description is hardly ideal.  After all, if you have a function that takes a graph and then returns the exact same graph, it would be intellectually dishonest to describe it as having anything other than a scalar shape.  Similarly, if you have a TreeMap function that maps all the elements in a tree using a given function and then returns the same tree with different elements, that seems a lot like a List shape.  You are doing operations on trees, but the nature of the operation is very linear.  Given a function that traverses an entire tree and a function that traverses an entire list, the Map function is nearly identical between trees and lists (or for that matter scalars and graphs).
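A small sketch of that TreeMap point (the types and names here are just my own throwaway illustrations): mapping over a tree and mapping over a list are nearly the same operation, visit each element once, apply the function, rebuild the same structure.

```python
# Nothing about the map operation cares that the container happens to branch.

class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def tree_map(f, node):
    """Apply f to every value in the tree, preserving its structure."""
    return Node(f(node.value), [tree_map(f, c) for c in node.children])

def list_map(f, xs):
    """The list version: same visit-each-element shape, flatter container."""
    return [f(x) for x in xs]

tree = Node(1, [Node(2), Node(3, [Node(4)])])
doubled = tree_map(lambda x: x * 2, tree)
print(doubled.value, [c.value for c in doubled.children])   # 2 [4, 6]
print(list_map(lambda x: x * 2, [1, 2, 3, 4]))              # [2, 4, 6, 8]
```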

Comparison sorts are a pretty good class of algorithms to consider.  If you create a Merge sort algorithm that recursively breaks apart the list you want to sort and then glues it back together in the desired order, then you will end up with a tree shape across time.  On the other hand if you create a Heap data structure and then insert and remove all of the elements in your list, then you will end up with a tree shape across space.  This leads me to believe that comparison sort algorithms have a tree shape to them.
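For example, here's a bare-bones merge sort (my own sketch) where the tree shape across time is easy to see: every call splits the list in half and recurses on both halves, so tracing the calls over time draws a binary tree whose leaves are the single-element lists.

```python
# Merge sort: the recursion tree is the "tree shape across time".

def merge_sort(xs):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    left = merge_sort(xs[:mid])     # one branch of the call tree
    right = merge_sort(xs[mid:])    # the other branch
    # glue the two sorted halves back together in the desired order
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]
```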

There are two interesting concepts that emerge from algorithmic shape that I want to discuss.  The first one is that different programming languages have different "dimensions", which are determined by the features available in them.  So I assert that a programming language that has recursion has an extra dimension compared to a programming language that does not have recursion.  If you want to implement a comparison sort algorithm in a language without recursion (or you simply want to implement it without recursion), then you have to "flatten" out the algorithm.  With recursion the algorithm can take its natural shape; without recursion you have to figure out how to represent the algorithm's "true form" in a flatter context.

Now I'm kind of alluding to something that I'm sure is going to end up getting me in a lot of trouble.  Best case is that I escape by being painted as overly naive, ambitious, and/or idealistic.  So, flattening representations isn't that big of a deal.  We do this all the time.  We flatten 3D geometric shapes so that they can be represented on a 2D piece of paper.  We represent 2D matrices as arrays of 1D arrays.  I'm sure that "flattening" an algorithm's representation is something that can be formalized.  On the other hand, "true form" is clearly getting way too close to philosophic digressions, or even worse, poetic ruminations; I really don't see how I can hope to get away with that one without something concrete or at the very least wildly convincing.

So, Yuri Gurevich has a very interesting series[1] on Microsoft's Channel 9 research website.  I'm sure I haven't devoted enough time to fully digest everything he said during that series, but a couple of things did stick that I think spell trouble for the direction I'm hoping to go in.  Turing argued very convincingly about Turing computable functions, but he didn't argue at all about Turing complete algorithms.  So while the fact that essentially every programming language is turing complete means that we can compute any function in every programming language, it does not mean that we can implement every algorithm in every programming language.  And of course this makes sense if you think about it.  In the C programming language you could write an algorithm that writes a bunch of data to a file, calls a Unix command line utility to process that file, and finally reads the result from the Unix command to complete the algorithm.  Implementing this algorithm in lambda calculus is impossible.  Lambda calculus doesn't even recognize that files exist, let alone have facilities to call out to Unix command line utilities.

Professor Gurevich, if I recall correctly, argues that the class of digital algorithms cannot possibly contain the class of all algorithms.  Analog algorithms, for example, make use of the reals, which can't be contained in the natural numbers, let alone in digital memory.  Not to mention that any all-encompassing theory of algorithms would have to contain all future human invention (who foresaw quantum computers back when the Greeks were first coming up with GCD?).

I'm fairly sure that I'm some flavor of constructivist[2], which I believe Professor Gurevich wasn't too keen on in his video series.  And I suspect that this might explain my idealistic "true form" sentiments.  After all, if the only things that *really* exist are things that can be constructed, then all you have to ensure is that the Comparison Sort Problem Constructor is only called one time, and it is going to be pretty hard to argue that the result is anything other than the "true form" of comparison sorts.  Of course in that last sentence I just spilled the beans as to why the road ahead of me is likely going to be a tough one, but let's get back to that in a little bit.

First though, let's get to that second interesting concept that I mentioned quite a bit ago.  Let's go ahead and assume that my algorithmic shape idea has all the merit in the world.  And let's assume that programming language features really do correspond to different shapes.  In that case, if statements and simple pattern matching seem to correspond pretty closely to scalar complexity.  Loops, map, fold, unfold, et al seem to fit nicely with list complexity.  Recursion and algebraic data types are pretty clearly tree complex.  I'm not quite sure what would correspond with graph complexity.  Some sort of state machine maybe?  Well, at the very least an actual graph would be graph complex.  Of course we've got an interesting progression here.
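As a toy illustration of that feature-to-shape correspondence (my own examples, nothing rigorous): an if statement answers a scalar shaped question, map and fold answer list shaped ones, and recursion over an algebraic-data-type-style node answers tree shaped ones.

```python
from functools import reduce

# Scalar shape: a single atomic decision.
def sign(x):
    return 1 if x > 0 else (-1 if x < 0 else 0)

# List shape: one pass over a sequence, via map and fold.
def sum_of_squares(xs):
    return reduce(lambda acc, x: acc + x, map(lambda x: x * x, xs), 0)

# Tree shape: recursion over a branching structure.
def tree_size(node):
    # node is a (value, [children]) pair, a poor man's algebraic data type
    value, children = node
    return 1 + sum(tree_size(c) for c in children)

print(sign(-7))                                   # -1
print(sum_of_squares([1, 2, 3]))                  # 14
print(tree_size((1, [(2, []), (3, [(4, [])])])))  # 4
```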

Lists are a generalization of scalars.  You got one thing, now you got lots of things.  Trees are generalizations of lists.  Graphs are generalizations of trees.  And hypergraphs[3] are generalizations of graphs.  Now, in my previous essay I suggested that maybe the most complex thing we have to worry about is graph complexity, because you can implement lambda calculus (a turing complete language) with a graph reduction machine.  But on the other hand, let's go with the flow for a minute.  What kind of programming language features might show up to make graph complexity as easy to reason about as a Map operation with parametric polymorphism?  What about hypergraphs?  There might be some very powerful abstraction techniques out there, and these complexity shapes might provide some insight into developing them.

Constructivism and the overly poetic imagination

So mathematical constructivism is probably my type of thing.  Turing completeness kind of gets in the way of a constructive view of things.  My thinking goes kind of like this:  If the only things that exist are things that can be constructed, then equivalence is just a hop, skip, and a jump away.  Just trace things back to where they were constructed and if it's the same constructor, then you're in business.  Unfortunately, with turing completeness we can't even know in general if an arbitrary function halts [4,5].  Forget about showing equivalence between two of these indeterminately halting beasts.  

With two turing computable functions, we can't determine in general if they're the same thing.  But who's talking about functions?  I wasn't.  I was talking about algorithms, which are an even weaker concept than functions.  At least we have a definition of a turing computable function.  Algorithms are pretty vague, and tomorrow a new class of them might show up out of nowhere.

But then again, who's talking about algorithms?  I wasn't.  Not really.  In my constructivist heart I was talking about some sort of "true form", not for algorithms, but for problems.  You can't have a true form without equivalence, at least if you want to stay outside the realm of poetry.  I mean, how would you show that something was in its true form if you're not equating it to *the* true form?  And in turing complete land you can't even have function equivalence.  Let alone algorithmic equivalence.  And turing complete algorithms aren't even a thing.  How do you have a true form for something that isn't even definable?  Now I'm talking about Problems.  Not even implementations of algorithms for something, but some sort of "higher plane of existence" where the Problem itself is a concrete thing.  Pretty ridiculous.

Although … on the other hand … turing completeness kind of sucks.  I mean, don't get me wrong.  All things being equal, it's a pretty nice safety blanket.  After all, you wouldn't want to be stuck up the river without a turing complete paddle to implement that function which, it turns out, can only be implemented on a turing machine.  However, a function that can be generically shown to be total is a function that is a lot easier to understand.  If you had a choice between a function that was mathematically shown to always return a value no matter what value you pass to it and a turing computable function that had a "quite impressive" 90% test coverage, you're going to choose Mr. Turing-Complete-Is-For-Chumps.  Granted, I might be overestimating the typical programmer's fear of ⊥; however, I like to think the informed decision is to ditch turing completeness when you don't actually need it.
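Here's a small sketch of that trade-off (both examples are mine, and the second is deliberately extreme): the first function recurses structurally on a list, so it obviously halts on every input; the second is the Collatz iteration, an unbounded search that nobody has proved halts for every starting value.  A total-functions-only language happily accepts the first and rejects, or demands extra justification for, the second.

```python
def total_sum(xs):
    """Structural recursion: each call shrinks the list, so it must halt."""
    if not xs:
        return 0
    return xs[0] + total_sum(xs[1:])

def collatz_steps(n):
    """Unbounded search: termination for all n > 0 is an open conjecture."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(total_sum([1, 2, 3, 4]))   # 10
print(collatz_steps(27))         # 111
```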

So perhaps my Problem-as-a-concrete-object idea isn't as far-fetched as it sounds.  You do have to get rid of turing completeness.  Or at the very least have a subset of your workspace reserved for provably halting functions.  Hmm.  Homotopy type theory[6] supposedly says some interesting things about equivalence.  I guess it's time to hit the books.








Wednesday, August 14, 2013

Cognitive Complexity Classes




In a Galaxy far far away

LANGUAGE WARS

When I started my professional career as a software engineer (six years ago yesterday, actually), I really only knew a couple of languages.  I received my Bachelor's in Computer Science at Purdue University.  While the breadth of topics offered at Purdue is certainly impressive, the programming languages covered certainly leave some room for improvement.  I had an introductory course in Java, and then everything else was pretty much C or C++.  Granted, there were some alternative possibilities that cropped up from time to time.

For whatever reason, beginning a career in software engineering really got me interested in programming languages.  I had done some research on my own previously, but actually programming for a living filled me with a sense of professional pride: I ought to understand the tools that I'm using to create products.  Of course I've always been interested in understanding how things work.  Maybe it was only natural that this would overlap with my work.

I began trying to figure out the features of programming languages.  So dynamic typing, eval, meta objects, object orientation, lisp macros, pattern matching, automatic theorem proving, the works.  Type theory took a while to figure out.  After all, there's a bunch of math involved, and it's not really the kind that I had prior exposure to.

Through all of this a couple of weird issues kept making an appearance: turing completeness and the power of more expressive languages.  Any turing complete language is able to compute all of the functions that any other turing complete language is able to compute.  Nearly all programming languages are turing complete (and none of them are "super" turing complete), so the question naturally arises … "why do we need different programming languages?"  Out come the vague, poetic answers.  You can work better or faster in such and such a language.  Or this language has features that prevent certain mistakes.  Programmers who use language X tend to be better because they are more passionate about programming.  And there are many such ideas like this.  Some have tried to conduct studies by giving students without prior exposure to programming different languages to complete a specific task.  But it's not really clear how this applies to experienced practitioners.  After all, shouldn't knowing both languages in a given study be better than just one?  And how do abstract, language-agnostic problem solving skills factor into any results?

It's Complexity, but not as we know it

Kind of a staple of a computer science education is algorithm complexity theory.  Big O notation and its ilk.  I'm probably being overly cynical, but algorithm complexity theory might be the only real difference between an educated computer scientist and a self taught programmer.  (And I know I'm being overly cynical when I say most computer scientists forget all about Big O after the final, but sometimes it's hard not to think it.)  Honestly though, the theory and concepts from Big O and its friends don't really show up all that often in practice.  But.  When you are able to leverage these disciplines your peers will look at you as if you're a dark wizard who wields powers beyond human comprehension.  Seriously, all I did was replace a list with an array.  Just because it changed the application from being broken and without hope of repair to working better than expected doesn't mean I did anything *that* impressive.  This power can be yours as well.

The basics are actually pretty simple.  Given some number of inputs to an algorithm, how fast do the memory requirements and the time to compute grow as you add one more input?  So if an algorithm's time complexity is linear, then the time to complete a computation grows proportionally with the number of inputs.  Or in other words:

input - time
1 - 1
2 - 2 
100 - 100

If an algorithm's time complexity is "N Squared", then we're looking at something like:

input - time
1 - 1
2 - 4
...
10 - 100
...
100 - 10000

And obviously there's a lot more that goes into all of this when you start to get more formal.  But I think you see the point.  For small inputs it doesn't really matter what the complexity of your algorithm is.  But once you hit large inputs, it really starts to matter.  Maybe 10,000 time units doesn't sound like a big deal, but consider someplace like Google, Facebook, or the IRS.  An input of 300 million can suddenly turn into 9E16.  Or perhaps I should rewrite that as the more impressive 90000000000000000.  That's kind of a problem.
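If you'd rather watch it than read it, here's a quick sketch of the same tables in code (my own throwaway example): count the "time units" that an O(n) loop and an O(n^2) pair of nested loops burn through as the input grows.

```python
def linear_work(n):
    steps = 0
    for _ in range(n):       # one unit of work per input
        steps += 1
    return steps

def quadratic_work(n):
    steps = 0
    for _ in range(n):       # for every input ...
        for _ in range(n):   # ... touch every input again
            steps += 1
    return steps

for n in (1, 2, 10, 100):
    print(n, linear_work(n), quadratic_work(n))
# 1 1 1
# 2 2 4
# 10 10 100
# 100 100 10000
```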

However, while algorithmic complexity is clearly important and has a central place in the design of software, it's not the type of complexity that I really want to talk about.  I've noticed a different type of complexity.  One which I feel explains a lot of personal experiences that I've had when trying to implement certain types of algorithms, but which more importantly answers the question of "what's the real difference between programming languages?" without resorting to a bunch of subjective statements about productivity.

Instead of (or more likely "in addition to") classifying algorithms by their time and memory complexity, I want to classify algorithms by how hard they are to comprehend.  On its face, my goal violates my desire to avoid subjective statements.  After all, who gets to decide the difficulty of comprehending an algorithm?  We don't even have a good definition of what comprehension itself *is*, let alone how to prove that someone actually has a comprehension of a given subject.  Actually, if I could define and prove comprehension, I wouldn't even bother with my "new kind of complexity".  I would just set up shop as a recruiter who could mathematically guarantee hires who actually know what they're doing.  And I would retire soon after, very wealthy indeed.

So, simply because of the elusive nature of comprehension, I think my idea of complexity will be a bit loose around the joints.  However, while a completely formal definition may or may not be forthcoming (or even possible), I believe that I can argue convincingly that there are different cognitive complexity classes for algorithms.  With perhaps some fuzzy overlap between the classes, to make room for different types of human cognitive talent.  After all, there may be some people who really are better at solving a problem "the hard way" due to varying circumstances, anything from how their brain is wired to the way they were raised to solve problems.

It is very complex.  You are likely to be eaten by a grue.

Right now I've got four complexity classes in mind.  This almost definitely needs to be expanded and modified, but I think it's a good place to start.  The way that I'm classifying these classes is by describing the "shape" of the algorithms.  

The simplest class is the Scalar class.  Followed by the List class.  Then the Tree class.  And finally the most complex is the Graph class.  

Like I said, I don't think these categories are absolute, final, or complete.  I think it's just the starting point.  

An algorithm that has Scalar complexity can be easily imagined as a function that takes a scalar as input and returns a scalar as output.  So logical-and is a good starting place.  Take two booleans and return a third boolean.  The structure of the inputs is atomic.  There's not that much going on here.  It's also easy to see where things get fuzzy.  A function that takes an array and then reallocates the memory of that array in order to expand the total size is probably Scalar complex.  But a function that takes an array and then maps a function over each value in that array, creating a new array with new values, should probably be in the List complexity class.

The Map (or Select, depending on your preferred terminology) function that I just described in my previous paragraph may seem like a Scalar complex function.  After all, you take a completely opaque function, apply it to every element in a mostly opaque array, and finally return a newly constructed and mostly opaque array.  The structure of the list doesn't really matter *that* much.  So part of this is the unfortunately fuzzy nature of comprehension.  The Map function seems pretty easy to someone who's had a lot of experience with functional languages.  However, there are two things to consider.  The first is that if you look at the "shape" (admittedly a vaguely defined term) of the algorithm, you'll see that it's pretty easy to convince yourself that it has a "Listy" shape … or at the very least it's more listy than a logical-and function.  The second is that incorporating Map into your code in a functional language is very easy to imagine.  It yields clean and composable constructs.  However, attaining Map-like functionality in an imperative language like C is a very different story.  A major component of my motivation for this cognitive complexity theory is that it should help explain what the difference between programming languages actually is.  List complexity isn't a big deal in functional programming languages with higher order functions, but in C-like languages you need to be more careful with it.  I'll talk more about language differences later.
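Here's a sketch of that language difference (my own example, with Python standing in for both camps): with a higher order Map the list shaped work is a single composable expression, while in a C-like style the same work turns into allocation, an index variable, and a loop body that you have to read carefully before you can see that it really is "just a map".

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

readings = [0, 20, 37, 100]

# Functional style: the shape of the work is stated directly.
fahrenheit = list(map(celsius_to_fahrenheit, readings))

# C-like style: the same map, spelled out by hand.
fahrenheit_manual = [None] * len(readings)
for i in range(len(readings)):
    fahrenheit_manual[i] = celsius_to_fahrenheit(readings[i])

print(fahrenheit)          # [32.0, 68.0, 98.6, 212.0]
print(fahrenheit_manual)   # [32.0, 68.0, 98.6, 212.0]
```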

So, I've already sort of covered List complexity.  But there is some more that needs to be dealt with.  Functions like Map, Fold, and Unfold all seem to belong to a class of functions that have a list-like "shape".  However, things unfortunately get fuzzy again when you consider lists of lists of scalars.  In some situations you may consider a list of lists to be a tree structure, and in other situations the inner lists could be considered to be atomic in nature.  It really depends on the type of algorithm and how comfortable you are with these types of operations.  Although, the more operations you need to perform on the inner lists, the easier it is going to be to argue that the shape of the algorithm is a tree.

Tree complexity.  The most obvious Tree complex algorithm that I can think of is the Permutation function.  A function which returns all of the permutations of a list is very naturally described in a recursive manner.  Tracing out the function calls shows what is very obviously a tree.  O(n log(n)) sort methods also have a very tree-like shape to them.  After all, several of these types of sort methods specifically exploit the nature of trees in order to get their time complexity.  Again, you can start to make arguments based on different non-recursive implementations of these algorithms, but I noticed something interesting recently concerning Levenshtein distance.
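Here's a minimal recursive Permutation sketch (my own) that makes the tree shape visible: each call fans out into one recursive call per element it picks, so the call trace over time is a tree with n! leaves.

```python
def permutations(xs):
    if not xs:
        return [[]]
    result = []
    for i, x in enumerate(xs):
        rest = xs[:i] + xs[i+1:]            # remove the chosen element
        for perm in permutations(rest):     # one subtree per choice
            result.append([x] + perm)
    return result

print(permutations([1, 2, 3]))
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
```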

The definition of Levenshtein distance[1] is about as recursive as you can get.  Each node of the call graph makes multiple calls to itself.  The resulting "structure" over time is a massive tree.  However, implementing the Levenshtein distance formula this way is overly wasteful.  The way that the function is normally implemented is by constructing a table and then finding the minimal path through this table (actually, until recently this was the only algorithm I knew about for Levenshtein distance).  And there's even an optimization which only requires two rows of the table.  Now, these alternative implementations don't necessarily seem tree shaped.  However, I would like to argue that starting with a recursive definition and then transforming it into a non-recursive definition is akin to converting a recurrence into a summation.  So my hope is that it can be argued that the recursive form of algorithms like Levenshtein distance is the "more true" form for purposes of classifying cognitive complexity.  At the very least I think it won't be too hard to argue that the non-recursive form of Levenshtein distance is more *complex* than your typical listy algorithm.
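To make the contrast concrete, here are the two versions side by side (my own code, not taken from any particular source): the recursive definition wears its tree shape openly, while the standard two-row table version is the same computation flattened into lists and a loop.

```python
def lev_recursive(a, b):
    """Direct transcription of the recurrence: exponential, but tree-shaped."""
    if not a:
        return len(b)
    if not b:
        return len(a)
    cost = 0 if a[0] == b[0] else 1
    return min(lev_recursive(a[1:], b) + 1,          # delete from a
               lev_recursive(a, b[1:]) + 1,          # insert into a
               lev_recursive(a[1:], b[1:]) + cost)   # substitute / keep

def lev_two_rows(a, b):
    """The usual optimization: keep only the previous and current rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution
        prev = curr
    return prev[len(b)]

print(lev_recursive("kitten", "sitting"))   # 3
print(lev_two_rows("kitten", "sitting"))    # 3
```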

Finally, there is Graph complexity.  I haven't gotten much opportunity to play around with algorithms that sit squarely in the Graph class, so I'm not really going to say too much here.  Parsing certain types of grammars seems like it needs some sort of graph shape or structure in order to function correctly.  Additionally, Simon Peyton Jones published an interesting book back in 1987[2].  In it he describes a G machine (a graph reduction machine) that can be used to implement the lambda calculus base for a functional programming language.  My naive thought is that if graph shaped algorithms are strong enough to implement the turing complete lambda calculus, then maybe we don't need anything more complex.  Perhaps there is a maximum amount of cognitive complexity we have to put up with before we can show that there has to be a simpler way.  I have a feeling that this idea will either lead the way to incredibly useful algorithm simplification tools or turn out to be hopelessly optimistic.
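A G machine won't fit in a blog snippet, but here's a much smaller sketch (my own) of what makes graph shaped work different from tree shaped work: once the structure can contain cycles, the algorithm has to carry around a visited set, bookkeeping that neither the list nor the tree versions needed.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from `start` in a directed graph."""
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue            # the cycle guard that trees never need
        visited.add(node)
        stack.extend(graph.get(node, []))
    return visited

# A small graph with a cycle: a -> b -> c -> a, plus c -> d.
graph = {'a': ['b'], 'b': ['c'], 'c': ['a', 'd'], 'd': []}
print(sorted(reachable(graph, 'a')))   # ['a', 'b', 'c', 'd']
```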

Fight a complexity G specimen without damage control, you're crazy.

A major goal I have here is to show that different programming languages are different in more than subjective anecdotes.  I think that I have at least highlighted one possible route to a different type of complexity.  A complexity that doesn't concern itself with time or space, but instead with comprehensibility.  And I have already mentioned briefly how we might use such a definition of complexity to differentiate programming languages.

Basic List complex algorithms don't seem like a very big deal when you have access to functions like Map and Fold.  Things begin to become more problematic when all you have access to are for and while loops.

Additionally, Map and Fold can be safely used in a dynamically typed language.  However, once you begin to embed lists inside of lists, the parametric polymorphism found in languages like Haskell, Ocaml, and C# really starts to pay off.

Finally, just because nearly all of the languages available to us today are turing complete doesn't mean that this is beneficial when the complexity class of an algorithm begins to increase.  In a language like C, where the main algorithm solution mechanisms of choice are loops and if statements, when Tree or Graph complex problems are encountered the solution will be some sort of state machine constructed in a while loop filled with continue statements.  We've seen with the Levenshtein distance algorithm that such solutions can be of high quality.  However, my fear is that many engineers will not even be aware of the existence of highly complex problems, or of their need to be carefully transformed in such a way that they can be correctly implemented.  Worse yet, if the only tools that these engineers have a mastery of are loops and if statements, the end result may be an ad-hoc and fatally incomplete state machine, even in more typical languages like Java and C#.  Suddenly, turing complete shifts meaning from incredibly useful property to enough rope to hang yourself.
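Here's a sketch of that flattening (my own example): the same tree complex problem, summing every value in a tree, written first with recursion and then the way it tends to come out when loops and if statements are the only tools, an explicit stack driving a while loop.  The second version works, but the tree shape is no longer visible in the code; it lives entirely in the stack discipline, which is exactly where the ad-hoc versions go wrong.

```python
def tree_sum_recursive(node):
    # node is a (value, [children]) pair
    value, children = node
    return value + sum(tree_sum_recursive(c) for c in children)

def tree_sum_flattened(root):
    total = 0
    stack = [root]              # the recursion, made explicit
    while stack:
        value, children = stack.pop()
        total += value
        stack.extend(children)
    return total

tree = (1, [(2, []), (3, [(4, []), (5, [])])])
print(tree_sum_recursive(tree))   # 15
print(tree_sum_flattened(tree))   # 15
```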

I believe that the goal of my efforts should be to identify and elaborate on different cognitive complexity classes.  Then I can identify which programming language constructs most lend themselves to solving these classes of problems.  This will provide guidance for future programming language advancements as well as educational efforts on the best way to use those advancements.  Developing this idea into a formal system would be icing on the cake, but I feel that, at the very least, convincing arguments can be made for the utility of cognitive complexity classes.



Friday, August 9, 2013

non-ubiquitously omniversal, the series

Looking around at our increasingly technical world with four years of formal computer science education and six years of industry experience as a software engineer, I am having some trouble watching old episodes of Star Trek.

They have access to incredibly powerful technologies.  They can travel faster than the speed of light.  Force fields, inertial dampeners, anti-gravity, FTL communication channels, transporters, who cares about the laws of physics?  Perhaps their understanding of physics is just much greater than our own.  They're not violating the rules of reality, they're just playing by a much better understanding of those rules.  However, if that's the case, then where's Facebook?

Seriously.  Well, perhaps you say that they're so advanced they no longer *have* Facebook, ha ha.  But then where are the IDevices™?  I'm not sure that the data pads you sometimes see people carrying around count either.  They're kind of clunky.  And you routinely see them piled up on desks.  Seriously dude, I think one pad plus some 30" monitors would do it.  Or what about this.  Junior officer announces a ship uncloaking.  Long pause.  A console explodes violently (maybe you shouldn't put a high power cable behind every monitor … seriously, you would think they'd fix that after the first time someone dies due to monitor explosion).  Raise the shields!  How about you just program the computer to raise the shields automatically whenever a ship de-cloaks.  Ta da, thank me later.  Everyone seems to have these passwords that are like three letters long … which are occasionally defeated by tape recorders.  There was that one episode where Data takes over the entire Enterprise because he can mimic Picard's voice.  Everyone just has to sit around and is completely helpless for the rest of the episode.  How about some PGP, guys?  And that's another thing.  It seems like there are way too many plot points where some simple crypto would have prevented everything.  Or there's too much crypto being broken at the drop of a hat.  But whenever the plot needs something to be secret they just duck into an empty room or piggyback data on a modulated inverse tachyon beam.  Like as long as no one's paying attention, you're secure.  Seriously, today I could buy 20 Raspberry Pis, glue a microphone on them, and *bam*, I can bug a large facility for under $1000.  And as far as covert subspace channels go, an introductory book on data mining and a small fleet of very small probes and no one's ever sending anything covertly ever again.

The crypto / covert dynamic is kind of weird though, right?  Like what reality is it where crypto is trivial to just crack every other day, but covert channels are the de facto way to get real work done?  Hmm.

Well, in a universe where you have nearly infinite computer processing power, a lot of crypto is going to be worthless.  I mean, crypto works because either you have some math problem that would take too long to solve the hard way or because it would take too long to brute force the key.  That all falls away with infinite processing power.  So crypto is worthless, but then why don't they have drones everywhere constantly monitoring the background radiation levels and automatically throwing the result into some sort of digital processing algorithm?  Heck, you could even data mine the signals against a giant database so you automatically adapt to new strategies.  That should get rid of any covert channels.  No sneaky stuff for that universe.  Maybe they're too busy to do it.  Like it would be too much work … although I could probably bang something out in python pretty quickly.  Like … maybe they don't have python.  All of that might be a huge pain in the neck if all you had was assembly … err … they do talk about subroutines an awful lot.  And no one ever mentions higher order functions or even modules.

It sounds suspiciously like no one in the Star Trek universe ever discovered lambda calculus.  Or any other higher level maths that would lead to the development of better software engineering techniques.  They still invented computers, but they're programming them in the most basic way possible.

Why wasn't the math we know about today discovered?  Well, it was probably because all of the mathematicians were busy with the discovery of whatever magical physics-defying stuff allows them to travel faster than light and generate gravity.  This is actually the premise of a short story called "The Road Not Taken"[1].  Basically, aliens invade Earth, but they do it with flintlock pistols and cannons.  Turns out that the secret to faster than light travel is really simple, but it's totally nonsensical.  So once a race discovers it, their science can't explain it well and proceeds to collapse.  Whatever technological level you were at when you discovered FTL is where your civilization stays.

So it looks like Philip Wadler's omniversal programming language isn't as ubiquitous as he hoped.  And I guess that sort of makes sense.  Who wants to compute lambda terms or figure out how to decrement natural numbers when you could play with infinite computation power and travel to the other side of the universe?  Heck, most people would be more than happy to just sit around in the holodeck.

Although, that also means that if a copy of SICP or the Lisp User Manual shows up in the Star Trek universe, then much havoc would be wrought.

Hm.  Kind of like the Death Note anime.  Only instead of bored Japanese death gods you would have the ghost of John McCarthy following around some sort of intelligent but fatally overidealistic and arrogant Starfleet cadet.

... that ... that actually sounds like it would make for some compelling tv.


[1] - http://en.wikipedia.org/wiki/The_Road_Not_Taken_%28short_story%29

Thursday, August 8, 2013

All worlds but one

There have been several vastly important revolutions in human knowledge in the last hundred years or so.  Global communication changed how we see the world.  DNA and the Human Genome Project changed how we see life.  General relativity changed how we see space and time.  But the one I'm interested in today is the one that most people don't even realize happened.  In the early 1900's there was a revolution in mathematics.  And it changed the way we see everything.  What started with the likes of Bertrand Russell, Gerhard Gentzen, and David Hilbert ended with the likes of Kurt Gödel, Alonzo Church, and Alan Turing.  Most people didn't notice a difference in the thoughts and methods of abstract, ivory tower academics.  And I doubt they would care even if they did.  But they do notice the IDevice™ in their pocket, the computer that does their taxes, and the massive outgrowth of automation that governs more of our daily lives by the minute.

And I don't quite mean that the mathematical revolution gave us computers.  Computers were on their way regardless of what mathematics was up to.  Mathematics gave us turing completeness and lambda calculus.  I also don't quite mean that our current technological wonderland is due to turing completeness and lambda calculus.  Most of the developers responsible for our current lifestyle are only vaguely aware that such things exist.  These developers learn the bare minimum and then plow relentlessly ahead, following their imaginations.  However, imagination will only get you so far.  Eventually, overly ambitious plans must cede to the limits of human comprehension in the face of overwhelming complexity.  Mathematics, however, provides avenues to tame this complexity.  And it has been sinisterly added to even the simplest developer's toolbox.  Why, the honest and practical minded developers of today would be shocked to find that they have been making use of monads and theorem provers for years.  But they have.  And our increasingly technical world would not exist if this wasn't the case.

Philip Wadler is a mathematician who has a certain interest in lambda calculus and computation.  He came up with a marvelous phrase … lambda calculus is the omniversal programming language.  You see, it might be said that lambda calculus is a universal programming language.  Being mathematics, any civilization in the universe could independently come to the same conclusions that Alonzo Church did in the 1930's.  But it goes further than that.  Because it's mathematics, it should be able to also exist in other universes.  So it is truly omniversal.


Of course, personally, I have some reservations about an omniversal programming language.  While I can't imagine the nature of a universe where mathematics does not function, I can imagine that such a place might exist.  Lambda calculus itself argues against its status as omniversal.  In lambda calculus, things are not allowed to come from nowhere.  There *are* free variables, but they are really only free in a local, constrained context.  There must be some larger scope which introduces the free variable, or else it is an absurd entity without any meaning.  If in the world of lambda calculus everything must come from somewhere, then where did lambda calculus come from?  Surely it cannot exist by its own nature, for its nature says that this cannot happen.  So it points to a world outside of itself; a world where it does not care to venture.