Reading Building Robust Systems, by Gerald Sussman. I'm not done yet, but so far (page 7) the paper is pretty vague and metaphorical. So I thought I'd scribble down some of the slightly-more-concrete thoughts that it triggers in my head:
More...Here's a very interesting paper, Extracting Queries by Static Analysis of Transparent Persistence, by Ben Wiedermann and William Cook. The idea is that instead of constructing a SQL query and then iterating over the results, you just write a program as if all the records in the database are available, and the language automatically figures out what (efficient) SQL query to construct. If-statements are automatically turned into where clauses, and reference traversal is automatically turned into a relational join. Unfortunately the paper doesn't cover the more interesting cases of aggregation or grouping.
More...Here's a simple Haskell JPEG decoder. The cool part about it is that the web page that describes the program is the actual source code -- it's a literate haskell program that happens to also be a web page. The program doesn't handle color images, so it's more of an executable tutorial than an actual library for handling jpegs, but it's still pretty cool. I love the fact that the program's "comments" have embedded images.
Here is a wonderful post by Andrew Pimlott, which introduces monads as monsters. Very appropriate for Halloween. Snippet:
(>>=) should be read as the monster on the left expelling values through its rows of teeth and over its tongue at the function on the right. (Its pronunciation is a sort of bestial hissing that is difficult to describe; when you say it correctly, the terminal may become slightly moist.)
There's a guy you may have heard of, named Shriram Krishnamurthi. He's the author of the classic presentation at LL1, The Swine Before Perl. He has also written a free online textbook, Programming Languages: Application and Interpretation, plus scores of papers on programming languages, security, and verification. He's the senior advisor for FrTime, a functional reactive programming language in Scheme, and Flapjax which is the same thing for Javascript. He also happens to be my advisor at Brown.
He now has a blog. You may want to check it out.
The Flapjax language has just been publically released. This is very exciting, for many reasons.
First, Flapjax is globally persistent. The language comes with built-in server-side storage, and you don't even have to provide the server. User-based ownership is also built in, and data can be shared between users. So you can easily build little web services without having to deal with all the issues surrounding user authentication, email confirmation, secure data sharing between users, building a server farm, etc.
Second, Flapjax is reactive, which means it has a cool dataflow-like, spreadsheet-like mechanism that makes it easy to work with dynamically-changing variables (called behaviors) without having to write a bunch of glue code to propagate changes. For example, you can display the current time, and the user interface will just automatically update whenever the time changes.
Third, Flapjax is web-based, so Flapjax programs are immediately relevant to the modern world. Flapjax grew out of a project that was based on Scheme, and the Scheme part alone pretty much doomed that previous project to irrelevance.
Fourth, Flapjax is built by the same group that I've been working with for the last year or so at Brown, so it's exciting to me personally. I'm currently writing a paper on how to do static optimization for the previous Scheme-based version, and I'm glad to see the ideas live on in a new, more accessible incarnation.
So now you should go read through the Flapjax tutorial. It's entertaining, englightening, and if you're anything like me, it may rekindle your spark of curiosity and excitement.
My computational biology class requires us to use Mathematica. I'm not very familiar with it yet, but I've discovered a couple basic design flaws that make me extremely uneasy using it.
More...I've spent the last couple weeks trying to what should have been a simple task, and I feel the need to publicly whine and complain about how hard it has turned out to be.
All I want is to fully expand a mzscheme module definition and then do some syntactic rewriting on it. I expected to be able to write something like this:
(define optimize (λ (code) (rewrite (expand code)))
Oh, how naive I was...
More...I think it would be an interesting project to make a language that refuses to run any code that isn't justified by tests. This would basically enforce test-driven development. What I'm thinking of is something like Jester, but built into the compiler (or interpreter).
More...If you're a CS person and you've ever wondered why formal logic matters to you, here are two answers that I've found so far.
More...In visual representations of object oriented software designs, people traditionally draw subclass relations using an arrow that points from the subclass to the superclass. This seemed strange to me when I first learned it -- intuitively it seemed like the arrow should go the other direction. I remember reading a justification somewhere, based on the argument that subclasses "knew about" their superclass, but not vice versa. That argument always seemed weak to me.
However, I just realized that the direction of the arrow actually corresponds to implication in logic. In particular, the Liskov Substitutability Principle is exactly the same as Modus Ponens.
Cute.
Random thoughts about logic, computability, etc that I was pondering tonight while in the shower.
More...The other day I had a realization that suddenly made syntax-case seem a lot less magic.
Suppose you're defining a Point class, and you're trying to write the subtraction operator. What type should it return? Another Point? Or should you invent a new class called Vector? Vectors and Points have exactly the same machine representation, so it seems kind of a waste to create two classes. And yet you have this vague feeling that Vectors and Points really are different, and it might be a good idea to maintain different types for different things. Who knows, maybe it'll catch a mistake in your math. On the other hand, maybe you'll end up having to convert between them all the time, and it'll just end up being annoying rather than helpful.
Here's another example that's a bit more clear cut. Suppose you're working with a version control system, and you want to represent both documents and the differences between documents. In theory both of these could be represented by strings (for example the output from the unix diff command is just plain text). It makes sense to subtract two documents to get a diff, and to add a diff to a document to get a new, patched document. But it makes no sense to add two documents, or to subtract a diff and a document. In this example, it's pretty obvious that you want to keep the types separate, even if the physical representation is the same.
If you've ever found yourself designing a program that has these kinds of concepts, you might want to read about torsors. They're a concept from math that generalizes the idea of things as opposed to differences between things. My favorite quote from that article:
An affine space is like a vector space that has forgotten its origin. A torsor is like a group that has forgotten its identity.
To me, the interesting thing about the concept of a torsor is that it makes it clear that there's a mathematical justification for why addition and subtraction (or multiplication and division) might not have the same types. I love stuff like this. It makes me want to write a Torsor type class in haskell, just because it feels right, even though I have no immediate use for it.
The curry-howard isomorphism gives you mapping between formal logic and programming language type systems. There are many different kinds of formal logic, all of which correspond to different type systems.
Logic is all about what you can prove, and so type systems based on logic are all about properties that you can prove about the programs that are typeable in that system. In particular, you're usually interested in 1) whether a program can be typed at all (i.e. whether it obeys the rules of the logic), and 2) what kinds of things you can say about typeable program (e.g. can it "go wrong" by, say, accessing a null pointer).
I'm wondering whether there's a similar isomorphism between something like number theory and type systems. Unlike logic, the point of number theory isn't so much about making sure you follow valid rules of inference. The point of number theory is more about encoding, and discovering, structure. As a simple example, I could imagine using multiplication of primes to represent cross products of types. If you did that, then what does it mean to say that factoring of primes is hard? Is that the same as saying that it's hard to automatically split a data structure into two orthogonal data structures with meaningful operations on them in isolation? In other words, that it's hard to come up with an automated meta-divide-and-conquer algorithm?
My question is whether there are useful insights from number theory that could be applied to programs. For example I just came across quadratic reciprocity. It's a truly strange theorem. If the components of the theorem (primes, squares, modular arithmetic, congruence) could be given a programming language interpretation, what would quadratic reciprocity mean in that interpretation?
This is the same kind of question as asking what gaussian elimination means in terms of a geometric interpretation.
It seems likely that a program interpretation of number theory wouldn't be like a type system at all. Maybe it would be about computable functions, or runtime complexity, or whether an algorithm can be automatically parallelized.
I only picked number theory for this example because 1) it seems concerned with structure and I like structure, and 2) there many many surprising results in it since it's been around for so long. I could equally well have picked differential equations, or topology.
I hacked together a javascript function to make it easy to write inference rules in HTML. For example, to generate this:
| a→b b→c | |
| HYP-SYL | |
| a→c |
I just have to write this:
Mike Matessa has a cache of the Dutil-Dumas that were sent into space by Encounter 2001, intended to be read by aliens. If you like patterns and puzzles, they make fascinating reading.

Via Shae Erisson
Gilad Bracha writes about the new java verifier that Sun is working on. Interesting pieces inclulde explicit type annotations for temporaries, and no more use of the jsr and ret instructions. Anyone who has tried to do a data flow analysis of java bytecode will especially appreciate the deprecation of jsr and ret.
I just found this great list of freely-available, online programming language theory books, compiled by Frank Atanassow.
Reading about dependent types led me to intuitionistic type theory, which led me to Heyting algebras, which led to point-free topology, which brought me back to higher-order functional programming.
I suppose it shouldn't be surprising that in a dependently typed world, the metatheory of a system starts looking a lot like the system itself, but it's still spooky. It's like seeing a series of small, incomplete glimpses of a colossal beast, and then one of those glimpses suddenly reveals that the monster is curling back and devouring itself. I assume there's a deeper intuition behind these connections, which I'm not quite grokking yet.
While scanning LtU for articles I had missed, I found Why Dependent Types Matter. I've come across Epigram before, and I was charmed by the idea of programming by choosing a problem-solving technique and then filling in the blanks. However, at that time they hadn't implemented general recursion yet, so my interest level dropped substantially and I decided to ignore the language until it had matured a bit more.
This newest Epigram paper, however, is very interesting. When I got to section 4.1, I had a "Wow!" moment as I finally understood what they meant by "the point of writing a proof in a strongly normalizing calculus is that you don't need to normalize it". This means that they can express statements about expressions and prove that they hold in all cases -- and then the compiler can just rely on the existence proof, without actually having to do any work at runtime.
For example, they prove the trivial rewrite rule "m+(1+n) = 1+(m+n)", via induction on the natural numbers. The proof is simple, but the really amazing part is that because they've expressed the proof in a way that the typechecker can understand, they can use it to rewrite expressions without having to pay any runtime cost. The truly elegant part of this is that the proof is done with almost exactly the same language as is used for regular computation. This really makes the Curry-Howard isomorphism start to feel concrete, to me.
For the weak of heart, they point out that these kinds of proofs are optional -- if you don't want to go through the work, you can just use less precise types, and then programming in Epigram shouldn't be much more difficult than programming in any other strongly-typed functional language.
In section 5, the paper suggests that dependent types can help nail down the specification of a program to the point where the type system will prevent you from replacing subexpressions of the same type (e.g. swapping the true- and false-branch of an if statement). This seems to me to be a static version of tools like Jester, which measures test coverage by searching for code mutations that don't trigger test suite failures. The drawback to tools like Jester is that you may have to run the entire test suite for every mutation -- and with an average of 1 mutation for every 15 lines of code, that can take a very long time. A static version of this seems quite interesting to me, even if it does require thinking hard about how to prove various properties. It's unfortunate, however, that such proofs requiring mucking with the original definition, and changing its type.
If only I wasn't also trying to get through Types and Programming Languages, while also trying to become more familiar with MzScheme and FrTime, while also working full time, I'd love to download Epigram and play around with it a bit. Maybe someday :) I suppose my next step will be to read more about GADTs.
I first played around with scheme a few years ago -- enough to get the hang of things like macros and continuations -- and it struck me as a very small, elegant language. But for the last couple weeks I've gotten deep into MzScheme, and I'm coming to realize that it's actually quite complicated.1
More...Here's a paper describing a very elegant view of graphs as an inductive (aka recursive) datatype.
More...Look what I found. The New England Programming Languages and Systems Symposium Series (NEPLS) has a whole bunch of interesting pointers to research projects in programming languages. They have three meetings a year, and the abstracts from the past 14 meetings are available online.
So, haskell uses monads to do IO. Monads are very confusing to most people. Arrows are a generalization of monads, so one would expect that arrows would be even more confusing. However, once I read the arrows paper, which explains how arrows can model stream processors, arrows suddenly seemed like the most natural thing in the world.
More...Recently at work, I've been part of a committee that is trying to figure out a strategic plan for implementing various features. The features are all big and hefty, with several possible implementation paths, and our goal is to try to avoid the kind of schizophrenic codebase that can result from solving every problem on a case by case basis, with no thought given to the possible simplifications and generalizations. Because this committee is thinking about features that may guide the company for the next year or two, I have to remain vague about what exactly the features are, but I wanted to try to share a lesson that I've been learning. Hopefully the vagueness won't make the lesson incomprehensible.
More...This paper describes a technique for creating composable transactions, based in Haskell. It's an extremely elegant approach, for reasons which are completely unrelated to the fact that it's written in Haskell. Haskell merely allows them to give a static guarantee (via the type system) that the transaction system isn't perverted by the inclusion of non-transactional code.
You may think that "composable memory transactions" are not something that you have ever felt a need for, and therefore this paper is irrelevant to your life. However what the paper is really about is synchronization primitives for multithreaded code -- a topic that I have to deal with all the time. It just so happens that their proposed synchronization primitive has lots of other uses as well.
The authors point out that blocking within a transaction is not composable (due to the potential for deadlocks when other threads wait for locks that the transaction has already aquired), and that if you attempt to avoid blocking by introducing guards on transactions, then that also prevents composability. So instead of allowing blocking, they introduce a "retry" function which aborts the transaction and then schedules it for later execution. As an optimization, they put off rescheduling the transaction until at least one of the variables referenced before the retry has changed. In this way, they have the effect of blocking the calling thread, without locking the transactional store.
In the context of Haskell, the authors simply make retry into a function which returns a special value, which takes priority over any other values it is composed with. This is relatively elegant in the context of Monads, however I think the same effect could achieved in an imperative language by making retry throw a RetryException -- so the technique is not limited to functional programming. On the other hand, imperative languages make it so easy to modify global state that it may probably be difficult to apply the technique reliably to such languages.
(Via Lambda the Ultimate)
I have to thank Eugene Wallingford for commenting on my blog, because he left a link that led back to his blog. He has a lot of interesting musings there, such as this one which was inspired by a keynote talk by Alan Kay:
Kay reminded us that what students learn first will have a huge effect on what they think, on how they think about computing... Teaching university students as we do today, we imprint in them that computing is about arcane syntax, data types, tricky little algorithms, and endless hours spent in front of a text editor and compiler. It's a wonder that anyone wants to learn computing.
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/355045
>>> ss = SpreadSheet() >>> ss['a1'] = '5' >>> ss['a2'] = 'a1*6' >>> ss['a3'] = 'a2*7' >>> ss['a3'] 210
Surana recently learned prolog, and discovered that it's great at problems involving search -- searching for compiler optimizations, for example. I had the same insight about Scheme a few years ago. I decided to prototype Ripple in Scheme. It turned out to be so easy that it popped my enthusiasm bubble, and I decided that the problem was simply that I had been using Java (where dataflow was hard) instead of Scheme (where it was easy). I can empathize with his comment that "Everything has already been done."
On a slightly different note, I was playing around the other night, using Prolog for a slightly different search problem. My problem related to finding potential runtime errors in code, using a technique called "abstract interpretation". I was only working on paper, sketching out various ideas. But after about fifteen minutes I found that the most natural way of expressing what I was thinking about was to write it down using Prolog's syntax and semantics. I suppose the sign of a sweet-spot language is when you naturally fall into using it for writing pseudocode. Considering how often I find myself aching to redesign the semantics of various languages, the fact that I felt perfectly comfortable using Prolog's goal-directed backtracking model is really saying something.
Andrew Cook posted a link to Bruce Axtens' list of unknown programming languages on LtU. What makes this list really useful is that it gives a brief description of what makes each language special.
More...This made me stop and gasp:
Now, even a cursory inspection will show that the function call and return continuation invocation are the same thing. What does this mean? Well, of course it means you can do MMD on function returns as well as on function calls.
Just think of the possibilities! Not that any possibilities are occuring to me at the moment... but just think anyway!
I spent a little time playing around with operator overloading in Python, to create a Python module for doing dataflow programming. I got to the point where I can write code like this:
x = rock(1) fact = factorial(x) print fact.get() x.set(6) print fact.get()
However because python makes it much harder to work with statements than with expressions, the definition of factorial() is somewhat tortured:
one = rock(1)
two = rock(2)
def factorial(n):
return _if(n < two, one, n*rock(factorial)(n-one))
Basically I had to turn the if-statement into an expression. And since expressions are always evaluated eagerly, I also had to wrap the recursive function call in order to make it lazily evaluated. It doesn't bother me that the code is sprinkled with calls to rock() to lift all the constants into dataflow-land, but it does bother me that the if statement got so mutilated. After all, part of the appeal of Python is its clear, indentation-based layout. And expressionification destroys that.
In fact, expressionification gets even worse if you're trying to use an iterative version of factorial. I still haven't figured out how to make a dataflow-capable version of this:
def factorial(n):
result = 1
while (n > 1):
result *= n
n -= 1
return result
The problem, in essence, is that dataflow is based on creative interpretation of expressions. And I can't figure out how to turn statements into expressions, except by using full-blown functions. But it seems that full-blown functions can't have any mutable state. For example, this code gives the error "local variable 'n' referenced before assignment":
def foo(n):
def bar():
n += 1
print n
return bar
foo(1)()
So it appears that I'll have to package up all the state into a struct, and pass it in to the function definition. But that requires so much code that I expect it will completely obscure the intent of the function.
An effort has begun to try to get the JVM to support continuations. If you don't know why that would be a good thing, read this. Apparently it wouldn't be too difficult to make Sun's JVM support them.
Lazy K is one of those esoteric languages that's meant only as a theoretical exercise, or cruel joke, depending on your perspective. It tickles my fancy for a few reasons:
First, it has multiple syntaxes: lisp-style s-expressions, Jot style binary, and Unlambda style backticks. Each syntax looks different. It's like having a whole new language for every day of the week (except monday and weekends)!
Second, my name starts with K.
Third, it's lazy. :)
Fourth, John Tromp, the author, has a delicately balanced sense of humor: "the identity function is a program which copies its input to its output. Since Lazy K's syntax is such that the empty program is the identity function, the behavior of the UNIX cat utility (without arguments) can be expressed at least as compactly in Lazy K as in any other language."
Fifth, it comes with a compiler so that you can actually write Lazy K programs without going insane.
Thanks to Bill Glover for pointing me at Lazy K.
MzVim is a project to embed MzScheme into vim. I use vim all the time, but sometimes I miss emacs' programmability. I'm not sure I'll ever get over the hump required actually install this thing, but I'm glad to know it's there.
(Via Ben)
Yesterday I went to a talk at Harvard, titled "Implementation and Evaluation of a Safe Runtime in Cyclone", by Matthew Fluet.
In this talk we outline the implementation of a simple Scheme interpreter and a copying garbage collector that manages the memory allocated by the interpreter. The entire system including the garbage collector is implemented in Cyclone, a safe dialect of C, which supports safe and explicit memory management...
Greg Morrisett, one of the designers of Cyclone, was also present and answered some of the more advanced questions about Cyclone itself.
Cyclone is a "safe" version of C, and it uses static analysis to catch memory leaks. The talk focused on how to write a garbage collector such that the implementation would be accepted by Cyclone's (strict) memory leak detection requirements. First I'll explain how memory leak detection is done in Cyclone, and then I'll explain why it's not trivial to implement a garbage collector using the scheme.
Cyclone's memory leak detection is based on regions. There are several different types of region, but the relevant kind for this talk was the "dynamic" kind. The main characteristics of dynamic regions are that you can allocate new objects into them incrementally, but they have to be freed all at once in one big batch. Perfect for a stop-and-copy garbage collector.
Cyclone restricts the operations you can perform on regions so it can guarantee that you don't leak regions, and that you don't dereference data within a region after it has been freed. This static analysis is based on the idea of capabilities (unforgeable, uncopyable unique handles). Briefly, all the data in a Cyclone program is statically tagged with the region it belongs to (the tag is part of the type, so "an integer in region 1" is different than "an integer in region 2"). You can't actually *use* the data in a region unless you present the requisite capability to the compiler (each region has its own capability). This is done syntactically, kind of like this:
present (region_capability) {
... data in the region can be accessed here ...
}
... but not here...
Note that I used my own syntax here. The actual syntax used by Cyclone seems less intuitive to me.
The Cyclone compiler makes it impossible to leak regions by making it illegal to let a capability go out of scope. It also prevents you from referencing previously-freed memory by revoking the capability for the region as soon as you free it. It's actually a pretty simple and straightforward scheme, so if it seems confusing it's probably because I've explained it poorly.
The garbage collector that Matthew Fluet implemented uses a simple stop-and-copy scheme. In other words it takes an old heap, traces through it to find memory still in use, copies that the memory to the new heap, and leaves a "forwarding pointer" in the old heap so that the garbage collector can handle cycles correctly.
It's the forwarding pointers that make things difficult. Because all datatypes have an embedded region name which specifies what region they belong to, it makes it difficult to come up with a type name for the forwarding pointers. Matthew had a very amusing slide that illustrated this. You end up with a type like this:
A pointer to a block in the region after this one, which has a pointer to a block in the region after the one after this one, which has a pointer to a block in the region after the one after the one after this one, which has a pointer to a block in the region after the one after the one after the one after this one, which has ...
To get around this, they introduced a new primitive into the Cyclone runtime, which lets you express this kind of recursion. There was some explanation about how they "used an existential to prime the pump", and some other feature which made some of the verbosity in the type signature become implicit... but I didn't really follow that part.
However what I did glean from the discussion was that Cyclone's straightforward memory leak detection algorithm doesn't lead to a small, well-defined interface. Case in point: in order to implement a garbage collector they had to extend the runtime. It was a small function (only 5 lines of code) but nevertheless it was not possible at the user level. In effect, they had to introduce a new proof rule into the type system. That's disturbing to me -- it means the type system is incomplete. And these are smart people, so if it wasn't complete to start with, that probably means it'll never be complete.
What's even more disturbing is that at the end of the talk, Norman Ramsey asked how they would implement a generational garbage collector. And the answer was, "We haven't figured out how to do that yet... it may require dependent types." Eeek! Suddenly I realized at a deep level that advanced type systems are equivalent to mathematical proofs, and that if you start trying to prove too much about such an unpredictable beast as a program, it will resist you with a vengence.
Just imagine trying to explain to a systems programmer that the reason they can't implement a "simple" generational GC scheme is because that would require that the type checker be extended to support dependent types. They'll look at you like you have three heads, and then they'll toss you and your silly little language into /dev/null.
Processing is "a programming language and environment built for the electronic arts and visual design communities. It was created to teach fundamentals of computer programming within a visual context and to serve as an electronic sketchbook."
Given an introduction like that, I expected it to have a much different feel than it does. I expected a domain-specific direct-manipulation model, and I got a thin procedural wrapper. For example, consider this RGB Cube demo. I'd expect the source code to look something like this:
More...
From Darius Bacon:
A is for APL, with ciphers arrayed.
B is for BASIC, for kids and for trade.
C was successful, from Dennis, of course.
D is dead Dylan, entombed in closed source.
E is the language that limits your trust.
F is for FORTRAN and decks under dust.
G is for Goedel, a language of logic.
H is for Haskell's expression of magic.
I is for Intercal made up in fun.
J is for Java left under the sun.
K is APL sans its strange faces.
λ's for Lisp and its fond round embraces.
M is ML so your proofs are all sound.
N is the order of growth we must bound.
O for Oberon's slim binary tersity.
P is for Perl's postmodern perversity.
Q was QBASIC in Microsoft's youth.
R stands for art we were taught by Don Knuth.
S is for Smalltalk and corporate spending.
T is for types and flamewars unending.
U is for UML's silly CASE tools.
V is for Verilog -- hardware by rules.
W is web-code, the big killer app.
XML's X, better known as "that crap".
Y is recursion so self-referential?
Z specifies all that's essential.
If you like alphabets and the arcane rules that go with them, you'll love this page: Examples of Complex Script Rendering. For example I had never heard of "vowel reordering and splitting" which is apparently common in Indic scripts.
I really love alphabets, for some reason. I especially like ones that modify the shape of their letters based on context, like this example from Tamil:

It's just so beautiful. Mongolian is another beautiful language. I love the contrast of vertical lines and diagonal slashes.
Russ Ross posted a question to the LL1 list, asking about any research people have done into allowing higher-level languages to use data structures from other higher-level languages in place. He points out that this is a strength of C -- you can take a pointer to the other language's memory, cast it to whatever type you want, and then go at it. But most high-level languages have their own requirements for how memory is laid out at runtime, and these requirements are almost always subtly different for each implementation.
Luke posted a rant summarizing why game programming is too hard. It echoes some of my thoughts about programming at the domain level (Put it in the Syntax), and visualizing runtime behavior (Visualizing Software Designs).
I used to be very interested in graphics programming, and I too ran into the problem of giving up on projects before I even began, simply because the overhead in getting anything working at all was too high. I didn't want to spend two weeks writing a program that only had a weekend worth of interest. My frustration with how difficult it is to write programs is what initially led me to my interest in programming language design.
Slashdot linked to a long interview with Robin Milner, creator of ML, and known for hindley-milner type inference.
There's an amusing post at Language Log which mentions that in some native american languages, the word for automobile is "it has wrinkled feet". Presumably this is inspired by the shape of a car's tires and/or the tracks those tires leave.
Here's an online text book (or is it a set of lecture notes?) from Shriram Krishnamurthi's Programming Languages class at Brown. It's quite readable. I'm currently reading through the section on Semantics and Types. If you found the Advanced Functional Programming Wiki useful, then you may like these as well.
I just came across this post about the lack of good error reporting in the PLT scheme webserver:
I’m a fan of the PLT Scheme web-server. (I better be or else find another job!) But I get an error message like:Servlet exception: "car: expects argument of type ; given #"And when you have 2000 lines of code across four files, not to mention that it might not have been you who called car directly, you start to think, “will I ever find this bug?”
Could I get a line number, or any additional information from the web-server? No. And the bug I filed was closed without resolution in a record 4 minutes.
Now I don't actually know the real reason why the PLT folks don't want to give a line number with their error messages, but I'm going shoot my mouth off and assume it has something to do with the fact that the exception object they're throwing simply doesn't have that information currently, and has nowhere to put it without changing the object type and thereby breaking existing code. In fact I suspect they're throwing a simple string, and not a structured data type.
In this interview, James Gosling claims that code would be more understandable if it wasn't restricted to plain-text ascii layout. For example, maybe you want to see your code as rich mathematical notation, or as rectangles that represent database tables.
I think this is a good idea as far as it goes, but it misses the real meat of what makes programs difficult to understand. The real problem is that programs are indirect -- they manipulate values that are known at runtime, but are not visible anywhere in the code because the code only shows the static parts of the program. For example, if I'm using a linked list, I won't see anything resembling a series of boxes connected by "next" pointers. At best, I'll see a class definition for a single node, which contains a single next pointer. The only place I'll see an actual list is in my imagination.
The Lightweight Languages 2003 conference happened on Saturday. Here's my take, presented in chronological order. Skip to the parts on bluespec and Father Time for the interesting parts.
I started reading Perl 6 Essentials last night. I got it mainly because Dan Sugalski is one of the authors, and because I like the idea of the Parrot VM.
I've been kinda sorta following the Perl Apocalypses, but it's much more convenient to have a book lay everything out all at the same time. The writing style of the book is also significantly more compact than Larry Wall's rambling, tangential style.
Anyway, I have some initial impressions to share.
On the LL1 list, it was claimed that strict functional languages make it impossible to cache computations without changing the interface. Alan Bowden replied with this bit of Haskell code that defines a memoized definition of fibonnaci numbers.
fib = ((map fib' [0 ..]) !!)
where
fib' 0 = 0
fib' 1 = 1
fib' n = fib (n - 1) + fib (n - 2)
This is a very interesting example of Haskell. It makes pleasant use of the mechanics of haskell in order to curry the !! operator so that it's invisible to the caller. But what I like most is that it manages to present a clean example of how to solve almost any dynamic programming problem in haskell, using only a few lines of code.
Unfortunately, while the code is very interesting, it doesn't really address the original problem of providing caching internal to the implementation of a function. In particular, it's not possible to extend this technique to allow cache eviction.
I want to see more languages that incorporate dataflow, the way spreadsheets do. That kind of execution model is perfect for user interfaces. I once spent some time thinking about a dataflow extension for Java, which I called Ripple. As I recall, Internet Explorer at one point was exploring spreadsheet-style continuously-updated functions for dynamic HTML. This was back around when Netscape came up with their ill-fated (and deservedly so) "layers" hack. I thought IE's data flow feature seemed like a great idea, but I think it died when the market showed no interest (probably because it was too "weird"). Unfortunately I have no idea what keywords to search for to find a reference to this.
Most UI-oriented languages these days seem to be based on events. But it's difficult to maintain invariants using events, unless you program in a stylized way where almost all events call a single "recalculate the entire UI" function. Dataflow would make invariants much easier to maintain, and would make the UI itself more efficient because only the parts that have changed would need to be recalculated.
More...This version of minesweeper written in Oz has a "digital assistant" that uses constraint propagation to help you play, by automatically uncovering squares that it can prove don't have mines, and marking mines where it can prove they exist. All in 600 lines of code.
Constraint programming is good reason to learn Oz. Actually, I'll probably learn Alice instead, because I prefer static typing. I really like Alice's explicit laziness with implicit forcing. Plus it has logic variables, dynamic loading, persistence, GUI support, and a repl (read-eval-print loop). I'm starting to get tingles in my stomach just thinking about it. I used to feel that way about Java, back around 1996 and 1997.
I've previously heard of Prolog, Parlog, and the Japanese Fifth Generation Project, but I never knew how exactly they were all related. This post over on LtU clarifies their inter-relationships, and how they developed into Oz. Seeing things in context like this makes me more interested in giving Oz another chance (I've tried to pick it up twice before, and failed to get hooked).
This kind of historical perspective is especially difficult to achieve unless you were paying attention to the field at the time. I'm glad Peter Van Roy shared it in a forum where I was paying attention.
More...Dan is building a forth interpreter on top of parrot. As one would expect with forth, it's apparently relatively simple and straightforward. I'm tempted to see how difficult it would be to create a joy interpreter...
Peter van Roy, who designed the distribution model for Mozart/Oz, is guest blogging over on Lambda the Ultimate. One of his first posts, Currency-Oreinted Programming, is very interesting, and concisely written. For example, he points out that "Map is a broadcast that collects results, and FoldL is the heart of a concurrent object with internal state (it accumulates an internal state from a stream of messages). "
I was also pleased to read his claim that for objects to be independent, they must be concurrent. He puts into words an intuition I've been developing about how lazy evaluation decreases coupling.
I don't have much to add, other than that you should read the article. I think it's the best of the three he's posted so far.
Monads for Functional Programming, by Philip Wadler, 1992.
After reading this paper, I think I finally have an intuition for what monads are. They're basically a fold over computations: unit is used to transform values, and bind is used to transform unary function application. The fact that a monad can only fold over unary application is, one would hope, not limiting because of currying (but I'm not sure). The monad laws can be thought of as restricting the fold to having semantics similar to function application (respectively: referential transparency, lambda abstraction, and composition).
Marc Hamman mentioned that category theory makes more sense once you've learned abstract algebra. Category theory is referred to frequently in the more theoretical aspects of static type systems, and so I'm interested in trying to understand it better. Therefore I'm learning about abstract algebra.
You may have heard the terms "ring" and "group". These are terms that come from abstract algebra. Here's a good introduction to the intuition behind rings, groups, and fields. After reading that introduction, I find myself strongly reminded of haskell's type classes. Type classes seem much more similar to abstract concepts like rings and groups than to OO-style polymorphism.
I'm still looking for a deeper discussion of abstract algebra and its uses. Maybe I'll end up getting a textbook or taking a class. Google found this online textbook, but it doesn't seem to have enough concrete motivations to get me to tackle it on my own. The "enrichment" section helps, but I'd prefer if the main text were written like the enrichment section.
There doesn't seem to be anything on abstract algebra in MIT's OpenCourseWare Mathematics section, either.
The Historic Documents in Computer Science and Engineering page has links to a lot of papers, plus a few patents. A majority of the links are related to programming languages.
Interesting tidbits: A 1957 German patent describes a technique for parsing arithmetic expressions using a push down stack. Also, pictures of flow-chart templates, for use with pen and paper: