I just came across this musing by Graydon Hoare:
I have an ongoing thesis at the moment which I'm exploring in the programming language literature, which is that programming language features do well (all other things being equal) when they eliminate either distant or dynamic state and replace it with either close or lexical state. the underlying point being that we may favour language features that facilitate copying and modifying small bits of code -- fragments which work in their new context -- as a fundamental programming activity.
Yeah, what he said! I'm now going to ramble through some less-than-half-baked ideas that have been going through my head recently. I hope you'll take what you can, and forgive the rest.
Just a couple days ago at work, inspired by the Objects Have Failed paper by Richard Gabriel, and the C History paper by Dennis Ritchie, I was blue-skying about how one could create a radically simpler programming language. I wrote the following on my whiteboard:
At work about a year ago, I designed a language for coordinating clusters of computers in production, which has been moderately successful. That language was designed to be obvious to non-programmers, and so it has no state, no computation, and only grudgingly allows indirection (although it's surprising how close you can get to simulating computation if all you're given is indirection).
Note that all data structures are a form of indirection, even if they're immutable, like in functional languages. As soon as the data structure forgets where it came from (i.e. where it was created), it has become indirect. In fact, I could go even further: as soon as an immediate, direct concept like a person gets abstracted into a non-immediate concept like a social security number, you've paid the price of indirection. Really indirection is just one form of abstraction, although it's an especially confusing one. Abstraction is confusing.
My blue-sky language had a couple other points on the whiteboard besides the "avoid"s above:
By hierarchy, I was thinking of things like XML and XPath. By associations, I was thinking of things like containment relationships (e.g. between objects and their members), bidirectional relationships (e.g. tables in a relational database), and meta relationships (e.g. RDF). I think you know what I meant by collections.
I think the "isomorphisms vs transforms" part might be incomprehensible without explanation. By "transforms", I meant things like XSL stylesheets which cannot be undone once they're applied, or taking data from a database and presenting it in a UI without without keeping track of where that data came from.
An isomorphism, in contrast to a transform, is reversible. If I have a UI presenting a list of things that came from a database table, then an isomorphism will preserve that relationship, so that when I click on an item in the list, I can immediately traverse back to the database row that the item came from. Since they're isomorphic, you can actually think of them as the "same thing", just with two different representations.
In most modern languages, if I want to traverse from a UI element back to a database element, I'm forced to store some kind of primary key along with the UI elements. If I'm lucky I can use a language that lets me store this extra info as an extra member variable on the UI element -- otherwise I need to create a parallel array to hold the extra info. But it would be more intuitive if I didn't have to forsake the immediately-obvious idea of a "record" and exchange it for the weak, abstract concept of a "record id".
[Aside: This ties in with my belief that an ideal language would not use plain integers to index into lists -- instead it would use a type that is clearly tied to the list that it is an index into. I hardly ever want to do arithmetic on list indices (other than incrementing them to get the next item). So treating list indices as plain old integers is as primitive and brutal as the way that C treats pointers like plain old integers.]
What's the big deal about using isomorphisms as opposed to transforms? I think they help you avoid indirection. And I think indirection is the root of all complexity.
It's interesting that programming-language afficionados, especially the elite ones (i.e. the ones that like functional languages) tend to love adding new features to their languages, but that they do this by increasing indirection and abstraction. Parameterized types, virtual methods, function pointers, closures, continuations, etc. All of these things make programs harder to understand. Yes, they add a lot of power, but I think we should focus on making programs more powerful without having to resort to more abstractions to do it. We should be gaining power by removing layers of abstraction, rather than by adding more layers to cope with the inadequacies of the existing layers.
Amusingly, a lot of programmers hate lisp-style macros because they think they make programs harder to read. I completely disagree. I think macros decrease the level of abstraction in a program. They make the syntax more directly-meaningful, closer to the problem domain. Which brings us back to the quote by Graydon I started this post with.
Posted on August 21, 2003 07:44 PM
More languages articles
I think there is no single person that has a clearer view of Lisp macros that Paul Graham. He says that Lisp macros *increase* the level of abstraction (which is a *good* thing).
Perhaps we disagree on the definition of "level of abstraction"?
Posted by: Alex Peake at August 23, 2003 04:07 PMI think we agree on the definition of abstraction, but we're coming at it from opposite directions. You're talking about abstracting away from the implementation details, which is a good thing. I'm talking about abstracting away from the problem domain, so that the the program works only with the platonic ideals of integers, strings, and arrays. Which is a bad thing.
Programmers are frequently encouraged to do both kinds of abstraction, without realizing that the direction matters. I think you should only use abstraction to get closer to the problem domain (the "real world").
Here's a map. Go left.
real world -- program -- language -- machine
Some people have known for a long time (since before I was born) that abstraction away from the problem domain is bad. These people advocate writing programs in a "bottom-up" style. Sometimes they go even further and advocate creating a domain-specific language, or "little language" first, so that you can write the actual program in a language that has non-abstract concepts for all important features of your domain. This is a common strategy in the lisp world.
Now consider macros. Macros let you create a bit of custom syntax and abstract away the implementation details of how it actually works -- macros help abstract away the machine, which is good. But if you don't have macros, you have to write your concept in terms of objects, pointers, data structures, etc. You're forced to translate the problem into the concepts provided by the programming language. This is bad.
When you think to yourself "I'll identify web pages by their URL, and I'll represent the URL as a string", you're abstracting away from the problem domain. Most programming languages encourage you to do this -- they force you to type a lot more code to express your concept directly.
When people talk about how important it is for a new language to have a large standard library, they're really saying that it's important to have non-abstract concepts built into the language for lots of real-world things. The java.io.File class is still abstract (it hides platform-specific differences), but it's a lot less abstract than just a string that holds a file name.
I could go on, and explain why I think that it's better to program closer to the problem domain, but as they say, "that's another story". And one which you probably already know.
I think we are essentially in agreement. We believe in abstracting the machine and even linguistic details and focussing on the problem domain.
However, your statement:
"But if you don't have macros, you have to write your concept in terms of objects, pointers, data structures, etc. You're forced to translate the problem into the concepts provided by the programming language."
I believe is stopping short of the possible. In an OOP world, for example, we use objects and methods to provide our *good* abstractions. Objects and methods are *abstract* if written well.
I don't understand what you mean by "avoid computation." This is after all, about writing programs that run on a computer, and a computer does "computations."
Unless I'm missing something.
And a lot of what you are talking about in this entry sounds very similar to what Forth programmers talk about when writing code. If you ever come across a book called "Thinking Forth" (by Leo Brodie) it would be well worth your time to read it. Even though I don't do any Forth programming, it was one of only two books I've read on programming that actually affected the way I write programs (and enforced my dislike of late binding, but that may be due to my thinking that execution speed is still paramount in programming).
Just when I think I'm about to make a pithy comment, somebody beats me to it. I once heard Forth described as "Defining words until the programmer can write a sentence that describes and is the program." But Forth, like Lisp is more an implementation language for the sort of DSL you're talking about. Blue-Sky sounds like a paradigm rather than any one, specific language. (When encountering a domain, create a programming language specifically suited to expression ideas and concrete "imperatives" about that domain--or something like that. I'm sure it's been said better elsewhere.)
I'm also vague on what you mean by "computation." Is that computed: jumps? relationships? values?
Posted by: Bill Glover at August 25, 2003 06:00 PMAlex, on the contrary, I think limiting yourself to objects and methods is stopping short of the possible. You may be used to them, and so they may seem intuitive, but they really aren't. They're shoes that you have to horn your program into, just like all other programming language abstractions.
I'm not clear what you mean by saying that objects are "abstract if written well". Do you mean that they limit the number of assumptions they make about the rest of the problem? Do you mean that they're reusable?
Sean and Bill, what I mean by avoiding computation is that you should avoid constructs where the behavior of the program is not clear statically. If you have to run the program in your head to figure out what it might do, then it's less clear than it could be. Virtual methods are one of the most common examples of this kind of thing, but it's not just limited to flow of control -- it also applies to values and data structures.
I think the comments about forth and lisp are pretty close to what I'm talking about -- they both have such lightweight syntaxes that they approach "putting it in the syntax" merely by virtue of leaving the syntax so free. Forth has an even more fluid syntax than lisp (ignoring reader macros). I don't have much forth experience, but I think after that recommendation I'll play with it some more.
Bill, you're right that at this point, this is more of a paradigm than a language. But it all began with thinking about what makes a program clean and intuitive, and then thinking about how a language can encourage that kind of program. I think a lot of programs tend to encourage the programmer to map their problem to the structures provided by the language. While I recognize that this is frequently the only way to solve the problem, the main goal that I was trying to achieve was to figure out how to keep from encouraging people to do that.
The trick (and it's a colossal trick), would be for the language designer to get rid of every construct that takes a programmer away from the direct, tangible parts of their problem. Syntax is one of the biggest distraction in most languages. But even when the syntax is flexible, nearly all languages still have an underlying execution model that you need to conform to (cons-cells and lambdas in lisp, stacks in forth, messages in smalltalk).
The concept of an execution model is an example of the kind of thinking that I was trying to get away from. If macros alleviate the problem of fixed syntax, then what would alleviate the problem of a standard execution model?
By the way, "Blue Sky" is a great name. Thanks.
Posted by: kim at August 25, 2003 09:39 PMAbstraction
-----------
"I'm talking about abstracting away from the problem domain,
so that the the program works only with the platonic ideals of integers, strings, and arrays."
Your definition of abstraction is an uncommon one (I've never seen it before).
It's so uncommon that most people seem to miss that it's in the opposite direction
to what they expect; see:
http://www.sauria.com/blog/2003/08/23#507
(where Ted Leung assumes that increasing the level of abstraction takes you closer to the problem domain.)
"Parameterized types, virtual methods, function pointers, closures, continuations
... make programs harder to understand. ... we should focus on making programs more powerful
without having to resort to more abstractions to do it.
We should be gaining power by removing layers of abstraction ..."
(This is quite confusing. Here you talk about things like parameterised types and virtual methods
as adding abstractions, which will take you further away from the machine (and further away from
the platonic ideals of integers, strings, and arrays).
I think that further away from the machine means closer to your problem domain.
This is in contrast to you discussing abstraction as taking you away from the problem domain,
which you do in this same paragraph.)
Anyway, I totally disagree. Higher-order functions (function pointers and closures),
parametric polymorphism (parameterised types), and inclusion polymorphism (virtual methods)
are useful abstraction mechanisms that allow us to express ideas
in code that is closer to the problem domain.
Closer, that is, than code for programming languages that lack these features.
Computation
----------
When you say no computation, what you mean is:
"... avoid constructs where the behavior of the program is not clear statically.
If you have to run the program in your head to figure out what it might do,
then it's less clear than it could be."
Can you give some examples of programs that are not clear statically,
and how you would change them so that they are clear statically?
Or, perhaps, some examples of programs that are clear statically, and do something useful.
Consider a parser, which builds an abstract syntax tree at runtime.
The tree it builds obviously depends on the parser's input,
and cannot be predicted statically.
How would you write a parser while "avoiding computation"?
I would like to see examples of the cluster coordination language you designed,
so I can get an idea of how a language "without computation" would look and would be used.
All I can imagine at present is that a language without computation would be like
specifying a static data-structure at compile-time.
Once it's created, you can't do anything with it.
Alistair: the only problem domain that I know of that involves "Higher-order functions, parametric polymorphism, and inclusion polymorphism" is compilers. If you are forced to use any of these things in order to express your problem, then your programming language is forcing you to express your problem in abstract terms. I am using the word abstract as it's defined in dictionaries:
You are correct in pointing out that code that uses these concepts can frequently be clearer and more direct than code that lacks these concepts. For example, see my recent post about how it would be nice to have asynchronous operations built into C# at the syntax level. But just because sophisticated abstractions are better than primitive abstractions doesn't mean that we should be satisfied with requiring the programmer to reconceptualize their problem in terms of functions, polymorphism, collections, etc. These concepts are really irrelevant to the problem at hand, except insofar as they are the best techniques that we currently have for describing an executable solution.
Introducing a new abstraction in order to move from CPU-level computation (integers, pointers) to a higher-level representation (HOFs, type systems) doesn't mean it's any less abstract with respect to the actual problem.
As for your question about avoiding computation, I think you took my statements too strongly. There are situations where you have to do computation. But I'll give you an example: it's easier for me to tell you to look at the page at http://www.kimbly.com/, than it is for me to describe to you the intricacies of the HTTP protocol. A URL is a simple noun, while a protocol is a long involved procedural description. When I said to "avoid computation" I didn't mean to ban it outright -- just to prefer techniques that are declarative and/or specific, rather than techniques that are procedural and/or indirect.
And for the record, I think non-side-effecting functional languages are still procedural and indirect -- often even more so than imperative languages.
The language I designed does in fact look very much like data structures described at compile time. It's halfway between a configuration file and a Python script. Here's an example. This describes an instantiation of our indexer, which takes some input files, indexes them, and writes some output files.
wine_dgidx : Dgidx
input = data/post_forge/wine
output = data/dgraph_input/wine
The output from the indexer is used by an online server, described thusly:
wine_dgraph : Dgraph
log_dir = logs/
input = data/run/wine
port = 8000
To restart a server, you use a Script:
runme : Script
wine_dgidx
if wine_dgraph.running
wine_dgraph.stop
rotate_data
wine_dgraph.start
There are other features to the language, such as single-assign variables, repetition, error handling, and an escape hatch for tasks that aren't built in (you can either fall back to using the command prompt, or you can embed Perl code).
When I was designing the language, I would have preferred to create it as a domain-specific language embedded in Scheme. But that wasn't an option.
Posted by: kim at August 28, 2003 01:46 PMI'm looking for an excuse... er evaluating the value of a DSL for one or more domains. This thread just came near some questions I've been wanting to ask.
"When I was designing the language, I would have preferred to create it as a domain-specific language embedded in Scheme."
I'd like to hear more about the language.
What did you use as an implementation language?
What was the problem you solved by going with a DSL over, say Perl scripts?
How did you implement the language: By hand? Flex/Bison? JavaCC?
How complex is the grammar (LL - LALR)?
Nosy questions so just say, "I can't tell you." if it's secret. But I would be interested in any more you could say. Maybe this deserves it's own post?
Posted by: Bill Glover at August 28, 2003 03:40 PM(Sorry the first post was so long. It's a real art to write clearly and succinctly, and I've a long way to go ...)
"the only problem domain that I know of ... is compilers. If you are forced to use any of these things ..."
Higher-order functions are an elegant way of creating domain-specific languages. Creating type-safe collections (trees, maps, and other homogenous data strcutures) is easier with parametric polymorphism. It's a shame I can't think of compelling uses for inclusion polymorphism (subtyping) but I'm sure they exist.
"Introducing a new abstraction in order to move from CPU-level computation (integers, pointers) to a higher-level representation (HOFs, type systems) doesn't mean it's any less abstract with respect to the actual problem."
Yes it does - that's exactly what it means. It's less abstract w.r.t. the problem domain because you're moving closer to the problem domain. You're able to model the problem domain in terms that are higher-level - less detailed - than before. That's almost always an advantage. For example, if you have a good type system then the types you define can model directly the concepts in your problem domain, and the compiler can check whether or not the things you're doing to them are sane or not.
In "avoiding computation" I didn't understand your URL example. You want me to look at a URL, so I use something (a library or browser) which gives me a simple interface with with to fetch the resource at the URL. Using this something (library, browser) means I don't need to know the underlying details (http protocol) about how the resource is fetched. That is simply abstraction.
And I still don't understand your definition of indirection; I don't understand why data structures are a form of indirection. And what is "the price of indirection"?
You're moving closer to the problem domain. You're able to model the problem domain in terms that are higher-level - less detailed - than before.
I think this is the crux of our miscommunication. You're pointing out that abstractions help you model the problem domain, but it's precisely the process of creating a model that I'm objecting to. I think we should strive to make modeling unnecessary.
I think you're coming at this from the perspective of someone writing reusable libraries -- from that perspective, you want to have as many abstractions as you need in order to model your domain succinctly and precisely. Meanwhile I'm coming at it from the perspective of the actual language user -- the code grunt working on J2EE projects with 200 other programmers, or the hacker who wants to slap together a script to connect his cell phone and his web site.
My gripe is that in languages like Java, library writers are forced to phrase their "model" in terms of classes -- there really are no other ways in which a solution can be expressed. Haskell isn't much better, forcing you to express your solution in terms of functions and types. What haskell does have going for it is extensible infix syntax, which allows you to separate the syntax from the semantic model. Plus its higher-order features let your semantic model be arbitrarily complex.
As for the example of the URL, I'll try again. I have so much to write about it, that I'll make it a whole post on its own.
Here's a paper titled "Usability Issues in the Design of Novice Programming Systems". One of the suggestions is "Support Direct Manipulation and Definition by Example", which seems very related to the theme of this post.
http://www-2.cs.cmu.edu/~pane/cmu-cs-96-132.html
Posted by: kim at November 11, 2003 06:48 PMFrom the creator of the "XL" language, a short blurb on "Concept Programming". I think he and I are aiming at the same idea. Specifically, he uses the word "abstract" in the same way as I do.
The point of concept programming is to highlight that abstractions are always destructive. There is always something from the concept that doesn't appear in its code representation. Understanding this, and trying to minimize the differences, is key to writing good code.
LtU has a thread discussing concept programming.
Posted by: kim at January 19, 2004 06:38 PMThat link to the LtU dicussion of concept programming has changed to:
http://lambda-the-ultimate.org/classic/message10694.html
Posted by: John Eikenberry at February 9, 2005 01:20 PM