Feh

Bellevue Square is going to expand.

Betcha they still don’t have a decent bookstore by 2010.  (I suppose I’d rather have the U of WA branch bookstore survive nearby, though, even if it’s a shadow of the parent store).

 

C is for Cuneiform

We’re still punching holes in things to make programs.  Yes, we are.  All these fancy editors and we’re still baking code in clay.  Most code is syntax on wax tablets, dead bits about as accessible to your average program as hieroglyphics sunk under the sand.  Such a waste; it doesn’t have to be this way.

Steve Yegge has an interesting (if long) bit on what the Next Big Language is going to be. Go ahead and read it (don’t miss the first few comments). I’ll wait.

He’s absolutely correct about both the difficulty of parsing a “modern” language, and the requirement that any Next Big Language is going to have to support more or less the same syntax because people are used to it. But parsing C is hard, and parsing C++ is hell-on-roller-skates.  That’s why people pay through the nose for competent tools that work with these languages; the barrier to get any kind of meta-tool working at the semantic level (what I mean is: above the level of grep) is really high. When you add vendor-specific extensions (e.g., “The Frobozz Quality Suite works great for Linux, the Mac and Windows, but we need it to run on stuff written for the Xbox-360 and the Playstation 3”) you’re basically just doomed.

My opinion: The Next Big Language needs to be essentially a DOM. Call it reflection that you can use to write code with, call it a parse tree with semantic contracts and a run-time, call it whatever you want, but having programs easily able to read and write other programs is just too powerful a paradigm to keep ignoring.
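To make "programs reading and writing other programs" concrete, here is a minimal sketch using Python's standard `ast` module (my choice of illustration, not something the argument depends on; `ast.unparse` needs Python 3.9 or later). The helper names `function_names` and `rename_function` are made up for this example, and the rename is deliberately naive: it renames every identifier with the matching name, with no scope analysis.

```python
import ast


def function_names(source):
    """Read a program: list the names of the functions it defines."""
    tree = ast.parse(source)
    return [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]


class _Rename(ast.NodeTransformer):
    """Rename a function definition and every plain reference to it.

    Naive on purpose: any identifier spelled `old` is renamed, even if it
    is an unrelated local variable. A real tool would resolve scopes.
    """

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # rewrite the body first
        if node.name == self.old:
            node.name = self.new
        return node

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node


def rename_function(source, old, new):
    """Write a program: parse, edit the tree, emit source text again."""
    tree = _Rename(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


if __name__ == "__main__":
    src = "def area(w, h):\n    return w * h\n\nprint(area(2, 3))\n"
    print(function_names(src))
    print(rename_function(src, "area", "rect_area"))
```

The point is that the edit happens on the tree, above the level of grep: no regular expression has to guess where a function name starts or ends.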

I hardly care if the language is LALR(N), reverse polish, prefix, Unicode-based or has continuation characters in column 6. As long as I have an editor that can speak to a capable abstraction that in turn *writes* in that language, I honestly don’t give a damn what the language looks like; I’ll make it look like whatever I’m comfortable with at the moment.  Why should there be only one way to look at an entire hunk of source code?
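A toy version of "one abstraction, many views", sketched in Python. The `Num` and `BinOp` node types and the three renderer functions are invented for this illustration; a real language would need far more node kinds, but the shape of the idea is the same: the tree is the program, and infix, prefix and reverse polish are just display preferences.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]


def infix(e):
    """C-style view, fully parenthesized."""
    if isinstance(e, Num):
        return str(e.value)
    return f"({infix(e.left)} {e.op} {infix(e.right)})"


def prefix(e):
    """LISP-style view."""
    if isinstance(e, Num):
        return str(e.value)
    return f"({e.op} {prefix(e.left)} {prefix(e.right)})"


def postfix(e):
    """Reverse polish view."""
    if isinstance(e, Num):
        return str(e.value)
    return f"{postfix(e.left)} {postfix(e.right)} {e.op}"


if __name__ == "__main__":
    tree = BinOp("*", BinOp("+", Num(1), Num(2)), Num(3))
    for view in (infix, prefix, postfix):
        print(view(tree))
```

All three renderings come from the same tree, so switching views never changes what the program means.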

So when we standardize the print-rep of a language, what are we really doing? Simply making it easier to publish textbooks about the language? Making it so that we can still write programs with strips of paper and an awl? Why do we bother with a text-rep of a language any more?  Let the audience decide where to put the semicolons for a change.  Let’s get out of the Emacs and vi ghettos and actually use these bitmap displays for something other than fancy keypunching.  (Remember, however, the cautionary tale of the UCSD Pascal editor of the early 80s, which forced you into syntax and basically sucked dead exploding goats).

I’m glossing over some important questions, namely all the semantics (though if you stray too far from C you’re in trouble on bare-metal platforms and other places where cycles matter), and what the heck we do with compilation units (which are, after all, a reflection of compiler technology, the miserable state of our disk I/O bandwidth, and our seeming inability to make efficient use of main memory that is running $100 a gigabyte these days, about an hour’s salary for a good engineer).  What if the floor for a dev box was 64G of RAM and $5000 worth of really good hard disk array?  (Or better yet, a decent build system in the basement?)

More nattering later…

[Yes, C# has a not very nice code DOM, maybe improved since the last time I looked.  LISP has had something like this forever, as long as you like LISP.  With the JVM and CLR you can pick your own language rep as long as you like assembly language, which is not very interesting.]

froot bar gag

A TV news anchor walks into a bar with a basket full of fruit, some glue and a wad of wool. He sits down, orders a drink, and starts to attach little tufts of wool to the tops and little feet to the bottoms of his apples, oranges, pears and lemons.

The bartender is curious. “What’re you doing?”

The anchor says, a little embarrassed, “It’s for my son’s school project. He needs a crowd scene for his diorama, and I thought that I could liven it up a bit by making people out of something handy. You know, feet, eyes, mouths, some good hair on these bald little guys…”

The bartender has a flash of recognition. “Hey, aren’t you the guy on the broadcast –”

The anchor says, “No, right now I’m a pear toupeer.”

FORTH

I only know two good FORTH programmers.

One is Chuck Moore.  I met him once (he was doing chip design for Toshiba or something at the time).  Interesting, smart guy.

The other one wrote a bunch of Finnigan’s control software (for their line of mass spectrometers).  He was pragmatic about the language and its limitations.

Every single other use of FORTH that I have seen has been a disaster.  The unfortunate fact is that the people attracted to this language seem to be the ones least able to write good solid code in it.  Let’s see: Several disastrous attempts at video games at Atari, some early “hardware bringup” utilities (that didn’t catch some low-level timing problems because FORTH is nowhere close to native speed, despite what you might hear to the contrary), some other projects that I have mercifully forgotten.

With that said, I think that everyone should write a variant of FORTH at some point in their career; it’s simple to implement, and has some interesting and quite practical ideas for working in a primitive environment (e.g., “pages” instead of a complex file system, a ubiquitous built-in assembler, a scoping system that at first seems too simple to possibly work).  It’s a slick system as long as you don’t try to misapply it.  Writing your own version is an easy way to get it out of your system.
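To show how little it takes, here is a rough sketch of a FORTH-flavored toy in Python. It is not a real FORTH: `make_forth` is a name made up for this example, there is no assembler, no "pages", no control flow, and colon definitions are stored as token lists and looked up when they run, where a real FORTH compiles them at definition time. It only demonstrates the core loop of a data stack plus a dictionary of words.

```python
def make_forth():
    """Return an interpret(text) function with its own stack and dictionary."""
    stack = []
    words = {}  # user-defined words: name -> list of tokens

    def binary(fn):
        def run():
            b = stack.pop()  # top of stack is the right-hand operand
            a = stack.pop()
            stack.append(fn(a, b))
        return run

    primitives = {
        "+": binary(lambda a, b: a + b),
        "-": binary(lambda a, b: a - b),
        "*": binary(lambda a, b: a * b),
        "dup": lambda: stack.append(stack[-1]),
        "drop": lambda: stack.pop(),
        "swap": lambda: stack.extend([stack.pop(), stack.pop()]),
    }

    def execute(token):
        token = token.lower()
        if token in words:  # user words shadow primitives
            for t in words[token]:
                execute(t)
        elif token in primitives:
            primitives[token]()
        else:
            stack.append(int(token))  # anything else must be a number

    def interpret(text):
        tokens = text.split()
        i = 0
        while i < len(tokens):
            if tokens[i] == ":":  # colon definition: ": name body ;"
                end = tokens.index(";", i)
                words[tokens[i + 1].lower()] = tokens[i + 2:end]
                i = end + 1
            else:
                execute(tokens[i])
                i += 1
        return list(stack)  # snapshot of the data stack

    return interpret


if __name__ == "__main__":
    forth = make_forth()
    forth(": square dup * ;")
    print(forth("3 square 4 square +"))
```

About fifty lines gets you a stack, a dictionary and user-defined words, which is roughly the part of the language that is worth getting out of your system.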

(I really will have a response to the “Why are we still using C” question.  You’re not going to like it 🙂 )

 

Why do we still use C?

[slightly edited from a private flame yesterday]

If you have ACM access to the PLOS 2006 proceedings, I recommend Jonathan Shapiro’s paper _Programming Language Challenges in Systems Codes_ (Why Systems Programmers Still Use C, and What To Do About It). It doesn’t appear to be reprinted anywhere else, so I’ll snip a bit of it:

Systems Programming . . . [is] fundamentally about engineering rather than programming languages.  In the 1980s . . . engineering considerations were required criteria for language and compiler design.  By the time I left the PL community in 1990, respect for engineering and pragmatics was fast fading, and today it is all but gone.  The concrete syntax of Standard ML and Haskell are every bit as bad as C++.  It is a curious measure of the programming language community that nobody cares. 

He gives some examples, notably a TCP stack in SML that, while a notable achievement, is at best 11 times slower than the native stack on the same architecture (and worst case performance is unbelievably miserable).  He makes a point that going from a factor of 1.5 to 2 in performance is not negligible when this translates directly to the number of servers that you have to buy, power, cool and maintain.

Frankly, I’m sick of C (and the safe subsets of C++ that are basically sugared C), but it’s the only damned tool around that will do the job.  Pick another language that you can reasonably implement an OS in.  What are the choices?  Oberon, Modula, extended Pascal . . . Ada, BLISS?  These are all essentially the same animal with their semicolons in different places.  Java, C#?  No resources on the platform for these.  LISP?  Worse and worse.

On something new: I don’t have time to get a PhD in type systems.  I need a debugger that works.  I still need to write assembly language and really worry about the number of cycles involved in some operations.  But I’m tired of writing bugs and races and want something lots *safer* than C, and everything out there now is mired in a decade of theory that doesn’t help systems programming at all; even if I spent the time to learn and move in with it, it wouldn’t help.

Shapiro’s solution (BitC) is interesting.  I won’t go into it here.

Someone I read recently had the idea of putting the whitespace and formatting into the grammar; like Python (say), but with no semantics (e.g., no indentation-equals-a-scope), only checking.  At least you avoid “one-true-brace-style” wars that way.  But this is the least of our problems.

Maybe if we took away these PL theorists’ Emacs and LaTeX packages for a while we’d get better results.  Threaten to take away their fast TCP stacks and graphical interfaces.  It’s time for the programming language community to do something for the systems programming people, something that actually works well on bare metal.