Programming Language Design « Vincent Gable’s Blog

December 10, 2009

Being a Lisp is a Handicap

Filed under: Accessibility,Programming | Functional Programming, Lisp, Programming Language Design, Readability, Syntax
― Vincent Gable on December 10, 2009

Being a Lisp Is a Handicap

There are a large number of people who find Lisp code hard to read. I’m one of them. I’m fully prepared to admit that this is a shortcoming in myself not Lisp, but I think the shortcoming is widely shared.

Perhaps if I’d learned Lisp before plunging into the procedural mainstream, I wouldn’t have this problem — but it’s not clear the results of MIT’s decades-long experiment in doing so would support that hypothesis.

I think it’s worse than that. In school, we all learn
3 + 4 = 7 and then
sin(π/2) = 1
and then many of us speak languages with infix verbs. So Lisp is fighting uphill.

It also may be the case that there’s something about some human minds that has trouble with thinking about data list-at-a-time rather than item-at-a-time …
…

I think I really totally understand the value of being homoiconic, and the awesome power of macros, and the notion of the reader. I want to like Lisp; but I think readability is an insanely important characteristic in programming systems.

Practically speaking, this means that it’d be hard for me to go out there on Sun’s (or Oracle’s) behalf and tell them that the way to take the best advantage of modern many-core hardware is to start with S-Expressions before breakfast.

—Tim Bray (emphasis mine)

I’m afraid he’s on to something. We have an amazing ability to parse language. But people aren’t terribly good at building the kinds of stacks needed to parse LISP with their short-term memory.

This is the cheese that the rat that the cat that the dog that the neighbor owned bothered chased ate.

Say what?!

(This is the cheese (that the rat (that the cat (that the dog (that the neighbor owned) bothered) chased) ate)).

See the LISP connection?

All functional languages are fighting an uphill battle to be understood. The world we evolved in is stateful (modal) and imperative. We navigate it in a me-at-a-time way. Unfortunately, LISP’s prefix syntax is another, unnecessary, barrier.

The bottom line is that every word of code spends more time being read than written — so writing in a syntax that most people have a hard time reading is one of the worst programming choices imaginable. I believe functional programming languages are well worth learning; but I don’t believe it’s worth suffering a poor syntax.


October 20, 2009

JavaScript Nailed ||

Filed under: C++,Design,Programming,Usability | C++, Functional Programming, JavaScript, Look At My Stupid Factorial Function, Programming Language Design
― Vincent Gable on October 20, 2009

One thing about JavaScript I really like is that its ||, the Logical Or operator, is really a more general ‘Eval Until True’ operation. (If you have a better name for this operation, please leave a comment!) It’s the same kind of or operator used in Lisp. And I believe it’s the best choice for a language to use.

In C/C++, a || b is equivalent to,

  if a evaluates to a non-zero value:
    return true;
  if b evaluates to a non-zero value:
    return true;
  otherwise:
    return false;

Note that if a can be converted to true, then b is not evaluated. Importantly, in C/C++ the result of || is always a boolean value: an int that is 0 or 1 in C, and a bool in C++.
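
To see the C behavior concretely, here is a minimal sketch. The names noisy(), noisy_calls, and logical_or_demo() are made up for illustration; the counter exists only so that the short-circuit is observable.

```c
/* Hypothetical helper: counts its calls so we can tell whether it ran. */
static int noisy_calls = 0;

static int noisy(int value) {
    noisy_calls++;
    return value;
}

/* In C the result of || is the int 0 or 1, never the operand's own value. */
int logical_or_demo(int a, int b) {
    return a || noisy(b);   /* noisy(b) runs only when a is zero */
}
```

So logical_or_demo(5, 7) yields 1, not 5, and leaves noisy_calls untouched because the right-hand side is never evaluated.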

But the JavaScript || returns the value of the first operand that can be converted to true, or the value of the last operand if neither can,

  if a evaluates to a truthy value:
    return a;
  otherwise:
    return b;

Concise

JavaScript’s || is some sweet syntactic sugar.

We can write,

return playerName || "Player 1";

instead of,

return playerName ? playerName : "Player 1";

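And simplify assert-like code in a Perl-esque way. Because throw is a statement in JavaScript and cannot appear as an operand of ||, it has to be wrapped in a function (fail() here is a hypothetical helper that throws its argument),

x || fail("x was unexpectedly null!");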

It’s interesting that a more concise definition of || allows more concise code, even though intuitively we’d expect a more complex || to “do more work for us”.

General

Defining || to return values, not true/false, is much more useful for functional programming.

The short-circuit-evaluation is powerful enough to replace if-statements. For example, the familiar factorial function,

function factorial(n){
	if(n == 0) return 1;
	return n*factorial(n-1);
}

can be written in JavaScript using && and || expressions,

function factorial2(n){ return n * (n && factorial2(n-1)) || 1;}

Yes, I know this isn’t the clearest way to write a factorial, and it would still be an expression if it used ?:, but hopefully this gives you a sense of what short-circuiting operations can do.

Unlike ?:, the two-argument || intuitively generalizes to n arguments, equivalent to a1 || a2 || ... || an. This makes it even more useful for dealing with abstractions.

Logical operators that return values, instead of simply booleans, are more expressive and powerful, although at first they may not seem useful — especially coming from a language without them.


October 16, 2009

Hack: Counting Variadic Arguments in C

Filed under: Design,Programming,Research,Sample Code,Usability | C++, Introspection, Parsing, Programming Language Design, Variadic Functions, Variadic Macros
― Vincent Gable on October 16, 2009

This isn’t practical, but I think it’s neat that it’s doable in C99. The implementation I present here is incomplete and for illustrative purposes only.

Background

C’s implementation of variadic functions (functions that take a variable number of arguments) is characteristically bare-bones. Even though the compiler knows the number, and type, of all arguments passed to a variadic function, there isn’t a mechanism for the function to get this information from the compiler. Instead, programmers need to pass an extra argument, like the printf format string, to tell the function “these are the arguments I gave you”. This has worked for over 37 years. But it’s clunky: you have to write the same information twice, once for the compiler and again to tell the function what you told the compiler.

Inspecting Arguments in C

Argument Type

I don’t know of a way to find the type of the Nth argument to a variadic function, called with heterogeneous types. If you can figure out a way, I’d love to know. The typeof extension is often sufficient to write generic code that works when every argument has the same type. (C++ templates also solve this problem if we step outside of C-proper.)

Argument Count (The Good Stuff Starts Here)

By using variadic macros, and stringification (#), we can actually pass a function the literal string of its argument list from the source code — which it can parse to determine how many arguments it was given.

For example, say f() is a variadic function. We create a variadic wrapper macro, F() and call it like so in our source code,

x = F(a,b,c);

The preprocessor expands this to,

x = f("a,b,c",a,b,c)

Or perhaps,

x = f(count_arguments("a,b,c"),a,b,c)

where count_arguments(char *s) returns the number of arguments in the source-code string s. (Technically, s would be an argument-expression-list.)

Example Code

Here’s an implementation of iArray(), an array-builder for int values, very much like JavaScript’s Array() constructor. Unlike the quirky JavaScript Array(), iArray(3) returns an array containing just the element 3, [3], not an uninitialized array with 3 elements, [undefined, undefined, undefined]. Another difference: iArray(), invoked with no arguments, is invalid, and will not compile.

#define iArray(...) alloc_ints(count_arguments(#__VA_ARGS__), __VA_ARGS__)

This macro is pretty straightforward. It’s given a variable number of arguments, represented by __VA_ARGS__ in the expansion. #__VA_ARGS__ turns the code into a string so that count_arguments can analyze it. (If you were doing this for real, you should use two levels of stringification though, otherwise macros won’t be fully expanded. I chose to keep things “demo-simple” here.)
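
The two-level trick mentioned in the parenthetical can be sketched as follows. STRINGIFY_RAW, STRINGIFY, PAIR, and the helper functions are hypothetical names, not part of the original demo; the counter is a naive comma counter like the one used in this post.

```c
/* One level: the argument is stringified exactly as written. */
#define STRINGIFY_RAW(...) #__VA_ARGS__
/* Two levels: the argument is macro-expanded first, then stringified. */
#define STRINGIFY(...) STRINGIFY_RAW(__VA_ARGS__)

/* Hypothetical macro that hides a comma. */
#define PAIR 1, 2

/* Naive counter: one argument, plus one per comma. */
static unsigned count_commas_plus_one(const char *s) {
    unsigned argc = 1;
    for (; *s; s++)
        if (*s == ',')
            argc++;
    return argc;
}

/* Sees the unexpanded text PAIR, so it counts 1 argument. */
unsigned raw_count(void) { return count_commas_plus_one(STRINGIFY_RAW(PAIR)); }

/* Sees the expanded text, which contains the comma, so it counts 2. */
unsigned expanded_count(void) { return count_commas_plus_one(STRINGIFY(PAIR)); }
```

With only one level, a call such as iArray(PAIR) would report one argument while actually passing two.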

/* Naive count: one argument, plus one more for every comma in the source text. */
unsigned count_arguments(char *s){
	unsigned i, argc = 1;
	for(i = 0; s[i]; i++)
		if(s[i] == ',')
			argc++;
	return argc;
}

This is a dangerously naive implementation and only works correctly when iArray() is given a straightforward non-empty list of values or variables. Basically it’s the least code I could write to make a working demo.

Since iArray must have at least one argument to compile, we just count the commas in the argument-list to see how many other arguments were passed. Simple to code, but it fails for more complex expressions like f(a,g(b,c)).
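
One way to get past the nested-call case is to count only the commas that sit outside any brackets. This is a sketch under that assumption, with a hypothetical name (count_arguments_nested). It still ignores commas inside string and character literals, so it is not a real parser.

```c
#include <assert.h>
#include <stddef.h>

/* Counts top-level arguments: commas inside (), [] or {} are skipped. */
unsigned count_arguments_nested(const char *s) {
    unsigned argc = 1;
    int depth = 0;              /* current bracket nesting depth */

    assert(s != NULL);
    for (; *s; s++) {
        if (*s == '(' || *s == '[' || *s == '{')
            depth++;
        else if (*s == ')' || *s == ']' || *s == '}')
            depth--;
        else if (*s == ',' && depth == 0)
            argc++;             /* only a top-level comma separates arguments */
    }
    return argc;
}
```

For the text "a,g(b,c)" this reports 2 arguments, where the comma-only version reports 3.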

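#include <stdarg.h>	/* va_list, va_start, va_arg, va_end */
#include <stdlib.h>	/* malloc */

/* Returns a malloc'd array holding the count int arguments, or NULL if allocation fails. */
int *alloc_ints(unsigned count, ...){
	unsigned i;
	int *ints = malloc(sizeof(int) * count);
	va_list args;
	if(ints == NULL)
		return NULL;
	va_start(args, count);
	for(i = 0; i < count; i++)
		ints[i] = va_arg(args, int);
	va_end(args);
	return ints;
}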

Just as you'd expect, this code allocates enough memory to hold count ints, and fills it with the remaining count arguments. Bad things happen if fewer than count arguments are passed, or if they are the wrong type.

Download the code, if you like.

Parsing is Hard, Let's Go Shopping

I didn't even try to correctly parse any valid argument-expression-list in count_arguments. It's nontrivial. I'd rather deal with choosing the correct MAX3 or MAX4 macro in a few places than maintain such a code base.

So this kind of introspection isn't really practical in C. But it's neat that it can be done, without any tinkering with the compiler or language.
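
For comparison, a different technique can count arguments without parsing any text: the preprocessor shifts a descending list of numbers, and a helper macro picks out the one that lands in a fixed slot. This is not the stringification approach described in this post, and NTH_ARG_9, COUNT_ARGS, sum_ints, and SUM are hypothetical names. The sketch assumes between 1 and 8 arguments.

```c
#include <stdarg.h>

/* Selects its 9th argument; the real arguments push the numbers to the right. */
#define NTH_ARG_9(a1, a2, a3, a4, a5, a6, a7, a8, n, ...) n
/* With k arguments (1 <= k <= 8), the number k ends up in the 9th slot. */
#define COUNT_ARGS(...) NTH_ARG_9(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* Hypothetical variadic function that needs to be told how many ints follow. */
int sum_ints(unsigned count, ...) {
    int total = 0;
    va_list args;
    va_start(args, count);
    for (unsigned i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

/* Wrapper macro: the count is computed at preprocessing time. */
#define SUM(...) sum_ints(COUNT_ARGS(__VA_ARGS__), __VA_ARGS__)
```

Because the preprocessor already respects parentheses when it splits macro arguments, a nested call such as g(b,c) counts as one argument here. The cost is a hard upper limit on the argument count and no way to tell zero arguments from one.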


September 11, 2009

Never Start An Integer With 0

Filed under: Bug Bite,C++,Objective-C,Programming | Haskell, Java, JavaScript, Lisp, Octal, Programming Language Design, Programming Style, Python, Ruby, Smalltalk
― Vincent Gable on September 11, 2009

When programming, never start an integer with 0. Most programming languages treat an integer literal that starts with 0 as octal (base-8). So x = 013; does not set x to 13. Instead x is 11, because 013 is interpreted as 13₈ not 13₁₀.
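
A small C sketch makes the difference visible. The function names are hypothetical; the second pair shows the same rule applied by strtol() when it is asked to infer the base (base 0).

```c
#include <stdlib.h>

/* 013 is an octal literal: 1*8 + 3 = 11. */
int with_leading_zero(void) { return 013; }

/* 13 is a decimal literal. */
int without_leading_zero(void) { return 13; }

/* Base 0 asks strtol to infer the base from the prefix, so "013" parses as octal. */
long parse_auto_base(const char *text) { return strtol(text, NULL, 0); }

/* Base 10 ignores the leading zero and parses "013" as thirteen. */
long parse_decimal(const char *text) { return strtol(text, NULL, 10); }
```

The strtol() case is worth remembering when parsing user input such as zero-padded dates or IDs.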

Languages with this quirk include: C, C++, Objective-C, Java, JavaScript, Perl, Python 2, and Ruby. (Python 3 instead rejects a literal like 013 as a syntax error and requires the 0o prefix for octal.) If you add up the “market share” of these languages, it comes out to above 50%, which is why I say most languages.

“But I use {Smalltalk, Haskell, Lisp, etc.}”

I’m jealous that you get to use such a nice language. However, it’s bad programming hygiene to pick up habits that are dangerous in common languages.

Now, I assume you wouldn’t write 7 as 007, unless the leading zero(s) carried some extra meaning. There are cases where this clarity outweighs “cleanliness” (unless the code is meant to be ported to a C-like language).

But you should at least be aware of this inter-lingual gotcha.


May 15, 2009

Concise NSDictionary and NSArray Lookup

Filed under: Objective-C,Programming,Research,Sample Code,Usability | NSArray, NSDictionary, Programming Language Design
― Vincent Gable on May 15, 2009

I started writing a list of ways I thought Objective-C could be improved, and I realized that many of my wishes involved more compact syntax. For example [array objectAtIndex:1] is so verbose I think it diminishes readability, compared to array[1].

I can’t quite match that brevity (can you, by using Objective-C++?), but with a one-line category, you can say, x = [array:1];.

@interface NSArray (ConciseLookup)
- (id):(NSUInteger)index;
@end
@implementation NSArray (ConciseLookup)
- (id):(NSUInteger)index
{
	return [self objectAtIndex:index];
}
@end

My question is: do you find this compact “syntax” useful at all, or is it added complexity with no substantial code compression? Personally I think the latter, but the number of wishes I had involving more concise Objective-C syntax makes me wonder…


May 14, 2009

Emergent Libraries

Filed under: Programming,Research | C++, Functional Programming, LLVM, Programming Language Design, Programming Style, Software Development, Static Analysis
― Vincent Gable on May 14, 2009

I have latched onto an idea, but don’t have the resources to follow up on it: could a static-analysis tool identify repeated patterns of code, across many code bases, that should be extracted out as subroutines and higher-level functions? How universal would these “emergent libraries” be?

My inspiration here is Section 4.1 Identifying Common Functions, in the excellent paper Some Thoughts on Security After Ten Years of qmail 1.0 (PDF), by Daniel J. Bernstein,

Most programmers would never bother to create such a small function. But several words of code are saved whenever one occurrence of the dup2()/close() pattern is replaced with one call to fd_move(); replacing a dozen occurrences saves considerably more code than were spent writing the function itself. (The function is also a natural target for tests.) The same benefit scales to larger systems and to a huge variety of functions; fd_move() is just one example. In many cases an automated scan for common operation sequences can suggest helpful new functions, but even without automation I frequently find myself thinking “Haven’t I seen this before?” and extracting a new function out of existing code.
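
For readers who have not seen the pattern, here is a rough sketch of what such a function could look like. It is written from the description in the quote, assuming POSIX dup2() and close(); it is not copied from qmail, and the argument order is an assumption.

```c
#include <unistd.h>

/*
 * Moves descriptor `from` onto `to`: afterwards `to` refers to the open
 * file and `from` is closed. Returns 0 on success, -1 if dup2() fails.
 */
int fd_move(int to, int from) {
    if (to == from)
        return 0;               /* nothing to move */
    if (dup2(from, to) == -1)
        return -1;              /* `from` is left open on failure */
    close(from);
    return 0;
}
```

Each call site that previously spelled out dup2() followed by close() shrinks to a single call, which is the saving the quote describes.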

What’s particularly fascinating to me are the new operations we might find.

Before I was exposed to the Haskell prelude I hadn’t known about the fundamentally useful foldl and foldr operations. I had written dozens of programs that used accumulation, but its generalization hadn’t occurred to me, and probably never would have. Static analysis can help uncover generalizations that we might have missed, or didn’t think were important, but turn out in practice to be widely used operations.
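
To make the accumulation pattern concrete, here is a sketch of a left fold in C. fold_left, combine_fn, and the small helpers are hypothetical names, and Haskell's foldl is far more general than this int-only version.

```c
#include <assert.h>
#include <stddef.h>

/* A function that merges the running result with the next item. */
typedef int (*combine_fn)(int accumulator, int item);

/* Left fold: combines the items left to right, starting from `initial`. */
int fold_left(combine_fn combine, int initial, const int *items, size_t count) {
    int accumulator = initial;
    for (size_t i = 0; i < count; i++)
        accumulator = combine(accumulator, items[i]);
    return accumulator;
}

static int add(int a, int b) { return a + b; }
static int larger(int a, int b) { return a > b ? a : b; }

/* Two hand-written accumulation loops collapse into one-line folds. */
int sum(const int *items, size_t count) {
    return fold_left(add, 0, items, count);
}

int maximum(const int *items, size_t count) {
    assert(count > 0);          /* the maximum of nothing is undefined */
    return fold_left(larger, items[0], items + 1, count - 1);
}
```

Summing and taking a maximum look like different loops until the shared shape is pulled out, which is the kind of generalization an automated scan might surface.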
