Vincent Gable’s Blog

March 7, 2009

Don’t Work Against Yourself

Filed under: Quotes
― Vincent Gable on March 7, 2009

Reaganite conservatism axiomatically disdains government, and that creates a perverse incentive for conservative politicians to run government badly (or at least not to run it well), since the failure of government confirms conservative prejudices and (in theory) provides the movement with additional evidence in favor of its ideology. We just saw a particularly vivid example of this pathologically self-destructive dynamic at work in Bobby Jindal’s otherwise inexplicable attempt to turn the Bush administration’s utter ineptitude after Hurricane Katrina into a GOP talking point.

Damon Linker

I generally try to keep politics out of my blog, because political discussion on the internet isn’t productive. But I think there is a good lesson in this bit of history, and it’s very applicable to software development.

March 6, 2009

A Good Basic Computer Science Book

Filed under: Announcement,Programming,Tips
― Vincent Gable on March 6, 2009

In high school, I literally wore out my copy of The New Turing Omnibus: Sixty-Six Excursions in Computer Science. The graphics topics are dated, and there is no discussion about newfangled topics in computing, like the internet. But the meat of the book is timeless computer science fundamentals. It still has the best explanation of what “NP-Complete” means (page 276) that I’ve run across.

The book covers some dense territory, but is still fairly accessible. When my mother asked me, “how can a computer make a random number, if it only does what it’s told?” I pointed her to chapter 8, “RANDOM NUMBERS: The Chaitin-Kolmogoroff Theory” (page 49). The math was a bit over her head, but she could still read the chapter, and it answered her question. I recommend The New Turing Omnibus, without reservation, to anyone who’s considering Computer Science.

What are your favorite introductory Computer Science books?

Patrick Thomson suggests Gödel, Escher, Bach: An Eternal Golden Braid. It’s an excellent and fun introduction to the essential theory behind computer science.

Here is an excellent overview of the current state of the P=NP question.

March 4, 2009

Professionals Check What They Broke

Filed under: Quotes
― Vincent Gable on March 4, 2009

Mark Chu-Carroll, on how to tell if someone defining division-by-zero is a competent mathematician:

One good way of recognizing a crank is by looking at what they do with their new division-by-zero defining system. A serious mathematician starts working out what affect their definition has on the basic axioms, and what still works. A crank defines division by zero, and then proceeds to continue working as if they haven’t broken anything.

Mark Chu-Carroll

There are parallels in most every profession.

Retest Your (Low Level) Optimizations

Filed under: Programming,Quotes
― Vincent Gable on March 4, 2009

(Measuring before and after applying an optimization is) more important these days. With optimizing compilers and smart virtual machines, many of the standard optimizing techniques are not just ineffective but also counterintuitive. Craig Larman really brought this home when he told me about some comments he received after a talk at JavaOne about optimization in Java. One builder of an optimizing virtual machine said, in effect,

“The comments about thread pools were good, but you shouldn’t use object pools because they will slow down our VM.”

Then another VM builder said,

“The comments about object pools were good, but you shouldn’t use thread pools because they slow down our VM.”

Not only does this reinforce the need to measure with every optimization change, it also suggests that you should log every change made for optimization (a comment tag in the source code is a good option) and retest your optimizations after upgrading your compiler or VM. The optimization you did six months ago could be your bottleneck today.

Martin Fowler (PDF), 2002

I’ve written before about the decay of machine-specific optimization. Even if your code isn’t run by a VM, I think it’s reasonable to expect that (at least some of it) might be run on a smartphone in the near future.
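The measure-before-and-after habit from the quote can be sketched in a few lines. This is only an illustration, assuming Python's standard `timeit` module; the two functions and the `OPTIMIZATION-LOG` comment tag are hypothetical names invented for the example, not anything from Fowler's article.

```python
import timeit


def sum_squares_loop(n):
    """Baseline: a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    """Candidate 'optimization': a generator fed to the built-in sum()."""
    return sum(i * i for i in range(n))


def best_time(func, n=10_000, repeat=3, number=20):
    """Best wall-clock time in seconds over a few repeats of `number` calls."""
    return min(timeit.repeat(lambda: func(n), repeat=repeat, number=number))


if __name__ == "__main__":
    # OPTIMIZATION-LOG (hypothetical tag): swapped loop for sum(); re-measure
    # after every interpreter upgrade, because the winner can change.
    assert sum_squares_loop(1000) == sum_squares_builtin(1000)
    before = best_time(sum_squares_loop)
    after = best_time(sum_squares_builtin)
    print(f"loop: {before:.4f}s  builtin: {after:.4f}s")
```

Which version wins depends on the interpreter and its version, which is exactly the point: keep the measurement around so it can be rerun later.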

March 3, 2009

Vincent’s Notes: End To End Arguments in System Design

Filed under: Programming
― Vincent Gable on March 3, 2009

At Michael Tsai’s suggestion I listened to the paper End-To-End Arguments in System Design while driving. (Fair warning: Since I was also driving while listening, I didn’t absorb everything as well as I should have.)

The thrust of the paper is that you generally want to make your low level components (aka libraries) simpler than you think. Counter-intuitively, building extra reliability into a low-level component does not (usually) make it easier to build a reliable application that uses the component. That’s because the application has to work around all sorts of other errors from different components. So it must have error handling code. Making one low level component “smarter” does not change this. But it does make the component more complex. And some of that complexity is duplicate code that does just what the application’s error handling code does.

The “End to End” in the title of the paper is from a file transfer application having to do an “end to end” check to make sure that the files at both ends of the transmission are the same.
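A rough sketch of that end-to-end check, assuming a SHA-256 digest as the comparison (the paper does not prescribe one). The `unreliable_channel` function is a hypothetical stand-in for everything between the two ends; the application does not care where an error came from, only whether the final bytes match.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Digest used by both ends to compare file contents."""
    return hashlib.sha256(data).hexdigest()


def unreliable_channel(data: bytes, corrupt: bool) -> bytes:
    """Hypothetical stand-in for the network: optionally flips one bit."""
    if corrupt and data:
        return bytes([data[0] ^ 0x01]) + data[1:]
    return data


def transfer_with_end_to_end_check(data: bytes, channel_outcomes) -> int:
    """Resend until the receiver's digest matches; return the attempt count.

    `channel_outcomes` is an iterable of booleans (True = this attempt is
    corrupted), which makes the simulation deterministic.
    """
    expected = sha256_of(data)
    for attempt, corrupt in enumerate(channel_outcomes, start=1):
        received = unreliable_channel(data, corrupt)
        if sha256_of(received) == expected:
            return attempt
    raise IOError("transfer failed on every attempt")
```

For example, `transfer_with_end_to_end_check(b"hello", [True, True, False])` succeeds on the third attempt. The check lives in the application, so it is needed no matter how clever the lower layers are.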

Conclusions

End-to-end arguments are a kind of “Occam’s razor” when it comes to
choosing the functions to be provided in a communication subsystem.
Because the communication subsystem is frequently specified before
applications that use the subsystem are known, the designer may be
tempted to “help” the users by taking on more function than necessary.
Awareness of end-to-end arguments can help to reduce such temptations….

March 2, 2009

Initial Findings: How Long is an (English) Word?

Filed under: Research
― Vincent Gable on March 2, 2009

My brief research into the English language revealed the average character count of a word is eight. Throw together a bunch of a smaller and bigger words, some single spaces and punctuation and you roughly end up with the average 140-character tweet being somewhere between 14 and 20 words. Let’s call it 15.

Rands in Repose

That contradicts the common wisdom I’ve heard: the average word is 5 letters, so divide your character count by 6 to get a word count.

But that was a rule of thumb from the days of typewriters. Hypertext and formatting changes things. For example, every time you see something in boldface on my blog, there are an extra 17 characters for the HTML code, <strong></strong>, that makes the text bold.
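To make the markup overhead concrete, here is a small sketch that strips tags with Python's standard `html.parser` and compares lengths. The `TextOnly` class and the sample sentence are made up for illustration, and a real page would need more care (scripts, entities, whitespace).

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects only the text between tags, discarding the markup."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the text of `html` with all tags removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = "This is <strong>bold</strong> text."
overhead = len(sample) - len(visible_text(sample))
print(overhead)  # 17: 8 characters for <strong> plus 9 for </strong>
```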

Just to poke at the problem, I used wc to find the number of characters per word in a few documents. What I found supports the 6 characters per word rule of thumb for content, but not for HTML code. The number of characters per word in HTML was higher than 6, and varied greatly.
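The same measurement can be approximated in Python. This sketch assumes the ratio is total characters (spaces and punctuation included) divided by whitespace-delimited words, which is roughly what `wc -m` and `wc -w` report; `chars_per_word` is a name chosen for the example.

```python
def chars_per_word(text: str) -> float:
    """Total characters divided by the number of whitespace-separated words."""
    words = len(text.split())
    if words == 0:
        raise ValueError("text contains no words")
    return len(text) / words


if __name__ == "__main__":
    print(chars_per_word("one two"))  # 7 characters / 2 words = 3.5
```

Because spaces count as characters, this lands close to the "5 letters plus a space" rule of thumb on ordinary prose.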

The text of the front page article on today’s New York Times was 5880 characters, 960 words: 6 characters per word.

The plain text of Rands’ webpage claiming 8 characters per word was 6794 characters, 1175 words: 6 characters per word. By plain text, I mean just the words of the HTML after it was rendered, so formatting, images, links, etc. were ignored. The HTML source for the page, however, was 15952 characters, meaning 14 characters per word.

What about technical stuff? The best paper I read last year was Some thoughts on security after ten years of qmail 1.0 (PDF). It has no pictures, just 9517 formatted words. A PDF represents it with 161496 bytes (17 bytes per word), but ignoring formatting it is 62567 characters (7 characters per word).
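The ratios quoted in this post can be rechecked from the raw counts. The sketch below assumes the HTML source is divided by the same 1175-word count as the rendered page, which is the only reading that matches the figure given; the labels are just descriptive strings.

```python
# (label, size in characters or bytes, word count), copied from the post
measurements = [
    ("New York Times article, text", 5880, 960),
    ("Rands page, plain text", 6794, 1175),
    ("Rands page, HTML source", 15952, 1175),  # assumes the same word count
    ("qmail paper, PDF bytes", 161496, 9517),
    ("qmail paper, plain text", 62567, 9517),
]

# Round each size-per-word ratio to the nearest whole number
ratios = [(label, round(size / words)) for label, size, words in measurements]

for label, ratio in ratios:
    print(f"{label}: {ratio} per word")
```

The rounded results are 6, 6, 14, 17, and 7, in that order.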

I’m still looking into how long English words are in practice. Please share your research or your opinion, if you have any.
