Vincent Gable’s Blog

August 19, 2010

The Most Useful Objective-C Code I’ve Ever Written

Actually, it’s the most useful code I’ve extended; credit for the core idea goes to Dave Dribin with his Handy NSString Conversion Macro.

LOG_EXPR(x) is a macro that prints out x, no matter what type x is, without having to worry about format strings (and related crashes from, e.g., printing a C string the same way as an NSString). It works on Mac OS X and iOS. Here are some examples:

LOG_EXPR(self.window.screen);

self.window.screen = <UIScreen: 0x6d20780; bounds = {{0, 0}, {320, 480}}; mode = <UIScreenMode: 0x6d20c50; size = 320.000000 x 480.000000>>

LOG_EXPR(self.tabBarController.viewControllers);

self.tabBarController.viewControllers = (
    "<UINavigationController: 0xcd02e00>",
    "<SavingsViewController: 0xcd05c40>",
    "<SettingsViewController: 0xcd05e90>"
)

Pretty straightforward, really. The biggest convenience so far is having the expression printed out, so you don’t have to write out a name redundantly in the format string (e.g. NSLog(@"actionURL = %@", actionURL)). But LOG_EXPR really shows its worth when you start using scalar or struct expressions:

LOG_EXPR(self.window.windowLevel);

self.window.windowLevel = 0.000000

LOG_EXPR(self.window.frame.size);

self.window.frame.size = {320, 480}

Yes, there are expressions that won’t work, but they’re pretty rare for me. I use LOG_EXPR every day. Several times. It’s not quite as good as having a REPL for Cocoa, but it’s handy.

Give it a try.

How It Works

The problem is how to pick a function or format string to print x, based on the type of x. C++’s type-based dispatch would be a good fit here, but it’s verbose (a full function-definition per type) and I wanted to use pure Objective-C if possible. Fortunately, Objective-C has an @encode() compiler directive that returns a string describing any type it’s given. Unfortunately it works on types, not variables, but the typeof() compiler extension (a GCC extension, not part of standard C99) lets us get the type of any expression, which we can pass to @encode(). The final bit of compiler magic is using stringification (#) to print out the literal string inside LOG_EXPR()’s parentheses.

The Macro, Line By Line

1 #define LOG_EXPR(_X_) do{\
2 	__typeof__(_X_) _Y_ = (_X_);\
3 	const char * _TYPE_CODE_ = @encode(__typeof__(_X_));\
4 	NSString *_STR_ = VTPG_DDToStringFromTypeAndValue(_TYPE_CODE_, &_Y_);\
5 	if(_STR_)\
6 		NSLog(@"%s = %@", #_X_, _STR_);\
7 	else\
8 		NSLog(@"Unknown _TYPE_CODE_: %s for expression %s in function %s, file %s, line %d", _TYPE_CODE_, #_X_, __func__, __FILE__, __LINE__);\
9 }while(0)
  1. The first and last lines wrap the macro body in a do{}while(0) “loop” that runs exactly once. It does nothing except make the macro behave like a single statement (for example, inside an un-braced if/else), which prevents unintended effects.
  2. First evaluate the expression, _X_, given to LOG_EXPR once, and store the result in _Y_. We need to use typeof() (which had to be written __typeof__() to appease some versions of GCC) to figure out the type of _Y_.
  3. _TYPE_CODE_ is a C string that describes the type of the expression we want to print out.
  4. Now we have enough information to call a function, VTPG_DDToStringFromTypeAndValue() to convert the expression’s value to a string. We pass it the _TYPE_CODE_ string, and the address of _Y_, which is a pointer, and has a known size. We can’t pass _Y_ directly, because depending on what _X_ is, it will have different types and could be of any size.
  5. VTPG_DDToStringFromTypeAndValue() returns nil if it can’t figure out how to convert a value to a string.
  6. Everything went well, print the stringified expression, #_X_, and the string representing its value, _STR_.
  7. otherwise…
  8. The expression had a type we can’t handle, print out a verbose diagnostic message.
  9. See line 1.
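The same three ingredients (evaluating the expression once through __typeof__, stringifying it with #, and wrapping everything in do{}while(0)) can be tried out in plain C. This is only an analogy, not the code from VTPG_Common: plain C has no @encode(), so C11’s _Generic stands in for the type dispatch, and the names FORMAT_FOR, FORMAT_EXPR and format_expr_demo are made up for this sketch. It assumes GCC or Clang, since __typeof__ is an extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: picks a printf format from the static type of x.
   C11 _Generic plays the role that @encode() + strcmp() play in LOG_EXPR.
   A type that is not listed is a compile error, not a run-time message. */
#define FORMAT_FOR(x) _Generic((x), \
    int:          "%s = %d",  \
    long:         "%s = %ld", \
    double:       "%s = %g",  \
    char *:       "%s = %s",  \
    const char *: "%s = %s")

/* Writes "<expression text> = <value>" into buf (size bytes). */
#define FORMAT_EXPR(buf, size, x) do { \
    __typeof__(x) y_ = (x);  /* evaluate x exactly once */ \
    assert((size) > 0); \
    snprintf((buf), (size), FORMAT_FOR(y_), #x, y_); \
} while (0)

/* Usage example: returns 1 if the macro produced the expected text. */
int format_expr_demo(void) {
    char buf[64];
    int n = 6;
    FORMAT_EXPR(buf, sizeof buf, n * 7);
    return strcmp(buf, "n * 7 = 42") == 0;
}
```

Because _Generic resolves at compile time, this version cannot produce the “Unknown _TYPE_CODE_” diagnostic from line 8; that is a real difference from the @encode() approach, which decides at run time.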

The VTPG_DDToStringFromTypeAndValue() Function

See the source in VTPG_Common.m:

It’s derived from Dave Dribin’s function DDToStringFromTypeAndValue(), and is pretty straightforward: strcmp() the type-string, and if it matches a known type call a function, or use +[NSString stringWithFormat:], to turn the value into a string.

The First Step Towards Fixing Your Macro Problem is Admitting it…

So yeah, maybe I went a little wild with macros here…

But it took out some WET-ness of the original code, and prevents me from accidentally mixing up types in a long wall of ifs, e.g. (note the copy-and-paste slips in both branches):

else if (strcmp(typeCode, @encode(NSRect)) == 0)
{
    return NSStringFromRect(*(NSRange *)value);
}
else if (strcmp(typeCode, @encode(NSRange)) == 0)
{
    return NSStringFromRect(*(NSRange *)value);
}

If I were cool, I’d use NSDictionarys to map from the @encode-string to an appropriate format string or function pointer. This is conceptually cleaner; less error-prone than using macros; and almost certainly faster. Unfortunately, it gets a little tricky with functions, since I need to dereference value into the proper type.
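To make the table idea concrete, here is a minimal sketch in plain C rather than Objective-C. A static array stands in for the NSDictionary, so lookup is a linear scan and not a hash. Each entry pairs a type code with a small function that does the dereferencing for exactly one type, which is the tricky part mentioned above. The codes "i", "d" and "*" are what @encode() gives for int, double and char *; every function and type name here is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A formatter knows how to dereference value for exactly one type. */
typedef int (*formatter_fn)(char *out, size_t size, const void *value);

static int format_int(char *out, size_t size, const void *value) {
    return snprintf(out, size, "%d", *(const int *)value);
}

static int format_double(char *out, size_t size, const void *value) {
    return snprintf(out, size, "%g", *(const double *)value);
}

static int format_cstring(char *out, size_t size, const void *value) {
    return snprintf(out, size, "%s", *(const char *const *)value);
}

/* The table: one row per supported type code. */
static const struct {
    const char *type_code;
    formatter_fn format;
} formatters[] = {
    {"i", format_int},
    {"d", format_double},
    {"*", format_cstring},
};

/* Returns 1 and fills out on success, 0 if the type code is unknown. */
int string_from_type_and_value(const char *type_code, const void *value,
                               char *out, size_t size) {
    assert(type_code != NULL && value != NULL && out != NULL);
    for (size_t i = 0; i < sizeof formatters / sizeof formatters[0]; i++) {
        if (strcmp(type_code, formatters[i].type_code) == 0) {
            formatters[i].format(out, size, value);
            return 1;
        }
    }
    return 0;
}
```

Each cast now lives next to the one format string it belongs to, so the NSRect/NSRange kind of mix-up has fewer places to hide.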

One final note from my testing: I could do away with the strcmp()s, because directly comparing @encode string pointers (e.g. if(typeCode == @encode(NSString*))) works. I don’t know if it will always work though, so relying on it strikes me as a profoundly Bad Idea. But maybe that bad idea will give someone a good idea.

Limitations

Arrays

C arrays generally muck things up. Casting to a pointer works around this:

char x[14] = "Hello, world!";
//LOG_EXPR(x); //error: invalid initializer
LOG_EXPR((char*)x); //prints fine

__func__

Because it is a static const char [], __func__ (and __FUNCTION__ or __PRETTY_FUNCTION__) need casting to char* to work with LOG_EXPR. Because logging out a function/method call is something I do frequently, I use the macro:

#define LOG_FUNCTION()	NSLog(@"%s", __func__)

long double (Leopard and older)

On older systems, LOG_EXPR won’t work with a long double value, because @encode(long double) gives the same result as @encode(double). This is a known issue with the runtime. The top-level LOG_EXPR macro could detect a long double with if((sizeof(_X_) == sizeof(long double)) && (_TYPE_CODE_ == @encode(double))). But I doubt this will ever be necessary.

I haven’t actually written any code that uses long double, because I use NSDecimal, or another base-10 number format, for situations that require more precision than a double.

Scaling and Frameworks

Growing LOG_EXPR to handle every type is a lot of work. I’ve only added types that I’ve actually needed to print. This has kept the code manageable, and seems to be working so far.

The biggest problem I have is how to deal with types that are in frameworks that not every project includes. Projects that use CoreLocation.framework need to be able to use LOG_EXPR to print out CoreLocation specific structs, like CLLocationCoordinate2D. But projects that don’t use CoreLocation.framework don’t have a definition of the CLLocationCoordinate2D type, so code to convert it to a string won’t compile. There are two ways I’ve tried to solve the problem:

Comment-out framework-specific code

This is pretty self-explanatory, I’ll fork VTPG_Common.m and un-comment-out code for types that my project needs to print. It works, but it’s drudgery. Programmers hate that.

Hardcode type info

The idea is to hard-code the string that @encode(SomeType) would evaluate to, and then (since we know how SomeType is laid out in memory) use casting and pointer-arithmetic to get at the fields.

For example:

//This is a hack to print out CLLocationCoordinate2D, without needing to #import <CoreLocation/CoreLocation.h>
//A CLLocationCoordinate2D is a struct made up of 2 doubles.
//We detect it by hard-coding the result of @encode(CLLocationCoordinate2D).
//We get at the fields by treating it like an array of doubles, which it is identical to in memory.
if(strcmp(typeCode, "{?=dd}")==0)//@encode(CLLocationCoordinate2D)
	return [NSString stringWithFormat:@"{latitude=%g,longitude=%g}",((double*)value)[0],((double*)value)[1]];

This Just Works in a project that includes CoreLocation, and doesn’t mess up projects that don’t. Unfortunately it’s horribly brittle. Any Xcode or system update could break it. It’s not a tenable fix.

Areas for Improvement

If there’s some type LOG_EXPR can’t handle that you need, please jump right in and improve it!

When I have time, I plan to write a general parser for @encode()-strings. This will let me print out any struct, which mostly solves the type-defined-in-missing-framework problem, and would let LOG_EXPR Just Work with types from all kinds of POSIX/C libraries.
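As a rough idea of what the smallest version of such a parser could look like, here is a sketch in plain C. It is not a general @encode() parser: it only understands flat structs whose fields are int ('i'), float ('f') or double ('d'), and it assumes each field sits at its natural alignment, which holds on common 64-bit ABIs but is an assumption all the same. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Round offset up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

/* Formats a struct described by an encoding such as "{?=dd}".
   Returns 1 on success, 0 for anything this sketch does not handle. */
int format_flat_struct(const char *code, const void *value,
                       char *out, size_t size) {
    assert(code != NULL && value != NULL && out != NULL && size > 0);
    const char *p = strchr(code, '=');   /* field codes start after '=' */
    if (code[0] != '{' || p == NULL) return 0;

    const unsigned char *base = value;
    size_t offset = 0, len = 0;
    for (p++; *p != '}'; p++) {
        char field[32];
        if (*p == 'i') {
            int v;
            offset = align_up(offset, _Alignof(int));
            memcpy(&v, base + offset, sizeof v);
            snprintf(field, sizeof field, "%d", v);
            offset += sizeof v;
        } else if (*p == 'f') {
            float v;
            offset = align_up(offset, _Alignof(float));
            memcpy(&v, base + offset, sizeof v);
            snprintf(field, sizeof field, "%g", (double)v);
            offset += sizeof v;
        } else if (*p == 'd') {
            double v;
            offset = align_up(offset, _Alignof(double));
            memcpy(&v, base + offset, sizeof v);
            snprintf(field, sizeof field, "%g", v);
            offset += sizeof v;
        } else {
            return 0;   /* unsupported code, or the string ended before '}' */
        }
        int n = snprintf(out + len, size - len, "%s%s",
                         len ? ", " : "{", field);
        if (n < 0 || (size_t)n >= size - len) return 0;   /* out too small */
        len += (size_t)n;
    }
    if (len == 0 || size - len < 2) return 0;   /* empty struct, or no room */
    out[len] = '}';
    out[len + 1] = '\0';
    return 1;
}
```

Field names are not part of a "{?=dd}" encoding, so the output can only show values, not labels like latitude and longitude. Nested structs, pointers, arrays and bitfields are exactly where the real work would be.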

Using LOG_EXPR() in Your Project

Download VTPG_Common.m and VTPG_Common.h from my github repository, and add them to your Xcode project.

Now just add the line #import "VTPG_Common.h" to your prefix file (named <ProjectName>_Prefix.pch by default), after the #ifdef __OBJC__, for example:

#ifdef __OBJC__
    #import <Foundation/Foundation.h>
    // maybe other files, depending on project  template...
    #import "VTPG_Common.h"
#endif

Now LOG_EXPR() will work everywhere in your project.

May 24, 2010

Experts are Easier to Fool

Filed under: Quotes, Research, Security
― Vincent Gable on May 24, 2010

Another counter-intuitive finding is that scam victims often have better than average background knowledge in the area of the scam content. For example, it seems that people with experience of playing legitimate prize draws and lotteries are more likely to fall for a scam in this area than people with less knowledge and experience in this field. This also applies to those with some knowledge of investments. Such knowledge can increase rather than decrease the risk of becoming a victim.

(via Bruce Schneier)

November 9, 2009

Spurious

What’s a spurious relationship?

Here’s one: People who eat ice cream are more likely to drown. Both incidence of ice cream eating and rates of drowning are related to summertime. The relationship between ice cream and drowning is spurious. That is, there is no relationship. Yet they appear related because they are both related to a third variable.

Lisa Wade

untitled5sk.jpg

(Image via the amazing Superdickery)

October 16, 2009

Hack: Counting Variadic Arguments in C

This isn’t practical, but I think it’s neat that it’s doable in C99. The implementation I present here is incomplete and for illustrative purposes only.

Background

C’s implementation of variadic functions (functions that take a variable number of arguments) is characteristically bare-bones. Even though the compiler knows the number, and type, of all arguments passed to variadic functions, there isn’t a mechanism for the function to get this information from the compiler. Instead, programmers need to pass an extra argument, like the printf format-string, to tell the function “these are the arguments I gave you”. This has worked for over 37 years. But it’s clunky: you have to write the same information twice, once for the compiler and again to tell the function what you told the compiler.

Inspecting Arguments in C

Argument Type

I don’t know of a way to find the type of the Nth argument to a variadic function, called with heterogeneous types. If you can figure out a way, I’d love to know. The typeof extension is often sufficient to write generic code that works when every argument has the same type. (C++ templates also solve this problem if we step outside of C-proper.)

Argument Count (The Good Stuff Starts Here)

By using variadic macros, and stringification (#), we can actually pass a function the literal string of its argument list from the source code — which it can parse to determine how many arguments it was given.

For example, say f() is a variadic function. We create a variadic wrapper macro, F() and call it like so in our source code,

x = F(a,b,c);

The preprocessor expands this to,

x = f("a,b,c",a,b,c)

Or perhaps,

x = f(count_arguments("a,b,c"),a,b,c)

where count_arguments(char *s) returns the number of arguments in the source-code string s. (Technically s would be an argument-expression-list.)

Example Code

Here’s an implementation for iArray(), an array-builder for int values, very much like JavaScript’s Array() constructor. Unlike the quirky JavaScript Array(), iArray(3) returns an array containing just the element 3, [3], not an uninitialized array with 3 elements, [undefined, undefined, undefined]. Another difference: iArray(), invoked with no arguments, is invalid, and will not compile.

#define iArray(...) alloc_ints(count_arguments(#__VA_ARGS__), __VA_ARGS__)

This macro is pretty straightforward. It’s given a variable number of arguments, represented by __VA_ARGS__ in the expansion. #__VA_ARGS__ turns the code into a string so that count_arguments can analyze it. (If you were doing this for real, you should use two levels of stringification though, otherwise macros won’t be fully expanded. I chose to keep things “demo-simple” here.)
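Here is a small sketch of why the second level matters. The macro names are made up for illustration, and DEFAULTS is a hypothetical macro that hides two commas. With one level of stringification the comma-counter sees the macro’s name; with two levels it sees what the macro expands to.

```c
#include <assert.h>
#include <stddef.h>

/* One level: the operand of # is NOT macro-expanded first. */
#define STRINGIFY_RAW(...) #__VA_ARGS__
/* Two levels: the arguments are expanded on the way through. */
#define STRINGIFY_EXPANDED(...) STRINGIFY_RAW(__VA_ARGS__)

/* Hypothetical macro standing in for "some arguments hidden in a macro". */
#define DEFAULTS 1, 2, 3

/* Counts the commas in s, the same way count_arguments does. */
unsigned count_commas(const char *s) {
    assert(s != NULL);
    unsigned n = 0;
    for (; *s != '\0'; s++)
        if (*s == ',')
            n++;
    return n;
}
```

count_commas(STRINGIFY_RAW(DEFAULTS, 4)) sees one comma, so a counter built on it would report 2 arguments, while count_commas(STRINGIFY_EXPANDED(DEFAULTS, 4)) sees three commas and gives the 4 arguments that are really passed.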

unsigned count_arguments(const char *s){
	unsigned i, argc = 1; // a non-empty list has at least one argument
	for(i = 0; s[i]; i++)
		if(s[i] == ',') // every comma adds one more argument
			argc++;
	return argc;
}

This is a dangerously naive implementation and only works correctly when iArray() is given a straightforward non-empty list of values or variables. Basically it’s the least code I could write to make a working demo.

Since iArray must have at least one argument to compile, we just count the commas in the argument-list to see how many other arguments were passed. Simple to code, but it fails for more complex expressions like f(a,g(b,c)).
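For the curious, here is a sketch of a slightly less naive counter. It only counts commas that sit outside any brackets and outside string or character literals, so f(a,g(b,c)) comes out as two arguments. It is still nowhere near a parser for every valid argument-expression-list, and the name count_arguments_nested is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Counts top-level arguments in a stringified argument list. */
unsigned count_arguments_nested(const char *s) {
    assert(s != NULL);
    unsigned argc = 1;
    int depth = 0;       /* nesting level of (), [] and {} */
    char quote = '\0';   /* the quote character we are inside, if any */

    for (size_t i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        if (quote != '\0') {
            if (c == '\\' && s[i + 1] != '\0')
                i++;                 /* skip the escaped character */
            else if (c == quote)
                quote = '\0';        /* literal ends */
        } else if (c == '"' || c == '\'') {
            quote = c;               /* literal starts */
        } else if (c == '(' || c == '[' || c == '{') {
            depth++;
        } else if (c == ')' || c == ']' || c == '}') {
            depth--;
        } else if (c == ',' && depth == 0) {
            argc++;                  /* a comma between arguments */
        }
    }
    return argc;
}
```

It does not validate its input: unbalanced brackets or an unterminated literal simply give a wrong count.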

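#include <stdarg.h> // va_list, va_start, va_arg, va_end
#include <stdlib.h> // malloc

int *alloc_ints(unsigned count, ...){
	unsigned i = 0;
	int *ints = malloc(sizeof(int) * count);
	va_list args;
	if(ints == NULL) // allocation failed; nothing to fill
		return NULL;
	va_start(args, count);
	for(i = 0; i < count; i++)
		ints[i] = va_arg(args, int);
	va_end(args);
	return ints;
}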

Just as you'd expect, this code allocates enough memory to hold count ints, and fills it with the remaining count arguments. Bad things happen if fewer than count arguments are passed, or if they are the wrong type.

Download the code, if you like.

Parsing is Hard, Let's Go Shopping

I didn't even try to correctly parse any valid argument-expression-list in count_arguments. It's nontrivial. I'd rather deal with choosing the correct MAX3 or MAX4 macro in a few places than maintain such a code base.

So this kind of introspection isn't really practical in C. But it's neat that it can be done, without any tinkering with the compiler or language.
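For the narrower job of picking between MAX3 and MAX4, there is a different, well-known preprocessor trick that needs no run-time parsing at all. It is not the string-parsing approach from this post: the argument list pushes a descending list of numbers to the right, and a helper macro picks whichever number lands in a fixed slot. The sketch below uses hypothetical names and is capped at 8 arguments. It counts arguments only, tells you nothing about their types, and reports 1 for an empty list.

```c
#include <assert.h>

/* The arguments shift the number list right; N is whatever lands in slot 9. */
#define COUNT_ARGS(...) \
    COUNT_ARGS_PICK(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define COUNT_ARGS_PICK(p1, p2, p3, p4, p5, p6, p7, p8, N, ...) N

#define MAX2(a, b)       ((a) > (b) ? (a) : (b))
#define MAX3(a, b, c)    MAX2(MAX2(a, b), c)
#define MAX4(a, b, c, d) MAX2(MAX2(a, b), MAX2(c, d))

/* Two levels of pasting, so COUNT_ARGS(...) is expanded before ## runs. */
#define PASTE(a, b) a##b
#define PASTE_EXPANDED(a, b) PASTE(a, b)

/* MAX(x, y, z) becomes MAX3(x, y, z), and so on for 2 to 4 arguments. */
#define MAX(...) PASTE_EXPANDED(MAX, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)

/* The count is a compile-time constant; a, b and c are never evaluated. */
static_assert(COUNT_ARGS(a, b, c) == 3, "three arguments");
```

Like MAX2 itself, the dispatched macros evaluate their arguments more than once, so MAX(i++, j) is still a bad idea.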

September 17, 2009

Big Freaking Systems

Filed under: Programming, Quotes, Research
― Vincent Gable on September 17, 2009

A programming language is a tool for handling design complexity. That’s what all of computer science is, really — languages, libraries, type systems, garbage collectors, everything you learn about programming. They’re ways to build more and more complex designs without losing your grip.

The way you manage complexity is to be able to ignore it. A good programming tool lets you forget about some part of the problem, so that you can focus on some other part. And it ensures that when you return to the parts you forgot, you haven’t accidentally broken them.

Andrew Potkin

Years ago, when I was talking to programmers about what college I wanted to attend, I had an interesting conversation about how Computer Science education is an utter failure at preparing students for real-world programming. Outside of Software Development, no technical field accepts (sometimes prefers) candidates with “N years of experience” in place of a degree. I’m not sure I know why CS education fails so badly and universally. But my current best guess is that it’s because school never exposes you to enough complexity. Projects have to end in a semester. You never have to deal with a multimillion-line program, written by hundreds of co-workers, dozens of whom you need to collaborate with, at unexpected times, for surprising reasons.

June 4, 2009

links for 2009-06-04

Filed under: Announcement,iPhone,Programming,Quotes,Research,Usability
― Vincent Gable on June 4, 2009

This was an experiment, in doing more with my delicious bookmarks. I was hoping that I could get more feedback and discussion on things I found interesting enough to bookmark by automatically posting links to them here. Many sites that I enjoy reading do something similar. But it hasn’t felt like a good fit for me.

June 1, 2009

Pass Phrases, Not Passwords

Filed under: Accessibility, Research, Security, Usability
― Vincent Gable on June 1, 2009

Thomas Baekdal makes a convincing argument for using pass-phrases not passwords (via). It’s excellent advice, and I know I’m not alone in having advocated it for years.

My keyboard has 26 letters, 10 numbers, and 12 symbol keys, like ~. All but the spacebar make a different symbol when I hold down shift, giving me 93 characters to use in my passwords. But the number of words that can make up a pass-phrase is easily in the 100,000s. Estimating exactly how big is a bit tricky, but I will stick with 250,000 here (I think it’s an undercount, more on this later).

We Know How To Talk

The human brain has an amazing aptitude for language. But “passwords” aren’t really words, so they don’t tap into this ability. In fact, we often use words to try and remember the nonsense-characters of a password.

Wouldn’t it make more sense to just use the words directly, if we can remember them more easily?

Hard For Computers, Not Hard For Us

People feel that if security system A is harder for them to use than system B, then A must be harder for an attacker to bypass. But the facts don’t always match this intuition.

What authentication code do you think is harder for a bad guy to hack, the 7 character strong password “1Ea.$]/”, or the mnemonic for the first 3 characters, “One Elvis Amazon”? Certainly “1Ea.$]/” is harder for a person to remember. It feels like it should be harder to break. But a computer, not a person, is going to be doing the guessing, and all it cares about is how big the search space is. There are 93^7 possible 7 character passwords. Let’s say there are 250,000 possible English words (more on that figure later). Then there are 250,000^3 3 word combinations, meaning an attacker would have to do about 260 times more work to guess “One Elvis Amazon” than to guess “1Ea.$]/”.
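The arithmetic is easy to check. 93^7 is about 6.0 × 10^13 and 250,000^3 is about 1.6 × 10^16, so the ratio comes to roughly 260. A small sketch in C, with hypothetical function names, under the same naive assumption that every character or word is an independent and equally likely choice:

```c
#include <assert.h>

/* Number of equally likely guesses: choices multiplied by itself length times. */
double search_space(double choices, unsigned length) {
    assert(choices > 0.0);
    double total = 1.0;
    for (unsigned i = 0; i < length; i++)
        total *= choices;
    return total;
}

/* How many times larger the 3-word space is than the 7-character space. */
double phrase_advantage(void) {
    return search_space(250000.0, 3) / search_space(93.0, 7);
}
```

Swapping in a different vocabulary size or phrase length shows how quickly the gap grows: each extra word multiplies the search space by another 250,000.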

With pass phrases, easier for the good guys is also harder for the bad guys.

Exactly How Much Harder

The “250,000 word” figure is a bunch of hand-waving, but I believe it’s an undercount. I picked it, because I wanted a round number to crunch; it’s what Thomas Baekdal picked; and it’s about the size of the Mac OS X words file,

$ wc -l /usr/share/dict/words 
  234936

But liberally descriptive linguists say that the 1,000,000th word will be added to the English Language on June 10th, 2009. The more conservative Webster’s Third New International Dictionary, Unabridged lists 475,000 English words. Obviously neologisms, slang, and archaic terms are fine for pass phrases. People like discovering quirky words. I see far more people embracing the login, “kilderkin of locats”, than rejecting it.

Different conjugations (can) count as different words in pass-phrases. There’s only one entry in a dictionary for swim, but swim, swimming, swam, etc. make for distinct pass-phrases (e.g. “Elvis swims fast”, “Elvis swam fast”, etc. Neither phrase shows up in a Google search, by the way). So the real number of words should be a few fold larger than a dictionary indicates.

But not all words are equally likely to be chosen, just as some characters are more popular in passwords. My earlier figure of “250,000^3 3 word combinations” was based on the naive assumption that each of the 3 words is independent. But people do not pick things at random. And a phrase is by definition not completely random; it must have some structure. I’m unaware of research into exactly how predictable people are when making up pass-phrases.

But given how terrible we are at picking good passwords, and how good we are at remembering non-nonsense-words, I am optimistic that we can remember pass-phrases that are orders of magnitude harder to guess than the “good” passwords we can’t remember today.

Fewer Ways To Fail

We’ve all locked ourselves out of an account because of typos or caps lock. But pass-phrases can be more forgiving.

Pass-phrases are case-insensitive. There’s no need to lock someone out over “ELvis…”.
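One way a system could be this forgiving is to normalize the phrase before it is compared (or, in a real system, before it is hashed). The sketch below is a guess at a minimal version: it lower-cases ASCII letters and collapses runs of whitespace. The function names and the 256-byte limit are arbitrary choices for the example, and a real login system would compare hashes, not plain text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Lower-cases phrase, trims it, and collapses whitespace runs to one space.
   Returns 1 on success, 0 if out (size bytes) is too small. */
int normalize_pass_phrase(const char *phrase, char *out, size_t size) {
    assert(phrase != NULL && out != NULL && size > 0);
    size_t len = 0;
    int pending_space = 0;   /* a word break seen, but not yet written */

    for (const char *p = phrase; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isspace(c)) {
            pending_space = (len > 0);   /* ignore leading whitespace */
            continue;
        }
        if (pending_space) {
            if (len + 1 >= size) return 0;
            out[len++] = ' ';
            pending_space = 0;
        }
        if (len + 1 >= size) return 0;
        out[len++] = (char)tolower(c);
    }
    out[len] = '\0';   /* trailing whitespace is never written */
    return 1;
}

/* Returns 1 if the two phrases are the same after normalization. */
int pass_phrases_match(const char *a, const char *b) {
    char na[256], nb[256];
    if (!normalize_pass_phrase(a, na, sizeof na)) return 0;
    if (!normalize_pass_phrase(b, nb, sizeof nb)) return 0;
    return strcmp(na, nb) == 0;
}
```

Folding case shrinks the search space a little, but for a three-word phrase the loss is tiny compared with the size of the vocabulary.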

Common typos can be auto-corrected, much as Google automatically suggests words. Consider the authentication attempt “Elvis Swimms fast”. The system could recognize that “Swimms” isn’t a word, and try the most likely correction, “Elvis Swims fast”; if it matches, then there’s no reason to ask the user if it’s what they really meant. (Note that only one pass-phrase is checked per login attempt.) I don’t have hard data here, but given how successful Google is at interpreting typos, I’d expect such a system to work very well.

Pass-phrases might be more difficult on phones and similarly awkward-to-write-with devices. Writing more letters means more work. Predictive text can only do so much. Repeatedly typing 3 letters and accepting a suggestion is clearly more work than just tapping out 6 characters. Additionally, there are security concerns with a predictive text system remembering your pass-phrase, or even a small part of it.

But for computers, pass phrases look like a clear usability win.

Easily Secure Conclusion

(In case you were wondering, that was a unique phrase when I wrote this.) Using pass-phrases over passwords (which are really pass-strings-of-nonsense-symbols-that-nobody-can-remember) makes a system significantly harder to crack. Pass-phrases are easier for humans to remember, and a system that uses them can be very forgiving. But as always, the devil is in the details. It’s terrifying to be an early adopter of a new security practice, even if it seems sound.

May 15, 2009

Concise NSDictionary and NSArray Lookup

I started writing a list of ways I thought Objective-C could be improved, and I realized that many of my wishes involved more compact syntax. For example [array objectAtIndex:1] is so verbose I think it diminishes readability, compared to array[1].

I can’t quite match that brevity (can you, by using Objective-C++?), but with a one-line category, you can say, x = [array:1];.

@interface NSArray (ConciseLookup)
- (id):(NSUInteger)index;
@end
@implementation NSArray (ConciseLookup)
- (id):(NSUInteger)index
{
	return [self objectAtIndex:index];
}
@end

My question is: do you find this compact “syntax” useful at all, or is it added complexity with no substantial code compression? Personally I think the latter, but the number of wishes I had involving more concise Objective-C syntax makes me wonder…

May 14, 2009

Emergent Libraries

I have latched onto an idea, but don’t have the resources to follow up on it: could a static-analysis tool identify repeated patterns of code, across many code bases, that should be extracted out as subroutines and higher-level functions? How universal would these “emergent libraries” be?

My inspiration here is Section 4.1 Identifying Common Functions, in the excellent paper Some Thoughts on Security After Ten Years of qmail 1.0 (PDF), by Daniel J. Bernstein,

Most programmers would never bother to create such a small function. But several words of code are saved whenever one occurrence of the dup2()/close() pattern is replaced with one call to fd_move(); replacing a dozen occurrences saves considerably more code than were spent writing the function itself. (The function is also a natural target for tests.) The same benefit scales to larger systems and to a huge variety of functions; fd_move() is just one example. In many cases an automated scan for common operation sequences can suggest helpful new functions, but even without automation I frequently find myself thinking “Haven’t I seen this before?” and extracting a new function out of existing code.
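For a sense of just how small such a function is, here is a sketch of what an fd_move() built from the dup2()/close() pattern might look like. It is written from the description in the quote, not copied from qmail, so the argument order and the return convention are assumptions.

```c
#include <assert.h>
#include <unistd.h>

/* Makes descriptor `to` refer to whatever `from` refers to, then closes
   `from`. Returns 0 on success, -1 on failure (errno is set by dup2). */
int fd_move(int to, int from) {
    assert(to >= 0 && from >= 0);
    if (to == from)
        return 0;            /* nothing to move */
    if (dup2(from, to) == -1)
        return -1;           /* `from` is left open on failure */
    close(from);
    return 0;
}
```

A typical call would be fd_move(0, pipe_read_end) in a child process before exec, which replaces two calls and one easy-to-forget special case (to == from) with a single line.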

What’s particularly fascinating to me are the new operations we might find.

Before I was exposed to the Haskell prelude I hadn’t known about the fundamentally useful foldl and foldr operations. I had written dozens of programs that used accumulation, but its generalization hadn’t occurred to me, and probably never would have. Static analysis can help uncover generalizations that we might have missed, or didn’t think were important, but turn out in practice to be widely used operations.
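To show what I mean by the generalization, here is the accumulation loop pulled out into one function, sketched in C rather than Haskell. It only handles int arrays, and the names are my own. Summing and finding a maximum are the same loop with a different combining function.

```c
#include <assert.h>
#include <stddef.h>

/* Combines the running result with the next element. */
typedef int (*combine_fn)(int accumulator, int element);

/* Left fold: combine(...combine(combine(initial, x0), x1)..., xn). */
int fold_left(combine_fn combine, int initial,
              const int *items, size_t count) {
    assert(combine != NULL && (items != NULL || count == 0));
    int acc = initial;
    for (size_t i = 0; i < count; i++)
        acc = combine(acc, items[i]);
    return acc;
}

/* Two of the many loops that turn out to be the same loop. */
int add(int a, int b) { return a + b; }
int larger(int a, int b) { return a > b ? a : b; }
```

Calling fold_left(add, 0, xs, n) gives the sum of xs, and fold_left(larger, xs[0], xs, n) gives its maximum.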

April 30, 2009

Acceptable Delays

This is a collection of sources on what constitutes an acceptable delay. It’s very much a work in progress, and will be updated when I stumble into new information. I’m very interested in any insights, experience, or sources you may have.

Based on some experiments I did back at IBM, delays of 1/10th of a second are roughly when people start to notice that an editor is slow. If you can respond in less than 1/10th of a second, people don’t perceive a troublesome delay.

Mark Chu-Carroll

One second … is the required response time for hypertext navigation. Users do not keep their attention on the page if downloading exceeds 10 seconds.

Jakob Nielsen, (in 1997?)

In A/B tests (at Amazon.com), we tried delaying the page in increments of 100 milliseconds and found that even very small delays would result in substantial and costly drops in revenue. (eg 20% drop in traffic when moving from 0.4 to 0.9 second load time for search results).

Greg Linden covering results disclosed by Google VP Marissa Mayer

If a user operates a control and nothing appears on the display for more than approximately 250 msec, she is likely to become uneasy, to try again, or to begin to wonder whether the system is failing.

Jef Raskin, The Humane Interface (page 75)

David Eagleman’s blog post Will you perceive the event that kills you? is an engaging look at how slow human perception is, compared to mechanical response time. For example, in a car crash that takes 70ms from impact until airbags begin deflating, the occupants are not aware of the collision until 150-300 milliseconds (possibly as long as 500 milliseconds) after impact.
