Vincent Gable’s Blog

August 19, 2010

The Most Useful Objective-C Code I’ve Ever Written

Actually, it’s the most useful code I’ve extended; credit for the core idea goes to Dave Dribin with his Handy NSString Conversion Macro.

LOG_EXPR(x) is a macro that prints out x, no matter what type x is, without having to worry about format-strings (and related crashes from, e.g., printing a C-string the same way as an NSString). It works on Mac OS X and iOS. Here are some examples:

LOG_EXPR(self.window.screen);

self.window.screen = <UIScreen: 0x6d20780; bounds = {{0, 0}, {320, 480}}; mode = <UIScreenMode: 0x6d20c50; size = 320.000000 x 480.000000>>

LOG_EXPR(self.tabBarController.viewControllers);

self.tabBarController.viewControllers = (
    "<UINavigationController: 0xcd02e00>",
    "<SavingsViewController: 0xcd05c40>",
    "<SettingsViewController: 0xcd05e90>"
)

Pretty straightforward, really. The biggest convenience so far is having the expression printed out, so you don’t have to write out a name redundantly in the format string (e.g. NSLog(@"actionURL = %@", actionURL)). But LOG_EXPR really shows its worth when you start using scalar or struct expressions:

LOG_EXPR(self.window.windowLevel);

self.window.windowLevel = 0.000000

LOG_EXPR(self.window.frame.size);

self.window.frame.size = {320, 480}

Yes, there are expressions that won’t work, but they’re pretty rare for me. I use LOG_EXPR every day. Several times. It’s not quite as good as having a REPL for Cocoa, but it’s handy.

Give it a try.

How It Works

The problem is how to pick a function or format string to print x, based on the type of x. C++’s type-based dispatch would be a good fit here, but it’s verbose (a full function-definition per type) and I wanted to use pure Objective-C if possible. Fortunately, Objective-C has an @encode() compiler directive that returns a string describing any type it’s given. Unfortunately it works on types, not variables, but GCC’s typeof() extension (also supported by Clang) lets us get the type of any variable, which we can pass to @encode(). The final bit of compiler magic is using stringification (#) to print out the literal string inside LOG_EXPR()’s parentheses.
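Plain C has no @encode(), but C11's _Generic gives a comparable "pick a formatter from the type of the expression" dispatch, and it combines with stringification in the same way. The following is only a sketch of that idea, not the LOG_EXPR implementation: FORMAT_EXPR and the three format_* helpers are hypothetical names, and only int, double and C-strings are handled.

```c
#include <stdio.h>

/* Hypothetical per-type formatters; each writes "name = value" into buf. */
static void format_int(char *buf, size_t n, const char *name, int v) {
    snprintf(buf, n, "%s = %d", name, v);
}
static void format_double(char *buf, size_t n, const char *name, double v) {
    snprintf(buf, n, "%s = %f", name, v);
}
static void format_cstring(char *buf, size_t n, const char *name, const char *v) {
    snprintf(buf, n, "%s = \"%s\"", name, v);
}

/* _Generic selects a formatter from the static type of expr without
   evaluating it; #expr stringifies the source text of the argument.
   The expression itself is evaluated once, in the call. */
#define FORMAT_EXPR(buf, n, expr) \
    _Generic((expr), \
        int: format_int, \
        double: format_double, \
        char *: format_cstring, \
        const char *: format_cstring)((buf), (n), #expr, (expr))
```

With int a = 2, b = 3, calling FORMAT_EXPR(buf, sizeof buf, a + b) fills buf with "a + b = 5". An expression of any other type fails to compile, which is stricter than LOG_EXPR's runtime "unknown type" message.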

The Macro, Line By Line

1 #define LOG_EXPR(_X_) do{\
2 	__typeof__(_X_) _Y_ = (_X_);\
3 	const char * _TYPE_CODE_ = @encode(__typeof__(_X_));\
4 	NSString *_STR_ = VTPG_DDToStringFromTypeAndValue(_TYPE_CODE_, &_Y_);\
5 	if(_STR_)\
6 		NSLog(@"%s = %@", #_X_, _STR_);\
7 	else\
8 		NSLog(@"Unknown _TYPE_CODE_: %s for expression %s in function %s, file %s, line %d", _TYPE_CODE_, #_X_, __func__, __FILE__, __LINE__);\
9 }while(0)
  1. The first and last lines are a way to put {}’s around the macro to prevent unintended effects. The do{}while(0) “loop” does nothing else.
  2. First evaluate the expression, _X_, given to LOG_EXPR once, and store the result in _Y_. We need to use typeof() (which has to be written __typeof__() to appease some versions of GCC) to figure out the type of _Y_.
  3. _TYPE_CODE_ is a C-string that describes the type of the expression we want to print out.
  4. Now we have enough information to call a function, VTPG_DDToStringFromTypeAndValue() to convert the expression’s value to a string. We pass it the _TYPE_CODE_ string, and the address of _Y_, which is a pointer, and has a known size. We can’t pass _Y_ directly, because depending on what _X_ is, it will have different types and could be of any size.
  5. VTPG_DDToStringFromTypeAndValue() returns nil if it can’t figure out how to convert a value to a string.
  6. Everything went well, so print the stringified expression, #_X_, and the string representing its value, _STR_.
  7. otherwise…
  8. The expression had a type we can’t handle, print out a verbose diagnostic message.
  9. See line 1.
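The do{}while(0) wrapper from lines 1 and 9 matters most when a multi-statement macro is used as the body of an if without braces. A small C sketch, using a hypothetical SWAP_INTS macro rather than LOG_EXPR itself:

```c
/* Hypothetical multi-statement macro. Wrapped in do{}while(0) it behaves
   like a single statement, so the trailing ';' at the call site is fine. */
#define SWAP_INTS(a, b) do { \
    int tmp_ = (a); \
    (a) = (b); \
    (b) = tmp_; \
} while (0)

/* Puts the smaller value first; returns 1 if a swap happened, else 0. */
static int smaller_first(int *x, int *y) {
    if (*x > *y)
        SWAP_INTS(*x, *y);   /* expands to exactly one statement */
    else
        return 0;
    return 1;
}
```

If SWAP_INTS were wrapped in bare braces instead, the ';' after the macro call would end the if statement and the else would no longer compile.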

The VTPG_DDToStringFromTypeAndValue() Function

See the source in VTPG_Common.m:

It’s derived from Dave Dribin’s function DDToStringFromTypeAndValue(), and is pretty straightforward: strcmp() the type-string, and if it matches a known type call a function, or use +[NSString stringWithFormat:], to turn the value into a string.

The First Step Towards Fixing Your Macro Problem is Admitting it…

So yeah, maybe I went a little wild with macros here…

But it took out some WET-ness of the original code, and prevents me from accidentally mixing up types in a long wall of ifs, e.g.,

else if (strcmp(typeCode, @encode(NSRect)) == 0)
{
    return NSStringFromRect(*(NSRange *)value);
}
else if (strcmp(typeCode, @encode(NSRange)) == 0)
{
    return NSStringFromRect(*(NSRange *)value);
}

If I were cool, I’d use NSDictionarys to map from the @encode-string to an appropriate format string or function pointer. This is conceptually cleaner; less error-prone than using macros; and almost certainly faster. Unfortunately, it gets a little tricky with functions, since I need to dereference value into the proper type.
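A table-driven version might look roughly like the following in plain C. This is a sketch under assumptions, not code from VTPG_Common.m: the table, format_value() and the format_* helpers are hypothetical, and the type codes "i", "d" and "*" are hard-coded as the strings @encode() produces for int, double and char *. Each helper does its own dereferencing, which is one way around the "proper type" problem.

```c
#include <stdio.h>
#include <string.h>

/* Every formatter receives a pointer to the value and casts it itself. */
typedef void (*formatter_fn)(char *buf, size_t n, const void *value);

static void format_int(char *buf, size_t n, const void *value) {
    snprintf(buf, n, "%d", *(const int *)value);
}
static void format_double(char *buf, size_t n, const void *value) {
    snprintf(buf, n, "%g", *(const double *)value);
}
static void format_cstring(char *buf, size_t n, const void *value) {
    snprintf(buf, n, "%s", *(const char *const *)value);
}

/* Assumed type codes: what @encode(int), @encode(double), @encode(char *) yield. */
static const struct {
    const char *type_code;
    formatter_fn format;
} formatters[] = {
    { "i", format_int },
    { "d", format_double },
    { "*", format_cstring },
};

/* Returns 1 and fills buf if the type code is known, otherwise returns 0. */
static int format_value(char *buf, size_t n, const char *type_code, const void *value) {
    for (size_t i = 0; i < sizeof formatters / sizeof formatters[0]; i++) {
        if (strcmp(type_code, formatters[i].type_code) == 0) {
            formatters[i].format(buf, n, value);
            return 1;
        }
    }
    return 0;
}
```

Each type code now appears exactly once, next to the one function that knows how to cast for it, so the copy-and-paste mix-up shown above has nowhere to hide.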

One final note from my testing: I could do away with the strcmp()s, because directly comparing @encode string pointers (e.g. if(typeCode == @encode(NSString*))) works. I don’t know if it will always work though, so relying on it strikes me as a profoundly Bad Idea. But maybe that bad idea will give someone a good idea.

Limitations

Arrays

C arrays generally muck things up. Casting to a pointer works around this:

char x[14] = "Hello, world!";
//LOG_EXPR(x); //error: invalid initializer
LOG_EXPR((char*)x); //prints fine

__func__

Because it is a static const char [], __func__ (and __FUNCTION__ or __PRETTY_FUNCTION__) need casting to char* to work with LOG_EXPR. Because logging out a function/method call is something I do frequently, I use the macro:

#define LOG_FUNCTION()	NSLog(@"%s", __func__)

long double (Leopard and older)

On older systems, LOG_EXPR won’t work with a long double value, because @encode(long double) gives the same result as @encode(double). This is a known issue with the runtime. The top-level LOG_EXPR macro could detect a long double with if((sizeof(_X_) == sizeof(long double)) && (_TYPE_CODE_ == @encode(double))). But I doubt this will ever be necessary.

I haven’t actually written any code that uses long double, because I use NSDecimal, or another base-10 number format, for situations that require more precision than a double.

Scaling and Frameworks

Growing LOG_EXPR to handle every type is a lot of work. I’ve only added types that I’ve actually needed to print. This has kept the code manageable, and seems to be working so far.

The biggest problem I have is how to deal with types that are in frameworks that not every project includes. Projects that use CoreLocation.framework need to be able to use LOG_EXPR to print out CoreLocation specific structs, like CLLocationCoordinate2D. But projects that don’t use CoreLocation.framework don’t have a definition of the CLLocationCoordinate2D type, so code to convert it to a string won’t compile. There are two ways I’ve tried to solve the problem:

Comment-out framework-specific code

This is pretty self-explanatory, I’ll fork VTPG_Common.m and un-comment-out code for types that my project needs to print. It works, but it’s drudgery. Programmers hate that.

Hardcode type info

The idea is to hard-code the string that @encode(SomeType) would evaluate to, and then (since we know how SomeType is laid out in memory) use casting and pointer-arithmetic to get at the fields.

For example:

//This is a hack to print out CLLocationCoordinate2D, without needing to #import <CoreLocation/CoreLocation.h>
//A CLLocationCoordinate2D is a struct made up of 2 doubles.
//We detect it by hard-coding the result of @encode(CLLocationCoordinate2D).
//We get at the fields by treating it like an array of doubles, which it is identical to in memory.
if(strcmp(typeCode, "{?=dd}")==0)//@encode(CLLocationCoordinate2D)
	return [NSString stringWithFormat:@"{latitude=%g,longitude=%g}",((double*)value)[0],((double*)value)[1]];

This Just Works in a project that includes CoreLocation, and doesn’t mess up projects that don’t. Unfortunately it’s horribly brittle. Any Xcode or system update could break it. It’s not a tenable fix.
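The same hack can be tried out in plain C without any framework at all. In this sketch struct coordinate is a hypothetical stand-in for CLLocationCoordinate2D (two doubles), and format_coordinate() is an illustrative name, not a function from VTPG_Common.m.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in with the same layout: two consecutive doubles. */
struct coordinate {
    double latitude;
    double longitude;
};

/* Recognizes the hard-coded type code and reads the fields through a
   double pointer, never naming the struct type. Returns 0 if the code
   does not match. */
static int format_coordinate(char *buf, size_t n, const char *type_code, const void *value) {
    if (strcmp(type_code, "{?=dd}") != 0)
        return 0;
    const double *fields = value;   /* relies on the two-double layout */
    snprintf(buf, n, "{latitude=%g,longitude=%g}", fields[0], fields[1]);
    return 1;
}
```

The brittleness is visible here too: any other anonymous struct of two doubles produces the same "{?=dd}" code and would be printed with latitude and longitude labels.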

Areas for Improvement

If there’s some type LOG_EXPR can’t handle that you need, please jump right in and improve it!

When I have time, I plan to write a general parser for @encode()-strings. This will let me print out any struct, which mostly solves the type-defined-in-missing-framework problem, and would let LOG_EXPR Just Work with types from all kinds of POSIX/C libraries.

Using LOG_EXPR() in Your Project

Download VTPG_Common.m and VTPG_Common.h from my github repository, and add them to your Xcode project.

Now just add the line #import "VTPG_Common.h" to your prefix file (named <ProjectName>_Prefix.pch by default), after the #ifdef __OBJC__, for example:

#ifdef __OBJC__
    #import <Foundation/Foundation.h>
    // maybe other files, depending on project  template...
    #import "VTPG_Common.h"
#endif

Now LOG_EXPR() will work everywhere in your project.

July 19, 2010

#define String

When I need a string-constant, I #define it, instead of doing the “right” thing and using an extern const NSString * variable.

UPDATE 2010-07-20

Thanks to Elfred Pagen for pointing out that you should always put () around your macros. Wrong: #define A_STRING @"hello"

instead use (), even when you don’t think you have to:

#define A_STRING (@"hello")

This prevents accidental string concatenation. In C, string-literals separated only by whitespace are implicitly concatenated. It’s the same with Objective-C string literals. This feature lets you break long strings up into several lines, so NSString *x = @"A long string!" can be rewritten:

NSString *x =
	@"A long"
	@" string!";

Unfortunately, this seldom-used feature can backfire in unexpected ways. Consider making an array of two strings:

#define X @"ex"
#define P @"plain"
a = [NSArray arrayWithObjects:X
                              P,
                              nil];

That looks right, but I forgot a “,” after X, so after string-concatenation, a is ['explain'], not ['ex','plain'].

Moral of the story: you can never have too many ()’s in macros.
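The same trap exists with ordinary C string literals, which makes it easy to reproduce outside Cocoa. A small sketch (the array names and the COUNT helper are illustrative only):

```c
#include <stddef.h>

#define X "ex"
#define P "plain"
/* The comma between X and P is missing, so the two literals silently
   merge into the single string "explain". */
static const char *const merged[] = { X P };

#define SAFE_X ("ex")
#define SAFE_P ("plain")
/* With the parentheses, leaving out the comma below would be a compile
   error instead of a silent concatenation. */
static const char *const separate[] = { SAFE_X, SAFE_P };

/* Number of elements in a true array (not a pointer). */
#define COUNT(array) (sizeof(array) / sizeof((array)[0]))
```

COUNT(merged) is 1 and COUNT(separate) is 2.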

And, now, back to why I use #define

It’s less code

Using an extern variable means declaring it in a header, and defining it in some implementation file. But a macro is just one line in a header.

It’s faster to look up

Because there’s only the definition of a macro, Open Quickly/command-double-clicking a macro always jumps to the definition, so you can see what its value is in one step. Generally Xcode jumps to a symbol’s declaration first, and then its definition, making it slower to look up the value of a const symbol.

It’s still type safe

An @"NSString literal" has type information, so mistakes like,

#define X (@"immutable string")
NSMutableString *y = X;
[y appendString:@"z"];

still generate warnings.

It lets the compiler check format-strings

Xcode can catch errors like [NSString stringWithFormat:@"reading garbage since there's no argument: %s"], if you let it. Unfortunately, the Objective-C compiler isn’t smart enough to check [NSString stringWithFormat:externConstString,x,y,z]; because it doesn’t know what an extern variable contains until link-time. But preprocessor macros are evaluated early enough in the build process that the compiler can check their values.

It can’t be changed at runtime

It’s possible to change the value of const variables through pointers, like so:

const NSString* const s = @"initial";
NSString **hack = &s;
*hack = @"changed!";
NSLog(s);//prints "changed!"

Yes this is pathological code, but I’ve seen it happen (I’m looking at you AddressBook.framework!)

Of course, you can re-#define a preprocessor-symbol, so macros aren’t a panacea for pathological constant-changing code. (Nothing is!) But they push the pathology into compile time, and common wisdom is that it’s easier to debug compile-time problems, so that’s a Good Thing. You may disagree there, and you may be right! All I can say for sure is that in my experience, I’ve had bugs from const values changing at runtime, but no bugs from re-#define-ed constants (yet).

Conclusion

Preprocessor macros are damnably dangerous in C. Generally you should avoid them. But for NSString* constants in applications, I think they’re easier, and arguably less error prone. So go ahead and #define YOUR_STRING_CONSTANTS (@"like this").

May 24, 2010

Never Name a Variable “Index”

Filed under: Bug Bite, C++, Cocoa, Objective-C, Programming
― Vincent Gable on May 24, 2010

Never name a variable index, especially in C.

Instead say what it indexes. For example, if it is used to index an array of Foo objects, call it fooArrayIndex, or currentFooIndex.

If the index variable is just used to enumerate over a collection of objects (e.g. for(int i = 0; i < arraySize; i++){…}), then iterate smarter, using a simpler construct that doesn’t require declaring auxiliary variables. (E.g., in Objective-C use Fast Enumeration.) It’s not always possible to do this, but it’s always a good idea to try. [1]

Why index is Especially Bad in C

The standard strings.h header declares a function named index, that finds the first occurrence of a character in a C-string. In practical terms every C program will have the index function declared everywhere.

But when a variable is declared with the name index it shadows the function — meaning the local variable named index takes over the name index, so the function can’t be called anymore:

char * world = index("Hello, World", 'W');
NSLog(@"'%s'", world);

Prints “‘World'”, but

int index = 0;
char * world = index("Hello, World", 'W');
NSLog(@"'%s'", world);

Won’t compile, because an int isn’t a function.

Obviously this is a problem for code that uses the index() function — but honestly modern code probably uses a safer, unicode-aware string parsing function instead. What’s given me the most trouble is that shadowing index makes the compiler give lots of bogus warnings, if you have the useful GCC_WARN_SHADOW warning turned on.

There are other good reasons as well, specific to Objective-C, which Peter Hosey covers.

[1] If you really can’t think of a better name than “index”, I prefer the more terse i. It sucks, but at least it’s shorter. Brevity is a virtue.

December 25, 2009

A C &Puzzler[]

Filed under: Announcement, Bug Bite, C++, Objective-C, Programming
― Vincent Gable on December 25, 2009

Here’s a C-puzzler for you!

Given this function,

void foo(char* s){
	printf("s is at: %p\n s is: '%s'\n", s, s);
}

and that

char s[] = "Joy!";
foo(s);

prints out

s is at: 0xbffff46b
 s is: 'Joy!'

what will this next line print?

foo(&s); //WHAT WILL THIS DO?

Pick all that apply:

  1. Print “Joy!”
  2. Print garbage
  3. Print the same address for s
  4. Print a different address for s
  5. Crash
  6. Go into an Infinite loop

Answer

Answer: one and three

Yeah, it’s not what I expected either, especially since:

@encode(__typeof__(s)) = [5c]
@encode(__typeof__(&s)) = ^[5c]

In fact, all of these are equivalent (modulo type warnings):

foo(s);
foo(&s[0]);
foo(&(*s));
foo(&s);

Explanation.
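The short version can be checked directly in C: s, &s and &s[0] all hold the same address, and what differs is the type, which shows up in pointer arithmetic. A minimal sketch (the function names are illustrative):

```c
#include <stddef.h>

/* Returns 1 when s, &s and &s[0] all hold the same address. */
static int addresses_match(void) {
    char s[] = "Joy!";
    return (void *)s == (void *)&s && (void *)s == (void *)&s[0];
}

/* s decays to char *, so s + 1 steps over a single char. */
static ptrdiff_t step_of_s(void) {
    char s[] = "Joy!";
    return (char *)(s + 1) - (char *)s;
}

/* &s has type char (*)[5], so &s + 1 steps over the whole array. */
static ptrdiff_t step_of_address_of_s(void) {
    char s[] = "Joy!";
    return (char *)(&s + 1) - (char *)&s;
}
```

So foo(&s) prints the same thing as foo(s) because the address is identical; only the pointer type, and therefore the compiler warning, changes.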

October 20, 2009

JavaScript Nailed ||

One thing about JavaScript I really like is that its ||, the Logical Or operator, is really a more general ‘Eval Until True‘ operation. (If you have a better name for this operation, please leave a comment!) It’s the same kind of or operator used in Lisp. And I believe it’s the best choice for a language to use.

In C/C++, a || b is equivalent to,

  if a evaluates to a non-zero value:
    return true;
  if b evaluates to a non-zero value:
    return true;
  otherwise:
    return false;

Note that if a can be converted to true, then b is not evaluated. Importantly, in C/C++ || always yields a plain truth value: a bool in C++, and an int equal to 0 or 1 in C.

But the JavaScript || returns the value of the first operand that can be converted to true, or the last operand if neither can be interpreted as true,

  if a evaluates to a truthy value:
    return a;
  otherwise:
    return b;
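For contrast, here is roughly what the two behaviours look like from the C side. The helper names are hypothetical; the point is only that C's || collapses its operands to 0 or 1, so the JavaScript-style "first usable value" has to be spelled with ?: instead.

```c
#include <stddef.h>

/* In C the result of || is always the int 0 or 1, never an operand. */
static int logical_or(int a, int b) {
    return a || b;
}

/* Closest C spelling of JavaScript's `a || b` for pointers:
   return a if it is non-NULL, otherwise b. */
static const char *first_non_null(const char *a, const char *b) {
    return a ? a : b;
}
```

logical_or(5, 0) is 1, not 5, while first_non_null(NULL, "Player 1") returns "Player 1".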

Concise

JavaScript’s || is some sweet syntactic sugar.

We can write,

return playerName || "Player 1";

instead of,

return playerName ? playerName : "Player 1";

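And simplify assert-like code in a Perl-esque way. (In JavaScript throw is a statement, so it can’t appear directly after ||; fail() here stands for a hypothetical helper function that throws its argument.)

x || fail("x was unexpectedly null!");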

It’s interesting that a more concise definition of || allows more concise code, even though intuitively we’d expect a more complex || to “do more work for us”.

General

Defining || to return values, not true/false, is much more useful for functional programming.

The short-circuit-evaluation is powerful enough to replace if-statements. For example, the familiar factorial function,

function factorial(n){
	if(n == 0) return 1;
	return n*factorial(n-1);
}

can be written in JavaScript using && and || expressions,

function factorial2(n){ return n * (n && factorial2(n-1)) || 1;}

Yes, I know this isn’t the clearest way to write a factorial, and it would still be an expression if it used ?:, but hopefully this gives you a sense of what short-circuiting operations can do.

Unlike ?:, the two-argument || intuitively generalizes to n arguments, equivalent to a1 || a2 || ... || an. This makes it even more useful for dealing with abstractions.

Logical operators that return values, instead of simply booleans, are more expressive and powerful, although at first they may not seem useful — especially coming from a language without them.

October 19, 2009

sizeof() Style

Filed under: Bug Bite, C++, Objective-C, Programming, Tips
― Vincent Gable on October 19, 2009

Never say sizeof(sometype) when you can say sizeof(a_variable). The latter works even if the type of a_variable changes, and it is much more obvious what the size is supposed to represent.
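A short C sketch of the difference, using a hypothetical struct sample; neither function needs to change if the struct gains fields or the pointer's type is changed later.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct sample {
    double value;
    int count;
};

/* sizeof *sample tracks the pointee type automatically; writing
   sizeof(struct sample) here would have to be kept in sync by hand. */
static void clear_sample(struct sample *sample) {
    memset(sample, 0, sizeof *sample);
}

/* Same idea for allocation: the size is derived from the variable.
   The caller is responsible for free()ing the result. */
static struct sample *alloc_samples(size_t count) {
    struct sample *samples = malloc(count * sizeof *samples);
    return samples;
}
```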

October 16, 2009

Hack: Counting Variadic Arguments in C

This isn’t practical, but I think it’s neat that it’s doable in C99. The implementation I present here is incomplete and for illustrative purposes only.

Background

C’s implementation of variadic functions (functions that take a variable-number of arguments) is characteristically bare-bones. Even though the compiler knows the number, and type, of all arguments passed to variadic functions, there isn’t a mechanism for the function to get this information from the compiler. Instead, programmers need to pass an extra argument, like the printf format-string, to tell the function “these are the arguments I gave you”. This has worked for over 37 years. But it’s clunky — you have to write the same information twice, once for the compiler and again to tell the function what you told the compiler.

Inspecting Arguments in C

Argument Type

I don’t know of a way to find the type of the Nth argument to a variadic function, called with heterogeneous types. If you can figure out a way, I’d love to know. The typeof extension is often sufficient to write generic code that works when every argument has the same type. (C++ templates also solve this problem if we step outside of C-proper.)

Argument Count (The Good Stuff Starts Here)

By using variadic macros, and stringification (#), we can actually pass a function the literal string of its argument list from the source code — which it can parse to determine how many arguments it was given.

For example, say f() is a variadic function. We create a variadic wrapper macro, F() and call it like so in our source code,

x = F(a,b,c);

The preprocessor expands this to,

x = f("a,b,c",a,b,c)

Or perhaps,

x = f(count_arguments("a,b,c"),a,b,c)

where count_arguments(char *s) returns the number of arguments in the source-code string s. (Technically s would be an argument-expression-list).

Example Code

Here’s an implementation of iArray(), an array-builder for int values, very much like JavaScript’s Array() constructor. Unlike the quirky JavaScript Array(), iArray(3) returns an array containing just the element 3, [3], not an uninitialized array with 3 elements, [undefined, undefined, undefined]. Another difference: iArray(), invoked with no arguments, is invalid, and will not compile.

#define iArray(...) alloc_ints(count_arguments(#__VA_ARGS__), __VA_ARGS__)

This macro is pretty straightforward. It’s given a variable number of arguments, represented by __VA_ARGS__ in the expansion. #__VA_ARGS__ turns the code into a string so that count_arguments can analyze it. (If you were doing this for real, you should use two levels of stringification though, otherwise macros won’t be fully expanded. I chose to keep things “demo-simple” here.)
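The two-level stringification mentioned in the parenthetical is the usual preprocessor idiom sketched below. The macro names STRINGIFY_RAW, STRINGIFY and LIMIT are illustrative, not part of the iArray code.

```c
/* One level: the argument is turned into a string before it is expanded. */
#define STRINGIFY_RAW(x) #x

/* Two levels: the argument is macro-expanded first, then stringified. */
#define STRINGIFY(x) STRINGIFY_RAW(x)

#define LIMIT 42
```

STRINGIFY_RAW(LIMIT) yields "LIMIT", while STRINGIFY(LIMIT) yields "42". For iArray this matters when an argument is itself a macro whose expansion contains commas.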

unsigned count_arguments(char *s){
	unsigned i, argc = 1;
	for(i = 0; s[i]; i++)
		if(s[i] == ',')
			argc++;
	return argc;
}

This is a dangerously naive implementation and only works correctly when iArray() is given a straightforward non-empty list of values or variables. Basically it’s the least code I could write to make a working demo.

Since iArray must have at least one argument to compile, we just count the commas in the argument-list to see how many other arguments were passed. Simple to code, but it fails for more complex expressions like f(a,g(b,c)).
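One small step up from counting every comma is to track bracket depth and count only top-level commas. The following is a sketch of that idea with a hypothetical name, count_top_level_arguments(); it is still naive, since commas or brackets inside string or character literals will fool it.

```c
/* Counts arguments in a non-empty argument-list string by counting only
   the commas that are not nested inside (), [] or {}.
   Still naive: string and character literals are not recognized. */
static unsigned count_top_level_arguments(const char *s) {
    unsigned argc = 1;
    int depth = 0;
    for (; *s != '\0'; s++) {
        if (*s == '(' || *s == '[' || *s == '{')
            depth++;
        else if (*s == ')' || *s == ']' || *s == '}')
            depth--;
        else if (*s == ',' && depth == 0)
            argc++;
    }
    return argc;
}
```

For the string "a,g(b,c)" this returns 2, where the comma-counting version returns 3.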

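#include <stdarg.h>
#include <stdlib.h>

// Returns a malloc()ed array holding the count ints that follow; the caller must free() it.
int *alloc_ints(unsigned count, ...){
	unsigned i = 0;
	int *ints = malloc(sizeof(int) * count);
	va_list args;
	va_start(args, count);
	for(i = 0; i < count; i++)
		ints[i] = va_arg(args,int);
	va_end(args);
	return ints;
}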

Just as you'd expect, this code allocates enough memory to hold count ints, and fills it with the remaining count arguments. Bad things happen if fewer than count arguments are passed, or they are the wrong type.

Download the code, if you like.

Parsing is Hard, Let's Go Shopping

I didn't even try to correctly parse any valid argument-expression-list in count_arguments. It's non-trivial. I'd rather deal with choosing the correct MAX3 or MAX4 macro in a few places than maintain such a code base.

So this kind of introspection isn't really practical in C. But it's neat that it can be done, without any tinkering with the compiler or language.
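For the special case where every argument is an int, there is a different, well-known C99 trick that sidesteps string parsing: let the compiler size a compound-literal array. This swaps in another technique than the one described above; COUNT_INTS and add are hypothetical names used for illustration.

```c
#include <stddef.h>

/* Builds an unnamed int array from the arguments and divides its size by
   the element size. sizeof does not evaluate its operand here, so the
   arguments are only counted, never executed. */
#define COUNT_INTS(...) (sizeof((int[]){__VA_ARGS__}) / sizeof(int))

/* Helper used to show that a nested call counts as one argument. */
static int add(int x, int y) {
    return x + y;
}
```

COUNT_INTS(1, 2, 3) is 3 and COUNT_INTS(a, add(b, 3)) is 2, because the compiler, not a hand-written parser, decides where each argument ends. It does not help with heterogeneous argument types.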

October 12, 2009

Don’t Check malloc()

Filed under: C++, Cocoa, iPhone, MacOSX, Objective-C, Programming, Quotes, Tips
― Vincent Gable on October 12, 2009

There’s no point in trying to recover from a malloc failure on OS X, because by the time you detect the failure and try to recover, your process is likely to already be doomed. There’s no need to do your own logging, because malloc itself does a good job of that. And finally there’s no real need to even explicitly abort, because any malloc failure is virtually guaranteed to result in an instantaneous crash with a good stack trace.

Mike Ash

This is excellent advice. Peppering your code with if statements harms readability and simplicity.

It’s still a good idea to check large (many MB) mallocs, but I can’t imagine recovering gracefully from a situation where 32-byte memory allocations are failing on a modern desktop.

May 14, 2009

Emergent Libraries

I have latched onto an idea, but don’t have the resources to follow up on it: could a static-analysis tool identify repeated patterns of code, across many code bases, that should be extracted out as subroutines and higher-level functions? How universal would these “emergent libraries” be?

My inspiration here is Section 4.1 Identifying Common Functions, in the excellent paper Some Thoughts on Security After Ten Years of qmail 1.0 (PDF), by Daniel J. Bernstein,

Most programmers would never bother to create such a small function. But several words of code are saved whenever one occurrence of the dup2()/close() pattern is replaced with one call to fd_move(); replacing a dozen occurrences saves considerably more code than were spent writing the function itself. (The function is also a natural target for tests.) The same benefit scales to larger systems and to a huge variety of functions; fd_move() is just one example. In many cases an automated scan for common operation sequences can suggest helpful new functions, but even without automation I frequently find myself thinking “Haven’t I seen this before?” and extracting a new function out of existing code.
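For readers who have not seen it, a function in the spirit of the fd_move() described in the quote could be as small as the sketch below. This is a reconstruction from the dup2()/close() pattern the quote mentions, written for a POSIX system; it is not copied from the qmail sources.

```c
#include <unistd.h>

/* Makes descriptor `to` refer to whatever `from` refers to, then closes
   `from`. Returns 0 on success and -1 if dup2() fails. */
static int fd_move(int to, int from) {
    if (to == from)
        return 0;              /* nothing to do, and closing would lose the fd */
    if (dup2(from, to) == -1)
        return -1;
    close(from);
    return 0;
}
```

The to == from check is exactly the kind of detail that tends to be forgotten when the two-call pattern is repeated by hand.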

What’s particularly fascinating to me are the new operations we might find.

Before I was exposed to the Haskell prelude I hadn’t known about the fundamentally useful foldl and foldr operations. I had written dozens of programs that used accumulation, but its generalization hadn’t occurred to me — and probably never would have. Static analysis can help uncover generalizations that we might have missed, or didn’t think were important, but turn out in practice to be widely used operations.

April 19, 2009

Beware rangeOf NSString Operations

Filed under: Bug Bite, iPhone, Objective-C, Programming, Sample Code
― Vincent Gable on April 19, 2009

I have repeatedly had trouble with the rangeOf NSString methods, because they return a struct. Going forward I will do more to avoid them; here are some ways I plan to do it.

Sending a message that returns a struct to nil can “return” undefined values. With small structs like NSRange, you are more likely to get {0} on Intel, compared to PowerPC and iPhone/ARM. Unfortunately, this makes nil-messaging bugs hard to detect. In my experience you will miss them when running on the simulator, even if they are 100% reproducible on an actual iPhone.

This category method has helped me avoid using -rangeOfString: dangerously,

@implementation NSString  (HasSubstring)
- (BOOL) hasSubstring:(NSString*)substring;
{
	if(IsEmpty(substring))
		return NO;
	NSRange substringRange = [self rangeOfString:substring];
	return substringRange.location != NSNotFound && substringRange.length > 0;
}
@end

I choose to define [aString hasSubstring:@""] as NO. You might prefer to throw an exception, or differentiate between @"" and nil. But I don’t think a nil string is enough of an error to throw an exception. And even though technically any string contains the empty string, I generally treat @"" as semantically equivalent to nil.

As I’ve said before,

A few simple guidelines can help you avoid my misfortune:

  • Be especially careful using any Objective-C method that returns a double, struct, or long long
  • Don’t write methods that return a double, struct, or long long. Return an object instead of a struct; an NSNumber* or float instead of a double or long long. If you must return a dangerous data type, then see if you can avoid it. There really isn’t a good reason to return a struct, except for efficiency. And when micro-optimizations like that matter, it makes more sense to write that procedure in straight-C, which avoids the overhead of Objective-C message-passing, and solves the undefined-return-value problem.
  • But if you absolutely must return a dangerous data type, then return it in a parameter. That way you can give it a default value of your choice, and won’t have undefined values if an object is unexpectedly nil.
    Bad:
    - (struct CStructure) evaluateThings;
    Good:
    - (void) resultOfEvaluatingThings:(struct CStructure*)result;

It’s not a bad idea to wrap up all the rangeOf methods in functions or categories that play safer with nil.
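The "straight-C, result in a parameter" shape suggested in the guidelines can be sketched as follows. C has no nil-messaging, so this is only an analogy: a NULL input plays the role of the nil receiver, and struct text_range, TEXT_NOT_FOUND and find_substring() are hypothetical names, not Foundation API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical analogue of NSRange / NSNotFound for C strings. */
struct text_range {
    size_t location;
    size_t length;
};
#define TEXT_NOT_FOUND ((size_t)-1)

/* Writes the range of substring within text into *result. The default
   "not found" value is stored first, so the caller never sees garbage,
   even when text or substring is NULL or substring is empty. */
static void find_substring(const char *text, const char *substring,
                           struct text_range *result) {
    result->location = TEXT_NOT_FOUND;
    result->length = 0;
    if (text == NULL || substring == NULL || substring[0] == '\0')
        return;
    const char *hit = strstr(text, substring);
    if (hit != NULL) {
        result->location = (size_t)(hit - text);
        result->length = strlen(substring);
    }
}
```

As in hasSubstring: above, the empty string is treated as "not found".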

Thanks to Greg Parker for corrections!
