Vincent Gable’s Blog

October 16, 2009

Hack: Counting Variadic Arguments in C

This isn’t practical, but I think it’s neat that it’s doable in C99. The implementation I present here is incomplete and for illustrative purposes only.

Background

C’s implementation of variadic functions (functions that take a variable-number of arguments) is characteristically bare-bones. Even though the compiler knows the number, and type, of all arguments passed to variadic functions; there isn’t a mechanism for the function to get this information from the compiler. Instead, programmers need to pass an extra argument, like the printf format-string, to tell the function “these are the arguments I gave you”. This has worked for over 37 years. But it’s clunky — you have to write the same information twice, once for the compiler and again to tell the function what you told the compiler.

Inspecting Arguments in C

Argument Type

I don’t know of a way to find the type of the Nth argument to a varadic function, called with heterogeneous types. If you can figure out a way, I’d love to know. The typeof extension is often sufficient to write generic code that works when every argument has the same type. (C++ templates also solve this problem if we step outside of C-proper.)

Argument Count (The Good Stuff Starts Here)

By using variadic macros, and stringification (#), we can actually pass a function the literal string of its argument list from the source code — which it can parse to determine how many arguments it was given.

For example, say f() is a variadic function. We create a variadic wrapper macro, F() and call it like so in our source code,

x = F(a,b,c);

The preprocessor expands this to,

x = f("a,b,c",a,b,c)

Or perhaps,

x = f(count_arguments("a,b,c"),a,b,c)

where count_arguments(char *s) returns the number of arguments in the string source-code string s. (Technically s would be an argument-expression-list).

Example Code

Here’s an implementation for, iArray(), an array-builder for int values, very much like JavaScript’s Array() constructor. Unlike the quirky JavaScript Array(), iArray(3) returns an array containing just the element 3, [3], not an uninitilized array with 3 elements, [undefined, undefined, undefined]. Another difference: iArray(), invoked with no arguments, is invalid, and will not compile.

#define iArray(...) alloc_ints(count_arguments(#__VA_ARGS__), __VA_ARGS__)

This macro is pretty straightforward. It’s given a variable number of arguments, represented by __VA_ARGS__ in the expansion. #__VA_ARGS__ turns the code into a string so that count_arguments can analyze it. (If you were doing this for real, you should use two levels of stringification though, otherwise macros won’t be fully expanded. I choose to keep things “demo-simple” here.)

unsigned count_arguments(char *s){
	unsigned i,argc = 1;
		for(i = 0; s[i]; i++)
			if(s[i] == ',')
				argc++;
	return argc;
}

This is a dangerously naive implementation and only works correctly when iArray() is given a straightforward non-empty list of values or variables. Basically it’s the least code I could write to make a working demo.

Since iArray must have at least one argument to compile, we just count the commas in the argument-list to see how many other arguments were passed. Simple to code, but it fails for more complex expressions like f(a,g(b,c)).

int *alloc_ints(unsigned count, ...){
	unsigned i = 0;
	int *ints = malloc(sizeof(int) * count);
	va_list args;
    va_start(args, count);
	for(i = 0; i < count; i++)
		ints[i] = va_arg(args,int);
	va_end(args);
	return ints;
}

Just as you'd expect, this code allocates enough memory to hold count ints, and fills it with the remaining count arguments. Bad things happen if < count arguments are passed, or they are the wrong type.

Download the code, if you like.

Parsing is Hard, Let's Go Shopping

I didn't even try to correctly parse any valid argument-expression-list in count_arguments. It's non trivial. I'd rather deal with choosing the correct MAX3 or MAX4 macro in a few places than maintain such a code base.

So this kind of introspection isn't really practical in C. But it's neat that it can be done, without any tinkering with the compiler or language.

May 31, 2009

The Best Quicksort Ever

Filed under: Design, Programming, Sample Code — Tags: , , , , — Vincent Gable @ 1:59 am

The first time I saw this quicksort in Haskell was an eye opening moment,

qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

The first line reads: “When you sort an empty list ([]), the result is another empty list”. The second line reads: “To sort a list whose first element is named x and the rest of which is named xs, sort the elements of xs that are less than x, sort the elements of xs that are greater than or equal to x, and concatenate (++) the results, with x sandwiched in the middle.”

The code is so concise, yet clear (even with cryptic variable names like xs). The day my professor wrote it on the whiteboard was the first time I internalized that there might be something good about the alien world of functional programming.

The biggest revelation to me was, filter (< x) xs . It’s amazing that (< x) builds a temporary, unnamed, function equivalent to C++,

bool lessThanX(AnyOrderedType y){
    return y < X;
}
//plus the definition of X somewhere...

Building a function! It’s still profound stuff to me. And clearly it really uses fewer lines of code. Amazingly, using list comprehension syntax is not more concise,

qsort (x:xs) = qsort [i | i <- xs, i < x] ++ [x] ++ qsort [i | i <- xs, i <= x]

compared to the original,

qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

I hope this is an ah-ah moment for someone else too. I’d love to know what made you interested in functional programming; or if you know of a more elegant quicksort implementation.

May 19, 2009

Improving Twitter.com: Space to Work

Filed under: Design, Sample Code, Tips, Usability — Tags: , , , , , — Vincent Gable @ 2:14 pm

The Change

Enlarge the “What are you doing” box on Twitter.com, to make compressing substantial ideas easier.

Twitter.com with a larger text-field

Motivation

I’ve been disappointed with the posting interface of every Twitter-client I’ve tried so far. Just like any writing, tweets start with a first draft. My first drafts are often longer than 140 characters. That shouldn’t be a problem; trimming the fat is part of any editing process. But most Twitter-interfaces are so downright hostile to anything longer then 140 characters that trimming a 145 letter utterance is a frustrating study in fighting my tools.

(The worst client I tried was, Blogo, which would stop you from typing and yell at you with a dialog if you dared press another key after typing 140 characters. But Twitterrific was little better; I don’t understand how something so user-unfriendly became so popular.)

Even Twitter.com doesn’t give you enough room for writing a long, but under-the-limit tweet. To see for yourself, just start typing “mmmmm”; the box will run out of room before you run out of characters. It’s downright crazy to have to scroll to see all of a tweet you are writing.

Now there’s nothing wrong with trying to prescribe a pithy style of communication. Clearly Twitter wouldn’t have worked otherwise. But punishing users for doing the “wrong” thing isn’t as effective as giving them the tools to change their behavior, to wit: space to work on shortening their writing.

The Code

This CSS code makes the direct-messaging, and “what are you doing?” text-boxes tall enough to hold 5 lines of text without scrolling. By default Twitter’s web interface only holds 2 lines of text on screen.

#dm_update_box #direct_message_form fieldset div.info textarea#text,
#status_update_box #status_update_form fieldset div.info textarea#status {
	height: 6em !important;
}

The selectors I used are pretty specific to Twitter.com, so it’s unlikely this will interfere with another site’s layout, unless it’s HTML code is nearly identical to Twitter’s.

How-To: Safari

Copy the above code into a .css file, (“CustomSafari.css” is what I called mine) then select that file in Safari -> Preferences -> Advanced -> Style sheet:
safariStyleSheet.png

After restarting Safari, Twitter’s web interface should give you room to work.

May 15, 2009

Concise NSDictionary and NSArray Lookup

Filed under: Objective-C, Programming, Research, Sample Code, Usability — Tags: , , — Vincent Gable @ 2:13 pm

I started writing a list of ways I thought Objective-C could be improved, and I realized that many of my wishes involved more compact syntax. For example [array objectAtIndex:1] is so verbose I think it diminishes readability, compared to array[1].

I can’t quite match that brevity (can you, by using Objective-C++?), but with a one-line category, you can say, x = [array:1];.

@interface NSArray (ConciseLookup)
- (id):(NSUInteger)index;
@end
@implementation NSArray (ConciseLookup)
- (id):(NSUInteger)index;
{
	return [self objectAtIndex:index];
}
@end

My question is: do you find this compact “syntax” useful at all, or is it added complexity with no substantial code compression? Personally I think the latter, but the number of wishes I had involving more concise Objective-C syntax makes me wonder…

April 24, 2009

How To Check if an Application is Running With AppleScript

Filed under: Programming, Sample Code — Tags: , — Vincent Gable @ 7:46 pm
on ApplicationIsRunning(appName)
	tell application "System Events" to set appNameIsRunning to exists (processes where name is appName)
	return appNameIsRunning
end ApplicationIsRunning

Use it like,

if ApplicationIsRunning("Mail") then
	display dialog "Don't forget to write mom!"
end if

On Mac OS X 10.5, this worked for me even when the application-file’s name was changed. On 10.4 I do not expect that it would still work if someone renamed the application, unless you used the creator type to locate the process, not the name.

You might also be interested in how to get the name of the frontmost application, here’s how:

tell application "System Events" to set FrontAppName to name of first process where frontmost is true
if FrontAppName is "DVD Player" then
	display dialog "Get to work!"
end if

April 22, 2009

-[NSURL isEqual:] Gotcha

Filed under: Bug Bite, Cocoa, MacOSX, Programming, Sample Code, iPhone — Tags: , , , , , , — Vincent Gable @ 7:05 pm

BREAKING UPDATE: Actually comparing the -absoluteURL or -absoluteString of two NSURLs that represent a file is not good enough. One may start file:///, and the other file://localhost/, and they will not be isEqual:! A work around is to compare the path of each NSURL. I’m still looking into the issue, but for now I am using the following method to compare NSURLs.

@implementation NSURL (IsEqualTesting)
- (BOOL) isEqualToURL:(NSURL*)otherURL;
{
	return [[self absoluteURL] isEqual:[otherURL absoluteURL]] ||
	[self isFileURL] && [otherURL isFileURL] &&
	([[self path] isEqual:[otherURL path]]);
}
@end

[a isEqual:b] may report NO for two NSURLs that both resolve to the same resource (website, file, whatever). So compare NSURLs like [[a absoluteString] isEqual:[b absoluteString]]. It’s important to be aware of this gotcha, because URLs are Apple’s preferred way to represent file paths, and APIs are starting to require them. Equality tests that worked for NSString file-paths may fail with NSURL file-paths.

The official documentation says

two NSURLs are considered equal if they both have the same base baseURL and relativeString.

Furthermore,

An NSURL object is composed of two parts—a potentially nil base URL and a string that is resolved relative to the base URL. An NSURL object whose string is fully resolved without a base is considered absolute; all others are considered relative.

In other words, two NSURL objects can resolve to the same absolute URL, but have a different base URL, and be considered !isEqual:.

An example should make this all clear,

NSURL *VGableDotCom = [NSURL URLWithString:@"http://vgable.com"];
NSURL *a = [[NSURL alloc] initWithString:@"blog" relativeToURL:VGableDotCom];
NSURL *b = [[NSURL alloc] initWithString:@"http://vgable.com/blog" relativeToURL:nil];
LOG_INT([a isEqual:b]);
LOG_INT([[a absoluteURL] isEqual:[b absoluteURL]]);
LOG_ID([a absoluteURL]);
LOG_ID([b absoluteURL]);

[a isEqual:b] = 0
[[a absoluteURL] isEqual:[b absoluteURL]] = 1
[a absoluteURL] = http://vgable.com/blog
[b absoluteURL] = http://vgable.com/blog

Remember that collections use isEqual: to determine equality, so you may have to convert an NSURL to an absoluteURL to get the behavior you expect, especially with NSSet and NSDictionary.

Getting the Current URL from a WebView

Filed under: Cocoa, MacOSX, Objective-C, Programming, Sample Code, iPhone — Tags: , , , — Vincent Gable @ 1:52 pm

UPDATED 2009-04-30: WARNING: this method will not always give the correct result. +[NSURL URLWithString:] requires it’s argument to have unicode characters %-escaped UTF8. But stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding will convert # to %23, so http://example.com/index.html#s1 would become http://example.com/index.html%23s1. Unfortunately, the two URLs are not equivalent. The un-%-escaped one refers to section #s1 in the file index.html, and the other tries to fetch the file index.html#s1 (“index dot html#s1″). I have not yet implemented a workaround, although I suspect one is possible, by building the NSURL out of bits of the JavaScript location object, rather then trying to convert the whole string.


UIWebView/WebView does not provide a way to find the URL of the webpage it is showing. But there’s a simple (and neat) way to get it using embedded JavaScript.

- (NSString *)stringByEvaluatingJavaScriptFromString:(NSString *)script

Is a deceptively powerful method that can execute dynamically constructed JavaScript, and lets you embed JavaScript snippets in Cocoa programs. We can use it to embed one line of JavaScript to ask a UIWebView for the URL it’s showing.

@interface UIWebView (CurrentURLInfo)
- (NSURL*) locationURL;
@end
@implementation UIWebView (CurrentURLInfo)
- (NSURL*) locationURL;
{
	NSString *rawLocationString = [self stringByEvaluatingJavaScriptFromString:@"location.href;"];
	if(!rawLocationString)
		return nil;
	//URLWithString: needs percent escapes added or it will fail with, eg. a file:// URL with spaces or any URL with unicode.
	locationString = [locationString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
	return [NSURL URLWithString:locationString]
}
@end

With the CurrentURLInfo category, you can do aWebView.locationURL to get the URL of the page a WebView is showing.


One last note, the

if(!rawLocationString)
	return nil;

special-case was only necessary, because [NSURL URLWithString:nil] throws an exception (rdar://6810626). But Apple has decided that this is correct behavior.

April 19, 2009

Beware rangeOf NSString Operations

Filed under: Bug Bite, Objective-C, Programming, Sample Code, iPhone — Tags: , , , , — Vincent Gable @ 10:42 am

I have repeatedly had trouble with the rageOfNSString methods, because they return a struct. Going forward I will do more to avoid them, here are some ways I plan to do it.

Sending a message that returns a struct to nil can “return” undefined values. With small structs like NSRange, you are more likely to get {0} on Intel, compared to PowerPC and iPhone/ARM. Unfortunately, this makes nil-messaging bugs hard to detect. In my experience you will miss them when running on the simulator, even if they are 100% reproducible on an actual iPhone.

This category method has helped me avoid using -rangeOfString: dangerously,

@implementation NSString  (HasSubstring)
- (BOOL) hasSubstring:(NSString*)substring;
{
	if(IsEmpty(substring))
		return NO;
	NSRange substringRange = [self rangeOfString:substring];
	return substringRange.location != NSNotFound && substringRange.length > 0;
}
@end

I choose to define [aString hasSubstring:@""] as NO. You might prefer to throw an exception, or differentiate between @"" and nil. But I don’t think a nil string is enough error to throw an exception. And even though technically any string contains the empty string, I generally treat @"" as semantically equivalent to nil.

As I’ve said before,

A few simple guidelines can help you avoid my misfortune:

  • Be especially careful using of any objective-C method that returns a double, struct, or long long
  • Don’t write methods that return a double, struct, orlong long. Return an object instead of a struct; an NSNumber* or float instead of a double or long long. If you must return a dangerous data type, then see if you can avoid it. There really isn’t a good reason to return a struct, except for efficiency. And when micro-optimizations like that matter, it makes more sense to write that procedure in straight-C, which avoids the overhead of Objective-C message-passing, and solves the undefined-return-value problem.
  • But if you absolutely must return a dangerous data type, then return it in a parameter. That way you can give it a default value of your choice, and won’t have undefined values if an object is unexpectedly nil.
    Bad:
    - (struct CStructure) evaluateThings;
    Good:
    - (void) resultOfEvaluatingThings:(struct CStructure*)result;.

It’s not a bad idea to wrap up all the rangeOf methods in functions or categories that play safer with nil.

Thanks to Greg Parker for corrections!

April 10, 2009

Percent Escapes Gotcha

Filed under: Bug Bite, Cocoa, Programming, Sample Code — Tags: , — Vincent Gable @ 10:25 am

If you use stringByAddingPercentEscapesUsingEncoding: more than once on a string, the resulting string will not decode correctly from just one call to stringByReplacingPercentEscapesUsingEncoding:. (stringByAddingPercentEscapesUsingEncoding: is not indempotent).

NSString *string = @"100%";
NSString *escapedOnce = [string stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSString *escapedTwice = [escapedOnce stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"%@ escaped once: %@, escaped twice: %@", string, escapedOnce, escapedTwice);

100% escaped once: 100%25, escaped twice: 100%2525

I thought I was programming defensively by eagerly adding percent-escapes to any string that would become part of a URL. But this caused some annoying bugs resulting form a string being percent-escaped more then once. My solution was to create an indempotent replacement for stringByAddingPercentEscapesUsingEncoding: (I also simplified things a little by removing the encoding parameter, because I never used any encoding other then NSUTF8StringEncoding),

@implementation NSString (IndempotentPercentEscapes)
- (NSString*) stringByReplacingPercentEscapesOnce;
{
	NSString *unescaped = [self stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
	//self may be a string that looks like an invalidly escaped string,
	//eg @"100%", in that case it clearly wasn't escaped,
	//so we return it as our unescaped string.
	return unescaped ? unescaped : self;
}

- (NSString*) stringByAddingPercentEscapesOnce;
{
	return [[self stringByReplacingPercentEscapesOnce] stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
}
@end

Usage example,

NSString *string = @"100%";
NSString *escapedOnce = [string stringByAddingPercentEscapesOnce];
NSString *escapedTwice = [escapedOnce stringByAddingPercentEscapesOnce];
NSLog(@"%@ escaped once: %@, escaped twice: %@", string, escapedOnce, escapedTwice);

100% escaped once: 100%25, escaped twice: 100%25

The paranoid have probably noticed that [aBadlyEncodedString stringByReplacingPercentEscapesOnce] will return aBadlyEncodedString not nil, This could make it harder to detect an error.

But it’s not something that I’m worried about for my application. Since I only ever use a UTF8 encoding, and it can represent any unicode character, it’s not possible to have an invalid string. But it’s certainly something to be aware of in situations where you might have strings with different encodings.

March 31, 2009

How To Write Cocoa Object Getters

Setters are more straightforward than getters, because you don’t need to worry about memory management.

The best practice is to let the compiler write getters for you, by using Declared Properties.

But when I have to implement a getter manually, I prefer the (to my knowledge) safest pattern,

- (TypeOfX*) x;
{
  return [[x retain] autorelease];
}

Note that by convention in Objective-C, a getter for the variable jabberwocky is simply called jabberwocky, not getJabberwocky.

Why retain Then autorelease

Basically return [[x retain] autorelease]; guarantees that what the getter returns will be valid for as long as any local objects in the code that called the the getter.

Consider,

NSString* oldName = [person name];
[person setName:@"Alice"];
NSLog(@"%@ has changed their name to Alice", oldName);

If -setName: immediately releasees the value that -name returned, oldName will be invalid when it’s used in NSLog. But if the implementation of [x name] used retain/autorelease, then oldName would still be valid, because it would not be destroyed until the autorelease pool around the NSLog was drained.

Also, autorelease pools are per thread; different threads have different autorelease pools that are drained at different times. retain/autorelease makes sure the object is on the calling thread’s pool.

If this cursory explanation isn’t enough, read Seth Willitis’ detailed explanation of retain/autorelease. I’m not going to explain it further here, because he’s done such a through job of it.

Ugly

return [[x retain] autorelease]; is more complicated, and harder to understand then a simple return x;. But sometimes that ugliness is necessary, and the best place to hide it is in a one-line getter method. It’s self documenting. And once you understand Cocoa memory management, it’s entirely clear what the method does. For me, the tiny readability cost is worth the safety guarantee.

Big

return [[x retain] autorelease]; increases peak memory pressure, because it can defer dealloc-ing unused objects until a few autorelease pools are drained. Honestly I’ve never measured memory usage, and found this to be a significant problem. It certainly could be, especially if the thing being returned is a large picture or chunk of data. But in my experience, it’s nothing to worry about for getters that return typical objects, unless there are measurements saying otherwise.

Slow

return [[x retain] autorelease]; is obviously slower then just return x;. But I doubt that optimizing an O(1) getter is going to make a significant difference to your application’s performance — especially compared to other things you could spend that time optimizing. So until I have data telling me otherwise, I don’t worry about adding an the extra method calls.

This is a Good Rule to Break

As I mentioned before, getters don’t need to worry about memory management. It could be argued that the return [[x retain] autorelease]; pattern is a premature optimization of theoretical safety at the expense of concrete performance.

Good programmers try to avoid premature optimization; so perhaps I’m wrong to follow this “safer” pattern. But until I have data showing otherwise, I like to do the safest thing.

How do you write getters, and why?

Older Posts »

Powered by WordPress