Vincent Gable’s Blog

April 22, 2009

-[NSURL isEqual:] Gotcha

Filed under: Bug Bite, Cocoa, MacOSX, Programming, Sample Code, iPhone — Tags: , , , , , , — Vincent Gable @ 7:05 pm

BREAKING UPDATE: Actually comparing the -absoluteURL or -absoluteString of two NSURLs that represent a file is not good enough. One may start file:///, and the other file://localhost/, and they will not be isEqual:! A work around is to compare the path of each NSURL. I’m still looking into the issue, but for now I am using the following method to compare NSURLs.

@implementation NSURL (IsEqualTesting)
- (BOOL) isEqualToURL:(NSURL*)otherURL;
{
	return [[self absoluteURL] isEqual:[otherURL absoluteURL]] ||
	[self isFileURL] && [otherURL isFileURL] &&
	([[self path] isEqual:[otherURL path]]);
}
@end

[a isEqual:b] may report NO for two NSURLs that both resolve to the same resource (website, file, whatever). So compare NSURLs like [[a absoluteString] isEqual:[b absoluteString]]. It’s important to be aware of this gotcha, because URLs are Apple’s preferred way to represent file paths, and APIs are starting to require them. Equality tests that worked for NSString file-paths may fail with NSURL file-paths.

The official documentation says

two NSURLs are considered equal if they both have the same base baseURL and relativeString.

Furthermore,

An NSURL object is composed of two parts—a potentially nil base URL and a string that is resolved relative to the base URL. An NSURL object whose string is fully resolved without a base is considered absolute; all others are considered relative.

In other words, two NSURL objects can resolve to the same absolute URL, but have a different base URL, and be considered !isEqual:.

An example should make this all clear,

NSURL *VGableDotCom = [NSURL URLWithString:@"http://vgable.com"];
NSURL *a = [[NSURL alloc] initWithString:@"blog" relativeToURL:VGableDotCom];
NSURL *b = [[NSURL alloc] initWithString:@"http://vgable.com/blog" relativeToURL:nil];
LOG_INT([a isEqual:b]);
LOG_INT([[a absoluteURL] isEqual:[b absoluteURL]]);
LOG_ID([a absoluteURL]);
LOG_ID([b absoluteURL]);

[a isEqual:b] = 0
[[a absoluteURL] isEqual:[b absoluteURL]] = 1
[a absoluteURL] = http://vgable.com/blog
[b absoluteURL] = http://vgable.com/blog

Remember that collections use isEqual: to determine equality, so you may have to convert an NSURL to an absoluteURL to get the behavior you expect, especially with NSSet and NSDictionary.

March 31, 2009

How To Write Cocoa Object Getters

Setters are more straightforward than getters, because you don’t need to worry about memory management.

The best practice is to let the compiler write getters for you, by using Declared Properties.

But when I have to implement a getter manually, I prefer the (to my knowledge) safest pattern,

- (TypeOfX*) x;
{
  return [[x retain] autorelease];
}

Note that by convention in Objective-C, a getter for the variable jabberwocky is simply called jabberwocky, not getJabberwocky.

Why retain Then autorelease

Basically return [[x retain] autorelease]; guarantees that what the getter returns will be valid for as long as any local objects in the code that called the the getter.

Consider,

NSString* oldName = [person name];
[person setName:@"Alice"];
NSLog(@"%@ has changed their name to Alice", oldName);

If -setName: immediately releasees the value that -name returned, oldName will be invalid when it’s used in NSLog. But if the implementation of [x name] used retain/autorelease, then oldName would still be valid, because it would not be destroyed until the autorelease pool around the NSLog was drained.

Also, autorelease pools are per thread; different threads have different autorelease pools that are drained at different times. retain/autorelease makes sure the object is on the calling thread’s pool.

If this cursory explanation isn’t enough, read Seth Willitis’ detailed explanation of retain/autorelease. I’m not going to explain it further here, because he’s done such a through job of it.

Ugly

return [[x retain] autorelease]; is more complicated, and harder to understand then a simple return x;. But sometimes that ugliness is necessary, and the best place to hide it is in a one-line getter method. It’s self documenting. And once you understand Cocoa memory management, it’s entirely clear what the method does. For me, the tiny readability cost is worth the safety guarantee.

Big

return [[x retain] autorelease]; increases peak memory pressure, because it can defer dealloc-ing unused objects until a few autorelease pools are drained. Honestly I’ve never measured memory usage, and found this to be a significant problem. It certainly could be, especially if the thing being returned is a large picture or chunk of data. But in my experience, it’s nothing to worry about for getters that return typical objects, unless there are measurements saying otherwise.

Slow

return [[x retain] autorelease]; is obviously slower then just return x;. But I doubt that optimizing an O(1) getter is going to make a significant difference to your application’s performance — especially compared to other things you could spend that time optimizing. So until I have data telling me otherwise, I don’t worry about adding an the extra method calls.

This is a Good Rule to Break

As I mentioned before, getters don’t need to worry about memory management. It could be argued that the return [[x retain] autorelease]; pattern is a premature optimization of theoretical safety at the expense of concrete performance.

Good programmers try to avoid premature optimization; so perhaps I’m wrong to follow this “safer” pattern. But until I have data showing otherwise, I like to do the safest thing.

How do you write getters, and why?

March 29, 2009

How To Write Cocoa Object Setters

There are several ways to write setters for Objective-C/Cocoa objects that work. But here are the practices I follow; to the best of my knowledge they produce the safest code.

Principle 0: Don’t Write a Setter

When possible, it’s best to write immutable objects. Generally they are safer, and easier to optimize, especially when it comes to concurrency.

By definition immutable objects have no setters, so always ask yourself if you really need a setter, before you write one, and whenever revisiting code.

I’ve removed many of my setters by making the thing they set an argument to the class’s -initWith: constructor. For example,

CustomWidget *widget = [[CustomWidget alloc] init];
[widget setController:self];

becomes,

CustomWidget *widget = [[CustomWidget alloc] initWithController:self];

This is less code, and now, widget is never in a partially-ready state with no controller.

It’s not always practical to do without setters. If an object looks like it needs a settable property, it probably does. But in my experience, questioning the assumption that a property needs to be changeable pays off consistently, if infrequently.

Principle 1: Use @synthesize

This should go without saying, but as long as I’m enumerating best-practices: if you are using Objective-C 2.0 (iPhone or Mac OS X 10.5 & up) you should use @synthesize-ed properties to implement your setters.

The obvious benefits are less code, and setters that are guaranteed to work by the compiler. A less obvious benefit is a clean, abstracted way to expose the state values of an object. Also, using properties can simplify you dealloc method.

But watch out for the a gotcha if you are using copy-assignment for an NSMutable object!

(Note: Some Cocoa programmers strongly dislike the dot-syntax that was introduced with properties and lets you say x.foo = 3; instead of [x setFoo:3];. But, you can use properties without using the dot-syntax. For the record, I think the dot syntax is an improvement. But don’t let a hatred of it it keep you from using properties.)

Principle 2: Prefer copy over retain

I covered this in detail here. In summary, use copy over retain whenever possible: copy is safer, and with most basic Foundation objects, copy is just as fast and efficient as retain.

The Preferred Pattern

When properties are unavailable, this is my “go-to” pattern:

- (void) setX:(TypeOfX*)newX;
{
  [memberVariableThatHoldsX autorelease];
  memberVariableThatHoldsX = [newX copy];
}

Sometimes I use use retain, or very rarely mutableCopy, instead of copy. But if autorelease won’t work, then I use a different pattern. I have a few reasons for writing setters this way.

Reason: Less Code

This pattern is only two lines of code, and has no conditionals. There is very little that can I can screw up when writing it. It always does the same thing, which simplifies debugging.

Reason: autorelease Defers Destruction

Using autorelease instead of release is just a little bit safer, because it does not immediately destroy the old value.

If the old value is immediately released in the setter then this code will sometimes crash,

NSString* oldName = [x name];
[x setName:@"Alice"];
NSLog(@"%@ has changed their name to Alice", oldName);

If -setName: immediately releasees the value that -name returned, oldName will be invalid when it’s used in NSLog.

But if If -setName: autorelease-ed the old value instead, this wouldn’t be a problem; oldName would still be valid until the current autorelease pool was drained.

Reason: Precedent

This is the pattern that google recommends.

When assigning a new object to a variable, one must first release the old object to avoid a memory leak. There are several “correct” ways to handle this. We’ve chosen the “autorelease then retain” approach because it’s less prone to error. Be aware in tight loops it can fill up the autorelease pool, and may be slightly less efficient, but we feel the tradeoffs are acceptable.

- (void)setFoo:(GMFoo *)aFoo {
  [foo_ autorelease];  // Won't dealloc if |foo_| == |aFoo|
  foo_ = [aFoo retain];
}

Backup Pattern (No autorelease)

When autorelease won’t work, my Plan-B is:

- (void) setX:(TypeOfX*)newX;
{
  id old = memberVariableThatHoldsX;
  memberVariableThatHoldsX = [newX copy];
  [old release];
}

Reason: Simple

Again, there are no conditionals in this pattern. There’s no if(oldX != newX) test for me to screw up. (Yes, I have done this. It wasn’t a hard bug to discover and fix, but it was a bug nonetheless.) When I’m debugging a problem, I know exactly what setX: did to it’s inputs, without having to know what they are.

On id old

I like naming my temporary old-value id old, because that name & type always works, and it’s short. It’s less to type, and less to think about than TypeOfX* oldX.

But I don’t think it’s necessarily the best choice for doing more to old than sending it release.

To be honest I’m still evaluating that naming practice. But so far I’ve been happy with it.

Principle 3: Only Optimize After You Measure

This is an old maxim of Computer Science, but it bears repeating.

The most common pattern for a setter feels like premature optimization:

- (void) setX:(TypeOfX*)newX;
{
  if(newX != memberVariableThatHoldsX){
    [memberVariableThatHoldsX release];
    memberVariableThatHoldsX = [newX copy];
  }
}

Testing if(newX != memberVariableThatHoldsX) can avoid an expensive copy.

But it also slows every call to setX:. if statements are more code, that takes time to execute. When the processor guesses wrong while loading instructions after the branch, if’s become quite expensive.

To know what way is faster, you have to measure real-world conditions. Even if a copy is very slow, the conditional approach isn’t necessarily faster, unless there is code that sets a property to it’s current value. Which is kind of silly really. How often do you write code like,

[a setX:x1];
[a setX:x1]; //just to be sure!

or

[a setX:[a x]];

Does that look like code you want to optimize? (Trick question! You don’t know until you test.)

Hypocrisy!

I constantly break Principle 3 by declaring properties in iPhone code as nonatomic, since it’s the pattern Apple uses in their libraries. I assume Apple has good reason for it; and since I will need to write synchronization-code to safely use their libraries, I figure it’s not much more work to reuse the same code to protect access to my objects.

I can’t shake the feeling I’m wrong to do this. But it seems more wrong to not follow Apple’s example; they wrote the iPhone OS in the first place.

If you know a better best practice, say so!

There isn’t a way to write a setter that works optimally all the time, but there is a setter-pattern that works optimally more often then other patterns. With your help I can find it.

UPDATE 03-30-2009:

Wil Shiply disagrees. Essentially his argument is that setters are called a lot, so if they aren’t aggressive about freeing memory, you can have thousands of dead objects rotting in an autorelease pool. Plus, setters often do things like registering with the undo manager, and that’s expensive, so it’s a good idea to have conditional code that only does that when necessary.

My rebuttal is that you should optimize big programs by draining autorelease pools early anyway; and that mitigates the dead-object problem.

With complex setters I can see why it makes sense to check if you need to do something before doing it. I still prefer safer, unconditional, code as a simple first implementation. That’s why it’s my go-to pattern. But if most setters you write end up being more complex, it might be the wrong pattern.

Really you should read what Wil says, and decide for yourself. He’s got much more experience with Objective-C development then I do.

January 7, 2009

Objective-C 1.0 Style: Don’t Name Your Enumerators “enumerator”!

Filed under: Bug Bite, C++, Cocoa, Design, Objective-C, Programming, Tips, Usability — Tags: , , , — Vincent Gable @ 7:36 pm

Disclaimer

There is a better way to iterate over collections in Objective-C 1.0. You really should use it. It’s easier to write, easier to read, less prone to bugs, faster, and makes what I’m going to rant about here a non-issue, because you won’t have any NSEnumerator variables in your code.

Badly Named Problem

The standard iteration idiom in Objective-C 1.0 is:

NSEnumerator *enumerator = [collection objectEnumerator];
id element = nil;
while( nil != (element = [enumerator nextObject]) ) {
   ;//do stuff...
}

Unfortunately, I see otherwise steller programmers name their NSEnumerator variables “enumerator”. That’s always wrong, because it does not tell you what the enumerator enumerates. We already know that enumerator enumerates things, because it’s type is NSEnumerator, unless it’s name tells us more then that it’s hardly better then no name at all.

This is an especially amateurish practice because …

Naming an Enumerator Well is Easy!

Call it, the plural of the element variable. And if that won’t work you can always fall back on calling it collectionEnumerator.

For example, to fix:

NSEnumerator *enumerator = [input objectEnumerator];
NSString *path = nil;
while (path = [enumerator nextObject])

We should name enumerator paths or inputEnumerator. You might find “paths” to be too close to “path” in which case let the “plural form” of element be everyElement, giving everyPath.

These rules can be applied without much thought, but will improve the clarity of code.

Why enumerator is Worse Than i

Firstly, the practice of naming an the index in a for-loop i is not good. You can avoid it by renaming i to thingThatIsIndexedIndex.

But at least, for(int i = 0; i < collection.size(); i++), is concise; therefore better than a equally-poorly-named NSEnumerator.

Also, there is something to be said for the idiom you can just use collection[i] over declaring an extra element variable.

The Right Choice

Everyone agrees informative names are good, yet poorly named enumerators are everywhere (just browse Apple’s sample code!) Poorly named enumerators persist because nobody really uses an enumerator per se, they are just part of an iteration idiom. (So stop declaring them and iterate smarter). When was the last time you saw code that did anything with an enumerator besides [enumerator nextObject] in a while loop?

But bad habits matter. Don’t pick up the habit of naming something poorly when it’s easy to do the right thing.

December 8, 2008

Optimize CPU Usage of Your Website

Filed under: Programming, Quotes, Usability — Tags: , , , , , — Vincent Gable @ 2:55 pm

The thing is, web developers should test their pages for CPU usage the same as app developers do. And anytime a page is idle, CPU usage should be at 0%. Same as with any other app.

Brent Simmons

Last quarter notebooks outsold desktops for the first time. Netbooks, and iPhones have been exploding in popularity as well. That means a significant number of your website’s visitors will be running off a battery, and since battery life decreases proportionally with CPU load, you really do owe it to your users to do this. The internet shouldn’t be something you have to be plugged into a power outlet to use.

December 4, 2008

Optimize People’s Time Not The Machine’s Time

Filed under: Quotes, Usability — Tags: , , — Vincent Gable @ 3:09 am

Bruce “Tog” Tognazzini gives the best example I’ve seen of optimizing a person’s time, not a machine’s time

People cost a lot more money than machines, and while it might appear that increasing machine productivity must result in increasing human productivity, the opposite is often true. In judging the efficiency of a system, look beyond just the efficiency of the machine.

For example, which of the following takes less time? Heating water in a microwave for one minute and ten seconds or heating it for one minute and eleven seconds?

From the standpoint of the microwave, one minute and ten seconds is the obviously correct answer. From the standpoint of the user of the microwave, one minute and eleven seconds is faster. Why? Because in the first case, the user must press the one key twice, then visually locate the zero key, move the finger into place over it, and press it once. In the second case, the user just presses the same key–the one key–three times. It typically takes more than one second to acquire the zero key. Hence, the water is heated faster when it is “cooked” longer.

Other factors beyond speed make the 111 solution more efficient. Seeking out a different key not only takes time, it requires a fairly high level of cognitive processing. While the processing is underway, the main task the user was involved with–cooking their meal–must be set aside. The longer it is set aside, the longer it will take to reacquire it.

You also need to consider actual performance vr perceived performance, when optimizing a user’s time.

Cocoa Coding Style Minutia: All Forward Declarations on One Line is the One True Way

This is a very petty point, but I believe there’s a right answer, and since it’s come up before I’m going to take the time to justify the best choice.

Forward Declare Classes on One Line, Not Many Lines

Right:
@class Foo, Bar, Baz...;

This is the style that Apple’s headers generally follow, look at NSDocument.h for example.

Wrong:
@class Foo;
@class Bar;
@class Baz;
...

Programming languages are to be read by people first, and interpreted by computers second. I really do hope anyone in software engineering can take that as an axiom!

Therefore, always favor brevity above all else with statements that are only for the compiler. They are by far the least important part of the code base. Keeping them terse minimizes distractions from code that can and should be read.

Naturally, it’s very hard to find a statement that’s only for the compiler. I can only think of one such statement in Objective-C, forward declarations of an object with @class.

What Does @class Do?

@class Foo; tells the compiler that Foo is an Objective-C class, so that it knows how much space to reserve in memory for things of type Foo. It does not tell the compiler what methods and ivars a Foo has, to do that you need to #import "Foo.h" (or have an @interface Foo... in your .m file.)

@class Foo; #import "Foo.h"
Foo is a black box. You can only use it as an argument. You can also send messages to Foo.

What is @class Good For?

Since #import "Foo.h" does everything @class Foo; does and more, why would you ever use @class? The short answer is less time wasted compiling.

Lets say you have a Controller class, and it has an ivar that’s a Foo. To get it to compile, you put #import "Foo.h" inside Controller.h. So far so good. The problem comes when Foo.h is changed. Now any file that has #import "Foo.h" in it must be recompiled. So Controller.h has to be recompiled. So any file that has #import "Controller.h" in it must also be recompiled, and so on. Many objects that don’t use Foo objects, but do talk to the Controller (or to something that talks to something that talks to the Controller!) have to be rebuilt as well. It’s likely that the entire project would end up being rebuilt. With even a moderately sized project, that means a lot of needless compile time!

One solution is to put a forward-declaration, @class Foo;, in Controller.h, and code>#import “Foo.h” in Controller.m, and the few files that actually do something to Foo objects. @class Foo; gives the compiler just enough information to build an object with a Foo* ivar, because it knows how much space that ivar will need in memory. And since only objects that need to talk to Foo objects directly have any dependency on Foo.h, changes to Foo can be tested quickly. The first time I tried this solution in one of my projects, compile times dropped from minutes to seconds.

Readers Need Less @class

Forward declarations add value to the development process, as does anything that saves programmers time. But @class tells a human reader nothing. Which is why I say you should use them in your code, but minimize the space they take up, by putting them all on one line.

@class Foo; tells a reader that "Foo is black-box that is a class." That adds no useful information.

If the reader does not care about what a Foo is (they are treating it as a black box), then knowing that Foo is a class is useless information. Foo could be a struct, or a bit-field, or a function pointer, or anything else and they still would not need to know to understand the code.

Conversely, if they need to know what a Foo is to understand the code, then they need to know more then "it is a class". They need to see Foo.h or documentation -- @class just isn't enough.

Exceptions

I want to again emphasize that this isn't a big deal. I strongly feel that putting all forward declarations on one line is the right answer, but doing it wrong won't perceptibly matter in the end.

So I don't think there's any reason to address exceptions to the rule. Do what you gotta do.

Best Practices

Use @class to speed up compile-times as much as you can, but use the keyword @class as little as possible, since it is for the compiler, not the person reading the code.

November 14, 2008

Prefer copy Over retain

Filed under: Bug Bite, Cocoa, Objective-C, Programming — Tags: , , — Vincent Gable @ 8:11 pm

(Almost) every time you use retain in Objective-C/Cocoa, you really should be using copy. Using retain can introduce some subtle bugs, and copy is faster then you think…

A Bug Waiting To Bite

The problem with using retain to “take ownership” of an object is that someone else has a pointer to the same object, and if they change it, you will be affected.

For example, let’s say you have a Person class with a straightforward setter method:
- (void) setThingsToCallTheBossToHisFace:(NSArray*)newNames {
   [thingsToCallTheBossToHisFace autorelease];
   thingsToCallTheBossToHisFace = [newNames retain];
}

And you use it to initialize a few Person objects:

NSMutableArray *appropriateNames = [NSMutableArray arrayWithObject:@"Mr. Smith"];
[anIntern setThingsToCallTheBossToHisFace:appropriateNames];

//Salaried Employees can also be a bit more informal
[appropriateNames addObject:@"Joe"];
[aSalariedEmployee setThingsToCallTheBossToHisFace:appropriateNames];

//the wife can also use terms of endearment
[appropriateNames addObject:@"Honey"];
[appropriateNames addObject:@"Darling"];
[theBossesWife setThingsToCallTheBossToHisFace:appropriateNames];


The code looks good, it compiles without error, and it has a bug in it. Because setThingsToCallTheBossToHisFace: uses retain, each Person object’s thingsToCallTheBossToHisFace field is actually pointing to the exact same NSMutableArray. So adding “darling” to the list of names the wife can use also adds it to the intern’s vocabulary.

If copy was used instead, then each Person would have their own separate list of names, insulated from changes to the temporary variable appropriateNames.

A Sneaky Bug Too

This is a particularly insidious problem in Foundation/Cocoa, because mutable objects are subclasses of immutable objects. This means every NSMutableThing is also a NSThing. So even if a method is declared to take an immutable object, if someone passes in a mutable object by accident, there will be no compile-time or run-time warnings.

Unfortunately, there isn’t a good way to enforce that a method takes an object, but not a subclass. Because Foundation makes heavy use of class clusters, it’s very difficult to figure out if you have an immutable class, or it’s mutable subclass. For example, with:
NSArray *immutableArray = [NSArray array];
NSMutableArray *mutableArray = [NSMutableArray array];

[immutableArray isKindOfClass:[NSArray class]] is YES
[immutableArray isKindOfClass:[NSMutableArray class]] is YES
[mutableArray isKindOfClass:[NSArray class]] is YES
[mutableArray isKindOfClass:[NSMutableArray class]] is YES
[mutableArray isKindOfClass:[immutableArray class]] is YES
[immutableArray isKindOfClass:[mutableArray class]] is YES

Sad, but true.

copy Is Fast!

With nearly every immutable Foundation object, copy and retain are the same thing — there is absolutely no penalty for using copy over retain! The only time you would take a performance hit using copy would be if the object actually was mutable. And then you really do want to copy it, to avoid bugs!

The only exceptions I know of are: NSDate, and NSAttributedString.

But don’t just take my word for it! Here’s the snippet of code I used to test all this:

NSMutableArray *objects = [NSMutableArray array];
//add anything that can be made with alloc/init
NSArray *classNames = [NSArray arrayWithObjects:@"NSArray", @"NSColor", @"NSData", @"NSDictionary", @"NSSet", @"NSString", nil];
for(NSString *className in classNames) {
   id obj = [[NSClassFromString(className) alloc] init];
   if(obj)
      [objects addObject:obj];
   else
      NSLog(@"WARNING: Could not instatiate an object of class %@", className);
}

//manually add objects that must be created in a unique way
[objects addObject:[[NSAttributedString alloc] initWithString:@""]];
[objects addObject:[NSDate date]];
[objects addObject:[NSNumber numberWithInt:0]];
[objects addObject:[NSValue valueWithSize:NSZeroSize]];

//test if retain and copy do the same thing
for(id obj in objects)
   if(obj != [obj copy])
      NSLog(@"copy and retain are not equvalent for %@ objects", [obj className]);

Best Practices

Get in the habit of using copy, anytime you need to set or initWith something. In general, copy is safer then retain, so always prefer it.

I believe it is best to try copy first. If an object can not be copied, you will find out about it the first time your code is executed. It will be trivial to substitute retain for copy. But it is much harder, and takes much longer, to discover that you should have been using copy instead of retain.

A program must be correct before it can be made to run faster. And we have seen there is no performance penalty for copy on most common objects. So it makes sense to try copy first, and then replace it with retain if it proves to be necessary through measurement. You will be measuring before you start “optimizing”, right? (I also suspect that if taking ownership of an object is a bottle-neck, then the right optimization is not to switch to retain, but to find a way to use a mutable object, or an object pool, to avoid the “take ownership” step altogether.)

Choose copy, unless you have a measurable justification for using retain.

UPDATE 2009-11-10: Obj-C 2.0 blocks have some peculiarities,

For this reason, if you need to return a block from a function or method, you must [[block copy] autorelease] it, not simply [[block retain] autorelease] it.

Powered by WordPress