Vincent Gable’s Blog

October 20, 2009

JavaScript Nailed ||

One thing about JavaScript I really like is that its ||, the Logical Or operator, is really a more general ‘Eval Until True‘ operation. (If you have a better name for this operation, please leave a comment!) It’s the same kind of or operator used in Lisp. And I believe it’s the best choice for a language to use.

In C/C++, a || b is equivalent to,

  if a evaluates to a non-zero value:
    return true;
  if b evaluates to a non-zero value:
    return true;
  otherwise:
    return false;

Note that if a can be converted to true, then b is not evaluated. Importantly, in C/C++ || always returns a bool.

But the JavaScript || returns the value of the first variable that can be converted to true, or the last variable if both variables can’t be interpreted as true,

  if a evaluates to a non-zero value:
    return a;
  otherwise:
    return b;

Concise

JavaScript’s || is some sweet syntactic sugar.

We can write,

return playerName || "Player 1";

instead of,

return playerName ? playerName : "Player 1";

And simplify assert-like code in a perl-esq way,

x || throw "x was unexpectedly null!";

It’s interesting that a more concise definition of || allows more concise code, even though intuitively we’d expect a more complex || to “do more work for us”.

General

Defining || to return values, not true/false, is much more useful for functional programming.

The short-circuit-evaluation is powerful enough to replace if-statements. For example, the familiar factorial function,

function factorial(n){
	if(n == 0) return 1;
	return n*factorial(n-1);
}

can be written in JavaScript using && and || expressions,

function factorial2(n){ return n * (n && factorial2(n-1)) || 1;}

Yes, I know this isn’t the clearest way to write a factorial, and it would still be an expression if it used ?:, but hopefully this gives you a sense of what short-circuiting operations can do.

Unlike ?:, the two-argument || intuitively generalizes to n arguments, equivalent to a1 || a2 || ... || an. This makes it even more useful for dealing with abstractions.

Logical operators that return values, instead of simply booleans, are more expressive and powerful, although at first they may not seem useful — especially coming from a language without them.

September 18, 2009

Strange AOL Instant Message Filtering

Filed under: Announcement,Bug Bite,Security | , , , ,
― Vincent Gable on September 18, 2009

You can’t send a message over AIM that has a JavaScript event handler name, followed by = in it. The message seems to be blocked on the server, not in the client, as this behavior was observed in different AIM clients (iChat, Adium, and meebo.)

Examples

The following messages can’t be sent over AIM:

onclick=

onclick =

Yo dawg, I heard you liked onclick= in your JavaScript…

Interestingly, using a newline, instead of space, between the handler name and = allows the message to be sent, even though it is still valid HTML/JavaScript. For example, you can send,

onclick
=x();
/*this is fine*/

I suspect there is an interesting security story behind all of this. If you know how and why this filtering came to pass, I please leave a comment.

Thanks to Dustin Silverman for helping me investigate this. In case you were wondering how I stumbled onto this behavior — I was sending snippets of HTML from twitterglyphs.com/ over AIM.

September 11, 2009

Never Start An Integer With 0

When programming, never start an integer with 0. Most programming languages treat a decimal number that starts with 0 as octal (base-8). So x = 013; does not set x to 13. Instead x is 11, because 013 is interpreted as 138 not 1310.

Languages with this quirk include: C, C++, Objective-C, Java, JavaScript, Perl, Python 3.0, and Ruby. If you add up the “market share” of these languages, it comes out to above 50%, which is why I say most languages.

“But I use {Smalltalk, Haskell, Lisp, etc.}”

I’m jealous that you get to use such a nice language. However, it’s bad programming hygiene to pick up habits that are dangerous in common languages.

Now, I assume you wouldn’t write 7 as 007, unless the leading zero(s) carried some extra meaning. There are cases where this clarity outweighs “cleanliness” (unless the code meant to be ported to a C-like language).

But you should at least be aware of this inter-lingual gotcha.

May 1, 2009

NSXMLParser and HTML/XHTML

Filed under: Bug Bite,iPhone,Objective-C,Programming | , , , ,
― Vincent Gable on May 1, 2009

NSXMLParser converts HTML/XML-entities in the string it gives the delegate callback -(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string. So if an XML file contains the string, "&lt; or &gt;", the converted string "< or >" would be reported to the delegate, not the string that you would see if you opened the file with TextEdit.

This is correct behavior for XML files, but it can cause problems if you are trying to use an NSXMLParser to monkey with XHTML/HTML.

I was using an NSXMLParser to modify an XHTML webpage from Simple Wikipedia, and it was turning: “#include &lt;stdio&gt;” into “#include <stdio>“, which then displayed as “#include “, because WebKit thought <stdio> was a tag.

Solution: Better Tools

For scraping/reading a webpage, XPath is the best choice. It is faster and less memory intensive then NSXMLParser, and very concise. My experience with it has been positive.

For modifying a webpage, JavaScript might be a better fit then Objective-C. You can use
- (NSString *)stringByEvaluatingJavaScriptFromString:(NSString *)script to execute JavaScript inside a UIWebView in any Cocoa program. Neat stuff!

My Unsatisfying Solution

Do not use this, see why below:

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string;
{
	string = [string stringByReplacingOccurrencesOfString:@"<" withString:@"&lt;"];
	string = [string stringByReplacingOccurrencesOfString:@">" withString:@"&gt;"];

	/* ... rest of the method */
}

Frankly that code scares me. I worry I’m not escaping something I should be. Experience has taught me I don’t have the experience of the teams who wrote HTML libraries, so it’s dangerous to try and recreate their work.

(UPDATED 2009-05-26: And indeed, I screwed up. I was replacing & with &amp;, and that was causing trouble. While my “fix” of not converting & seems to work on one website, it will not in general.)

I would like to experiment with using JavaScript instead of an NSXMLParser, but at the moment I have a working (and surprisingly compact) NSXMLParser implementation, and much less familiarity with JavaScript then Objective-C. And compiled Obj-C code should be more performant then JavaScript. So I’m sticking with what I have, at least until I’ve gotten Prometheus 1.0 out the door.

April 22, 2009

Getting the Current URL from a WebView

Filed under: Cocoa,iPhone,MacOSX,Objective-C,Programming,Sample Code | , , ,
― Vincent Gable on April 22, 2009

UPDATED 2009-04-30: WARNING: this method will not always give the correct result. +[NSURL URLWithString:] requires it’s argument to have unicode characters %-escaped UTF8. But stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding will convert # to %23, so http://example.com/index.html#s1 would become http://example.com/index.html%23s1. Unfortunately, the two URLs are not equivalent. The un-%-escaped one refers to section #s1 in the file index.html, and the other tries to fetch the file index.html#s1 (“index dot html#s1”). I have not yet implemented a workaround, although I suspect one is possible, by building the NSURL out of bits of the JavaScript location object, rather then trying to convert the whole string.


UIWebView/WebView does not provide a way to find the URL of the webpage it is showing. But there’s a simple (and neat) way to get it using embedded JavaScript.

- (NSString *)stringByEvaluatingJavaScriptFromString:(NSString *)script

Is a deceptively powerful method that can execute dynamically constructed JavaScript, and lets you embed JavaScript snippets in Cocoa programs. We can use it to embed one line of JavaScript to ask a UIWebView for the URL it’s showing.

@interface UIWebView (CurrentURLInfo)
- (NSURL*) locationURL;
@end
@implementation UIWebView (CurrentURLInfo) - (NSURL*) locationURL; { NSString *rawLocationString = [self stringByEvaluatingJavaScriptFromString:@"location.href;"]; if(!rawLocationString) return nil; //URLWithString: needs percent escapes added or it will fail with, eg. a file:// URL with spaces or any URL with unicode. locationString = [locationString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]; return [NSURL URLWithString:locationString] } @end

With the CurrentURLInfo category, you can do aWebView.locationURL to get the URL of the page a WebView is showing.


One last note, the

if(!rawLocationString)
	return nil;

special-case was only necessary, because [NSURL URLWithString:nil] throws an exception (rdar://6810626). But Apple has decided that this is correct behavior.

Powered by WordPress