Writing Better JavaScript – split up long methods

October 6, 2009 by Ed Spencer · Leave a Comment 

For the second time this week I’m going to pick on the usually delightful Ext JS library. Last time we discussed the overzealous use of the Module pattern; this time it’s the turn of bloated methods.

As before, I’m not really picking on Ext at all – this happens all over the place. But again, this is the library closest to my heart and the one I know the best.

The Problem

We’re going to take a look at Ext.data.XmlReader’s readRecords method. Before we get started though, I’ll repeat that this is intended as an example of an approach, not a whine at Ext in particular.

/**
 * Create a data block containing Ext.data.Records from an XML document.
 * @param {Object} doc A parsed XML document.
 * @return {Object} records A data block which is used by an {@link Ext.data.Store} as
 * a cache of Ext.data.Records.
 */
readRecords: function(doc) {
  /**
   * After any data loads/reads, the raw XML Document is available for further custom processing.
   * @type XMLDocument
   */
  this.xmlData = doc;
  var root = doc.documentElement || doc;
  var q = Ext.DomQuery;
  var recordType = this.recordType, fields = recordType.prototype.fields;
  var sid = this.meta.idPath || this.meta.id;
  var totalRecords = 0, success = true;
  if(this.meta.totalRecords){
    totalRecords = q.selectNumber(this.meta.totalRecords, root, 0);
  }

  if(this.meta.success){
    var sv = q.selectValue(this.meta.success, root, true);
    success = sv !== false && sv !== 'false';
  }
  var records = [];
  var ns = q.select(this.meta.record, root);
  for(var i = 0, len = ns.length; i < len; i++) {
    var n = ns[i];
    var values = {};
    var id = sid ? q.selectValue(sid, n) : undefined;
    for(var j = 0, jlen = fields.length; j < jlen; j++){
      var f = fields.items[j];
      var v = q.selectValue(Ext.value(f.mapping, f.name, true), n, f.defaultValue);
      v = f.convert(v, n);
      values[f.name] = v;
    }
    var record = new recordType(values, id);
    record.node = n;
    records[records.length] = record;
  }

  return {
    success : success,
    records : records,
    totalRecords : totalRecords || records.length
  };
}

Anyone care to tell me what this actually does? Personally, I have absolutely no idea. I recently found myself needing to implement an XmlReader subclass with a twist which required understanding how this works, and let’s just say it wasn’t easy!

So what is it that makes the above so terrifyingly hard to understand? Well, in no particular order:

  • It’s too long – you’d need to be a genius to easily understand what’s going on here
  • The variable names don’t make much sense – some of the oddest include ‘q’, ‘ns’, ‘v’, ‘f’ and ’sv’
  • There’s minimal commenting – we’re given a single-line clue at the very top as to what these 40-odd lines do

A Solution

Let’s see how the reworked code below addresses each of the concerns above:

  • Although we end up with more lines of code here, no single method is more than around 10 LOC
  • No single letter variable names – you no longer have to decode what ’sv’ means
  • Constructive commenting allows rapid comprehension by skimming the text

One additional and enormous benefit here comes directly from splitting logic into discrete methods. Previously if you’d wanted to implement your own logic to determine success, get the total number of records or even build a record from an XML node you’d be stuck. There was no way to selectively override that logic without redefining that entire monster method.

With our new approach this becomes trivial:

Ext.extend(Ext.data.XmlReader, Ext.data.DataReader, {
  readRecords: function(doc) {
    this.xmlData = doc;

    //get local references to frequently used variables
    var root    = doc.documentElement || doc,
        records = [],
        nodes   = Ext.DomQuery.select(this.meta.record, root);

    //build an Ext.data.Record instance for each node
    Ext.each(nodes, function(node) {
      records.push(this.buildRecordForNode(node));
    }, this);

    return {
      records     : records,
      success     : this.wasSuccessful(root),
      totalRecords: this.getTotalRecords(root) || records.length
    };
  },

  /**
   * Returns a new Ext.data.Record instance using data from a given XML node
   * @param {Element} node The XML node to extract Record values from
   * @return {Ext.data.Record} The record instance
   */
  buildRecordForNode: function(node) {
    var domQuery = Ext.DomQuery,
        idPath   = this.meta.idPath || this.meta.id,
        id       = idPath ? domQuery.selectValue(idPath, node) : undefined;

    var record  = new this.recordType({}, id);
    record.node = node;

    //iterate over each field in our record, find it in the XML node and convert it
    record.fields.each(function(field) {
      var mapping  = Ext.value(field.mapping, field.name, true),
          rawValue = domQuery.selectValue(mapping, node, field.defaultValue),
          value    = field.convert(rawValue, node);

      record.set(field.name, value);
    });

    return record;
  },

  /**
   * Returns the total number of records indicated by the server response
   * @param {XMLDocument} root The XML response root node
   * @return {Number} total records
   */
  getTotalRecords: function(root) {
    var metaTotal = this.meta.totalRecords;

    return metaTotal == undefined
                      ? 0
                      : Ext.DomQuery.selectNumber(metaTotal, root, 0);
  },

  /**
   * Returns true if the response document includes the expected success property
   * @param {XMLDocument} root The XML document root node
   * @return {Boolean} True if the XML response was successful
   */
  wasSuccessful: function(root) {
    var metaSuccess  = this.meta.success;

    //return true for any response except 'false'
    if (metaSuccess == undefined) {
      return true;
    } else {
      var successValue = Ext.DomQuery.selectValue(metaSuccess, root, true);
      return successValue !== false && successValue !== 'false';
    }
  }
});

(For brevity I have omitted the existing readRecords comment blocks from the above)

I suggest that you structure your code in this way at least 99% of the time. The one exception is if high performance is an issue. If you are in a situation where every millisecond counts (you probably aren’t), then taking the former route becomes more acceptable (though there’s still no excuse for not adding a few comments explaining what the code actually does).

My refactored code almost certainly runs slower than the original as it doesn’t take as much advantage of cached local variables as the monolithic version does. For library-level code this can make sense if the performance gain is significant, but for the everyday code you and I write it is rarely a good idea.

I’ll be watching.

JavaScript Module pattern – overused, dangerous and bloody annoying

October 5, 2009 by Ed Spencer · 18 Comments 

The Module Pattern is a way of using a closure in JavaScript to create private variables and functions. Here’s a brief recap:

var myObject = (function() {
  //these are only accessible internally
  var privateVar = 'this is private';
  var privateFunction = function() {
    return "this is also private";
  };

  return {
    //these can be accessed externally
    publicVar: 'this is public',

    publicFunction: function() {
      return "this is also public"
    },

    //this is a 'privileged' function - it can access the internal private vars
    myFunction: function() {
      return privateVar;
    }
  };
})();

myObject.privateVar; //returns null as private var is private
myObject.myFunction(); //return the private var as myFunction has access to private properties

Breaking this down, we create a function which is executed immediately (via the brackets at the end) and returns an object which gets assigned to myObject.

Because this object contains references to our private variable (privateVar is referenced inside myFunction), the JavaScript engine keeps privateVar available in memory which means myFunction can still access it using what is called a closure. This pattern as a whole is usually called the Module Pattern.

Why it’s bad

On the face of it, private variables sound like a good thing. We have them in other languages after all, so why not in JavaScript too?

The reason that you shouldn’t use the Module pattern 90% of the time you think you should is that it entirely negates the dynamic nature of the language. If a class does 99% of what you want and you (rightly) don’t want to directly modify the source code, you will be thwarted every time if the class uses this pattern.

Example

I’ll share a recent example of this using a class in the Ext JS library. Ext is by no means the only library guilty of this, but it’s the one I use on a daily basis, and this is not the only example of this problem in the library.

The Ext.DomQuery object is a helper which allows us to parse XML documents locally. Unfortunately, it suffers from a limitation which causes the text content of an XML node to be truncated if it is over a certain size limit (just 4kb in Firefox, though this differs by browser). This isn’t actually a problem of Ext’s making, though it can solve it using just 1 line of code.

Ideally, we’d just be able to do this:

Ext.apply(Ext.DomQuery, {
  selectValue : function(path, root, defaultValue){
    path = path.replace(trimRe, "");
    if(!valueCache[path]) valueCache[path] = Ext.DomQuery.compile(path, "select");

    var n = valueCache[path](root), v;
    n = n[0] ? n[0] : n;

    //this line is the only change
    if (typeof n.normalize == 'function') n.normalize();

    v = (n && n.firstChild ? n.firstChild.nodeValue : null);
    return ((v === null||v === undefined||v==='') ? defaultValue : v);
  }
});

All we’re doing in the above is making a call to ‘normalize’ – a single line change which fixes the 4kb node text limitation. Sadly though, we can’t actually do this because of the use of the Module pattern. In this example there are two private variables being accessed – ‘trimRe’ and ‘valueCache’.

We can’t get access to these private variables in our override, which means that our override here fails. In fact, the Module pattern means we can’t actually patch this at all.

The only way to do it is to modify the source code of Ext JS itself, which is a very dangerous practice as you need to remember every change you made to ext-all.js and copy them all over next time you upgrade.

Even if there are good reasons for enforcing the privacy of variables (in this case I don’t think there are), we could get around this by providing a privileged function which returns the private variable – essentially making it read-only:

Ext.DomQuery.getValueCache = function() {
  return valueCache;
};

Except again this needs to be defined inside the original closure – we just can’t add it later. Again we would have to modify the original source code, with all the problems that entails.

Ext.ComponentMgr does the same trick when registering xtypes. An xtype is just a string that Ext maps to a constructor to allow for easy lazy instantiation. The trouble is, Ext hides the xtype lookup object inside a private variable, meaning that if you have an xtype string it is impossible to get a reference to the constructor function for that xtype. Ext provides a function to instantiate an object using that constructor, but doesn’t let you get at the constructor itself. This is totally unnecessary.

Recommendations

  1. Think very carefully before using the Module pattern at all. Do you really need to enforce privacy of your variables? If so, why?
  2. If you absolutely have to use private variables, consider providing a getter function which provides read-only access to the variables
  3. Keep in mind that once defined, private variables defined this way cannot be overwritten at all. In other languages you can often overwrite a superclass’s private variables in a subclass – here you can’t

Either of the above would have solved both problems, but as neither was implemented we have to fall back to hackery.

And remember this about the Module pattern:

  • It’s overused – in the examples above (especially Ext.ComponentMgr) there is no benefit from employing the pattern
  • It’s dangerous – because of its inflexibility it forces us to modify external source code directly – changes you are almost guaranteed to forget about when it comes to updating the library in the future
  • It’s bloody annoying – because of both of the above.

JavaScript FizzBuzz in a tweet

September 17, 2009 by Ed Spencer · 18 Comments 

The FizzBuzz challenge has been around a while but I stumbled across it again after reading another unique Giles Bowkett post.

If you’re not familiar with FizzBuzz, it’s a little ‘challenge’ designed to test a candidate programmer’s ability to perform a simple task. In this case, you just have to print out the numbers from 1 to 100, unless the number is a multiple of 3, when you should instead print “Fizz”, 5 in which case you print “Buzz”, or both 3 and 5 in which case you print “FizzBuzz”.

Here’s a trivial JavaScript implementation:

for (var i=1; i <= 100; i++) {
  if (i % 3 == 0) {
    if (i % 5 == 0) {
      console.log('FizzBuzz');
    } else {
     console.log('Fizz');
   }
  } else if (i % 5 == 0) {
    console.log('Buzz');
  } else {
    console.log(i);
  }
};

Pretty simple stuff, but a bit verbose. I wanted something that would fit into a tweet. It turns out that’s pretty simple – this is 133 characters including whitespace, 7 within tolerance for a twitter message:

for (var i = 1; i <= 100; i++) {
  var f = i % 3 == 0, b = i % 5 == 0;
  console.log(f ? b ? "FizzBuzz" : "Fizz" : b ? "Buzz" : i);
}

Which of course begs the question – just how short can a JavaScript FizzBuzz implementation be? Here’s my baseline, which is a tortured and contorted version of the above:

for(i=1;i<101;i++){console.log(i%3?i%5?i:"Buzz":i%5?"Fizz":"FizzBuzz")}

The above is 71 characters – I expect you to do better. The rules are that the only dependency is Firebug’s console.log being available, and you can’t replace ‘console.log’ for anything else.

Of course, if we did swap ‘console.log’ for ‘alert’, the whole thing would fit in a tweet twice, but that would be damn annoying.

Hint: you can take at least three more characters off the above – can you see how?

The trouble with new

September 8, 2009 by Ed Spencer · Leave a Comment 

We have a simple JavaScript class:

function User(firstName, lastName) {
  this.name = firstName + " " + lastName;
}

We create a new User:

var ed = new User("Ed", "Spencer");
alert(ed.name); //alerts 'Ed Spencer'
alert(window.name); //undefined

All is well. Unless we forgot the ‘new’:

var ed = User("Ed", "Spencer");
alert(ed.name); //ed is undefined
alert(window.name); //alerts 'Ed Spencer'

Curses! That’s not what we want at all. By omitting the ‘new’ keyword, the JavaScript engine executes our ‘User’ constructor in the current scope, which in this case is the global window object. With the scope (’this’) set to window, setting ‘this.name’ is now the same as setting ‘window.name’, which is not what we’re trying to do.

Here’s the problem though, omitting the ‘new’ keyword is still perfectly valid syntax. We know at design time if ‘new’ must be used or not, and can use a little trick to make it act as though ‘new’ was indeed used:

function User(firstName, lastName) {
  if (!(this instanceof User)) {
    return new User(firstName, lastName);
  }

  this.name = firstName + " " + lastName;
}

Because the ‘new’ keyword sets up a new context, we can just test to see if ‘this’ is now an instance of our class. If it’s not, it means the user has omitted the ‘new’ keyword, so we do it for them. John Resig has an example of this over on his blog.

This is all very well and good, but I don’t think we should use it. The reason is that we’re hiding a pseudo syntax error from the developer, instead of educating them with its correct usage. If we hide this mistake in each class we write, our unknowing developer will remain unknowing, and run into a wall when they repeat their mistake on classes that don’t fix it for them.

Instead, I suggest the following:

function User(firstName, lastName) {
  if (!(this instanceof User)) {
    throw new Error("You must use the 'new' keyword to instantiate a new User");
  }

  this.name = firstName + " " + lastName;
}

The only difference of course is that we’re throwing an Error instead of fixing the developer’s mistake. The benefit is that their syntax won’t actually work unless they write it correctly. This is good because our erstwhile developer is prompted to fix their code and understand why it was wrong. Better informed developers leads to better code.

Well, hopefully.

Read my BDD article in this month’s JS Magazine

June 10, 2009 by Ed Spencer · 7 Comments 

I have an article on Behaviour Driven Development for JavaScript in June’s edition of the excellent JavaScript Magazine.

If you haven’t seen or read the magazine before (it’s quite new), it’s well worth the few dollars charged. The magazine format allows for in-depth articles that require more space, time and effort to write than a typical blog post, and which therefore often go unwritten.

The thrust of my article is that too much of our JavaScript goes untested, but that nowadays it’s easy to fix that. I go through an example of a client side shopping cart, using the JSpec BDD library. Even if you don’t buy/read the magazine, I highly recommend checking out JSpec and other libraries like it. As JavaScript powered applications become the norm, BDD will only become more important in ensuring our applications work properly, so now is a good time to start.

Also in this month’s issue is a guide to using the Canvas tag, tips on how to use build scripts to optimise your JavaScript for each environment, AJAX security pointers and a roundup of community news.

« Previous PageNext Page »