As you work on custom modules you sometimes need to apply updates as part of your module development.

These may be schema changes (new or amended tables) or content changes.

So what?

Small, quick updates are usually no problem; big, slow updates, however, can be a right pain!

Recently I was involved in populating a new content type (on a D6 site) from an existing one (see blog post Drupal Node Update). This involved using node_load/node_submit/node_save to create the new content, and whilst this worked, the update hook that I had stuck it in kept timing out.

The solution

Much soul (& Google) searching didn't reveal a solution until I finally found the Drupal posts on function hook_update_N and the batch API.

In essence the solution was to use the (largely undocumented) pass-by-reference parameter to hook_update_N to persist data between runs of the hook, and to set the return array element '#finished' to a value below 1 to indicate that the update hook wasn't yet finished. Drupal then calls the hook again automatically until '#finished' reaches 1.

Sample code

As always this sort of thing is best illustrated by code, so here it is:

  function mymodule_update_6001(&$sandbox) {
    $ret = array();
    $inc = 50; // number of nodes to process per pass
    // If this is the first pass through this update function then set some variables.
    if (!isset($sandbox['max'])) {
      $sandbox['progress'] = 0;
      $sandbox['max']      = (int) db_result(db_query("SELECT COUNT(nid) FROM {node} WHERE type IN (...)"));
    }
    // Process the next 50; db_query_range() takes the offset and the row count.
    $sql = "SELECT nid FROM {node} WHERE type IN (...) ORDER BY nid";
    if ($qry = db_query_range($sql, $sandbox['progress'], $inc)) {
      while ($row = db_fetch_object($qry)) {
        $node = node_load($row->nid);
        // Clear the ids so that node_save() creates a new node.
        unset($node->nid, $node->vid);
        $node->type = 'my new type';
        $node = node_submit($node);
        node_save($node);
        // assume it works!
        drupal_set_message('New node ' . $node->nid);
      }
    }
    $sandbox['progress'] = $sandbox['progress'] + $inc;
    // Set the value for finished: a fraction below 1 means "call me again",
    // and 1 signifies we are done!
    $ret['#finished'] = ($sandbox['progress'] >= $sandbox['max']) ? 1 : ($sandbox['progress'] / $sandbox['max']);
    if ($ret['#finished'] === 1) {
      drupal_set_message('Processed nodes - ' . $sandbox['max']);
    }
    return $ret;
  }

Notice the use of the $sandbox pass-by-reference parameter to persist data between calls of the update hook, and the use of the return array element $ret['#finished'], whose value between 0 and 1 indicates the progress through the update (and gives a sort of accurate-ish progress bar).
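
To see why this pattern works, here is a minimal stand-alone sketch in plain PHP (no Drupal required) of the kind of loop that sits behind an update run. It is only an illustration: example_batched_update(), example_run_update() and the figure of 120 rows are all made up for the example, and the real update runner spreads the calls across several page requests rather than looping within one.

```php
<?php
// Hypothetical stand-in for an update hook: "processes" 50 rows per call.
function example_batched_update(array &$sandbox) {
  $ret = array();
  $inc = 50;
  // First call: set up the counters in the sandbox.
  if (!isset($sandbox['max'])) {
    $sandbox['progress'] = 0;
    $sandbox['max']      = 120; // pretend there are 120 rows to process
  }
  // Pretend to process the next batch, without overshooting the total.
  $sandbox['progress'] = min($sandbox['progress'] + $inc, $sandbox['max']);
  // A fraction below 1 asks to be called again; 1 means done.
  $ret['#finished'] = ($sandbox['progress'] >= $sandbox['max']) ? 1 : ($sandbox['progress'] / $sandbox['max']);
  return $ret;
}

// Hypothetical stand-in for the runner: keeps calling the function with the
// same sandbox until it reports that it has finished.
function example_run_update($function) {
  $sandbox = array();
  $calls   = 0;
  do {
    $ret = $function($sandbox);
    $calls++;
    // No '#finished' element is treated here as "finished in one go".
    $finished = isset($ret['#finished']) ? $ret['#finished'] : 1;
  } while ($finished < 1);
  return array('calls' => $calls, 'sandbox' => $sandbox);
}

$result = example_run_update('example_batched_update');
printf("%d calls, %d of %d processed\n", $result['calls'], $result['sandbox']['progress'], $result['sandbox']['max']);
// prints "3 calls, 120 of 120 processed"
```

Because the sandbox is handed over by reference, whatever the function stores in it on one call is still there on the next, which is what lets each call pick up where the last one left off.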