Need some help with your project? Contact me

Drupal update hooks with multiple passes

If you do a lot of Drupal development and need to deploy configuration I am sure that you are using update hooks to some extent at least. If you don't use Features and want to create a taxonomy vocabulary or something in code, the hook_update_N() hook is the way to go.

But have you ever needed to perform an update the size of which would exceed PHP's maximum execution time? If you need to create 1000 entities (let's just say as an example), it's not a good idea to trust that the production server will not max out and leave you hanging in the middle of a deploy. So what's the solution?

You can use the batch capability of the update hook. If you were wondering what the &$sandbox argument is for, it's just for that. You use it for two things mainly:

  • store data required for your operations across multiple passes (since it is passed by reference the values remain)
  • tell Drupal when it should stop the process by setting the $sandbox['#finished'] value to 1.

Let me show you how this works. Let's say we want to create a vocabulary and a bunch of taxonomy terms with names from a big array. We want to break this array into chunks and create the terms one chunk at the time so as to avoid the load on the server.

So here is how you do it:

/**
 * Create all the terms
 */
function my_module_update_7001(&$sandbox) {

  $names = array(
    'Fiona',
    'Jesse',
    'Michael',
    ...
    'Sam',
    'Nate',
  );

  if (!isset($sandbox['progress'])) {
    $sandbox['progress'] = 0;
    $sandbox['limit'] = 5;
    $sandbox['max'] = count($names);

    // Create the vocabulary
    $vocab = (object) array(
      'name' => 'Names',
      'description' => 'My name vocabulary.',
      'machine_name' => 'names_vocabulary',
    );

    taxonomy_vocabulary_save($vocab);
    $sandbox['vocab'] = taxonomy_vocabulary_machine_name_load('names_vocabulary');
  }

  // Create the terms
  $chunk = array_slice($names, $sandbox['progress'], $sandbox['limit']);
  if (!empty($chunk)) {
    foreach ($chunk as $key => $name) {
      $term = (object) array(
        'name' => $name,
        'description' => 'The name is: ' . $name,
        'vid' => $sandbox['vocab']->vid,
      );

      taxonomy_term_save($term);
      $sandbox['progress']++;
    }
  }

  $sandbox['#finished'] = ($sandbox['progress'] / $sandbox['max']);
}

So what happens here? First, we are dealing with an array of names (can anybody recognise them by the way?) Then we basically see if we are at the first pass by checking if we had set already the progress key in $sandbox. If we are at the first pass, we set some defaults: a limit of 5 terms per pass out of a total of count($names). Additionally, we create the vocabulary and store it as a loaded object in the sandbox as well (because we need its id for creating the terms).

Then, regardless of the pass we are on, we take a chunk out of the names always offset by the progress of the operation. And with each term created, we increment this progress by one (so with each chunk, the progress increases by 5) and of course create the terms. At the very end, we keep setting the value of $sandbox['#finished'] to the ratio of progress per total. Meaning that with each pass, this value increases from an original of 0 to a maximum of 1 (at which point Drupal knows it needs to stop calling the hook).

And like this, we save a bunch of terms without worrying that PHP will time out or the server will be overloaded. Drupal will keep calling the hook as many times as needed. And depending on the operation, you can set your own sensible chunk sizes.

Hope this helps.

Add new comment

You can post comments in Markdown and basic HTML tags.
For code blocks, wrap your code within '~~~'. For example:
~~~
$var = 'my variable';
~~~