Need some help with your project? Contact me

Dynamic migrations using "templates" in Drupal 8

This article is a companion to the presentation I held at the Drupal Dev Days 2019 conference in Cluj Napoca.

In this article we are going to explore some of the powers of the Drupal 8 migration system, namely the migration “templates” that allow us to build dynamic migrations. And by templates I don’t mean Twig templates but plugin definitions that get enhanced by a deriver to make individual migrations for each of the things that we need in the application. For example, as we will explore, each language.

The term “template” I inherit from the early days of Drupal 8 when migrations were config entities and core had migration (config) templates in place for Drupal to Drupal migrations. But I like to use this term to represent also the deriver-based migrations because it kinda makes sense. It’s a personal choice so feel free to ignore it if you don’t agree.

Before going into the details of how the dynamic migrations works, let’s cover a few of the more basic things about migrations in Drupal 8.

What is a migration?

The very first thing we should talk about is what actually is a migration. The simple answer to this question is: a plugin. Each migration is a YAML-based plugin that actually brings together all the other plugins the migration system needs to run an actual logical migration. And if you don’t know what a plugin is, they are swappable bits of functionality that are meant to perform a similar task, depending on their type. They are all over core and by now there are plenty of resources to read more about the plugin system, so I won’t go into it here.

Migration plugins, unlike most others such as blocks and field types, are defined in YAML files inside the module’s migrations folder. But just like all other plugin types, they map to a plugin class, in this case Drupal\migrate\Plugin\Migration.

The more important thing to know about migrations, however, is the logical structure they follow. And by this I mean that each migration is made up of a source, multiple processors and a destination. Make sense right? You need to get some data (the source reads and interprets its format), prepare it for its new destination (the processors alter or transform the data) and finally save it in the destination (which has a specific format and behaviour). And to make all this happen, we have plugins again:

  • Source plugins
  • Process plugins
  • Destination plugins

Source plugins are responsible for reading and iterating over the raw data being imported. And this can be in many formats: SQL tables, CSV files, JSON files, URL endpoint, etc. And for each of these we have a Drupal\migrate\Plugin\MigrateSourceInterface plugin. For average migrations, you’ll probably pick an existing source plugin, point it to your data and you are good to go. You can of course create your own if needed.

Destination plugins (Drupal\migrate\Plugin\MigrateDestinationInterface) are closely tied to the site being migrated into. And since we are in Drupal 8, these relate to what we can migrate to: entities, config, things like this. You will very rarely have to implement your own, and typically you will use an entity based destination.

In between these two, we have the process plugins (Drupal\migrate\Plugin\MigrateProcessInterface), which are admittedly the most fun. There are many of them already available in core and contrib, and their role is to take data values and prepare them for the destination. And the cool thing is that they are chainable so you can really get creative with your data. We will see in a bit how these are used in practice.

The migration plugin is therefore a basic definition of how these other 3 kinds of plugins should be used. You get some meta, source, process, destination and dependency information and you are good to go. But how?

That’s where the last main bit comes into play: the Drupal\migrate\MigrateExecutable. This guy is responsible for taking a migration plugin and “running” it. Meaning that it can make it import the data or roll it back. And some other adjacent things that have to do with this process.

Migrate ecosystem

Apart from the Drupal core setup, there are few notable contrib modules that any site doing migrations will/should use.

One of these is Migrate Plus. This module provides some additional helpful process plugins, the migration group configuration entity type for grouping migrations and a URL-based source plugin which comes with a couple of its own plugin types: Drupal\migrate_plus\DataFetcherPluginInterface (retrieve the data from a given protocol like a URL or file) and Drupal\migrate_plus\DataParserPluginInterface (interpret the retrieved data in various formats like JSON, XML, SOAP, etc). Really powerful stuff over here.

Another one is Migrate Tools. This one essentially provides the Drush commands for running the migrations. To do so, it provides its own migration executable that extends the core one to add all the necessary goodies. So in this respect, it’s a critical module if you wanna actually run migrations. It also makes an attempt at providing a UI but I guess more of that will come in the future.

The last one I will mention is Migrate Source CSV. This one provides a source plugin for CSV files. CSV is quite a popular data source format for migrations so you might end up using this quite a lot.

Going forward we will use all 3 of these modules.

Basic migration

After this admittedly long intro, let’s see how one of these migrations looks like. I will create one in my advanced_migrations module which you can also check out from Github. But first, let’s see the source data we are working with. To keep things simple, I have this CSV file containing product categories:

id,label_en,label_ro
B,Beverages,Bauturi
BA,Alcohols,Alcoolice
BAB,Beers,Beri
BAW,Wines,Vinuri
BJ,Juices,Sucuri
BJF,Fruit juices,Sucuri de fructe
F,Fresh food,Alimente proaspete

And we want to import these as taxonomy terms in the categories vocabulary. For now we will stick with the English label only. We will see after how to get them translated as well with the corresponding Romanian labels.

As mentioned before, the YAML file goes in the migrations folder and can be named advanced_migrations.migration.categories.yml. The naming is pretty straightforward to understand so let’s see the file contents:

id: categories
label: Categories
migration_group: advanced_migrations
source:
  plugin: csv
  path: 'modules/custom/advanced_migrations/data/categories.csv'
  header_row_count: 1
  keys:
    - id
  column_names:
    0:
      id: 'Unique Id'
    1:
      label_en: 'Label EN'
    2:
      label_ro: 'Label RO'
destination:
  plugin: entity:taxonomy_term
process:
  vid:
    plugin: default_value
    default_value: categories
  name: label_en

It’s this simple. We start with some meta information such as the ID and label, as well as the migration group it should belong to. Then we have the definitions for the 3 plugin types we spoke about earlier:

Source

Under the source key we specify the ID of the source plugin to use and any source specific definition. In this case we point it to our CSV file, and kind of “explain” it how to understand the CSV file. Do check out the Drupal\migrate_source_csv\Plugin\migrate\source\CSV plugin if you don’t understand the definition.

Destination

Under the destination key we simply tell the migration what to save the data as. Easy peasy.

Process

Under the process key we do the mapping between our data source and the destination specific “fields” (in this case actual Drupal entity fields). And in this mapping we employ process plugins to get the data across and maybe alter it.

In our example we migrate one field (the category name) and for this we use the Drupal\migrate\Plugin\migrate\process\Get process plugin which is assumed unless one is actually specified. All it does is copies the raw data as it is without making any change. It’s the very most basic and simple process plugin. And since we are creating taxonomy terms, we need to specify a vocabulary which we don’t necessarily have to take from the source. In this case we don’t actually because we want to import all the term into the categories vocabulary. So we can use the Drupal\migrate\Plugin\migrate\process\DefaultValue plugin to specify what value should be saved in that field for each term we create.

And that’s it. Clearing the cache, we can now see our migration using Drush:

drush migrate:status

This will list our one migration and we can run it as well:

drush migrate:import categories

Bingo bango we have categories. Roll them back if you want with:

drush migrate:rollback categories

Dynamic migration

Now that we have the categories imported in English, let’s see how we can import their translations as well. And for this we will use a dynamic migration using a “template” and a plugin deriver. But first, what are plugin derivatives?

Plugin derivatives

The Drupal plugin system is an incredibly powerful way of structuring and leveraging functionality. You have a task in the application that needs to be done and can be done in multiple ways? Bam! Have a plugin type and define one or more plugins to handle that task in the way they see fit within the boundaries of that subsystem.

And although this is powerful, plugin derivatives are what really makes this an awesome thing. Derivatives are essentially instances of the same plugin but with some differences. And the best thing about them is that they are not defined entirely statically but they are “born” dynamically. Meaning that a plugin can be defined to do something and a deriver will make as many derivatives of that plugin as needed. Let’s see some examples from core to better understand the concept.

Menu links:

Menu links are plugins that are defined in YAML files and which map to the Drupal\Core\Menu\MenuLinkDefault class for their behaviour. However, we also have the Menu Link Content module which allows us to define menu links in the UI. So how does that work? Using derivatives.

The menu links created in the UI are actual content entities. And the Drupal\menu_link_content\Plugin\Deriver\MenuLinkContentDeriver creates as many derivatives of the menu link plugin as there are menu link content entities in the system. Each of these derivatives behave almost the same as the ones defined in code but contain some differences specific to what has been defined in the UI by the user. For example the URL (route) of the menu link is not taken from a YAML file definition but from the user-entered value.

Menu blocks:

Keeping with the menu system, another common example of derivatives is the menu blocks. Drupal defines a Drupal\system\Plugin\Block\SystemMenuBlock block plugin that renders a menu. But on its own, it doesn’t do much. That’s where the Drupal\system\Plugin\Derivative\SystemMenuBlock deriver comes into play and creates a plugin derivate for all the menus on the site. In doing so, augments the plugin definitions with the info about the menu to render. And like this we have a block we can place for each menu on the site.

Migration deriver

Now that we know what plugin derivatives are and how they work, let’s see how we can apply this to our migration to import the category translations. But why we would actually use a deriver for this? We could simply copy the migration into another one and just use the Romanian label as the term name no? Well yes…but no.

Our data is now in 2 languages. It could be 23 languages. Or it could be 16. Using a deriver we can make a migration derivative for each available language dynamically and simply change the data field to use for each. Let’s see how we can make this happen.

The first thing we need to do is create another migration that will act as the “template”. In other words, the static parts of the migration which will be the same for each derivative. And as such, it will be like the SystemMenuBlock one in that it won’t be useful on its own.

Let’s call it advanced_migrations.migration.category_translations.yml:

id: category_translations
label: Category translations
migration_group: advanced_migrations
deriver: Drupal\advanced_migrations\CategoriesLanguageDeriver
source:
  plugin: csv
  path: 'modules/custom/advanced_migrations/data/categories.csv'
  header_row_count: 1
  keys:
    - id
  column_names:
    0:
      id: 'Unique Id'
    1:
      label_en: 'Label EN'
    2:
      label_ro: 'Label RO'
destination:
  plugin: entity:taxonomy_term
  translations: true
process:
  vid:
    plugin: default_value
    default_value: categories
  tid:
    plugin: migration_lookup
    source: id
    migration: categories
  content_translation_source:
    plugin: default_value
    default_value: 'en'

migration_dependencies:
  required:
    - categories

Much of it is like the previous migration. There are some important changes though:

  • We use the deriver key to define the deriver class. This will be the class that creates the individual derivative definitions.
  • We configure the destination plugin to accept entity translations. This is needed to ensure we are saving translations and not source entities. Check out Drupal\migrate\Plugin\migrate\destination\EntityContentBase for more info.
  • Unlike the previous migration, we define also a process mapping for the taxonomy term ID (tid). And we use the migration_lookup process plugin to map the IDs to the ones from the original migration. We do this to ensure that our migrated entity translations are associated to the correct source entities. Check out Drupal\migrate\Plugin\migrate\process\MigrationLookup for how this plugin works.
  • Specific to the destination type (content entities) we need to import a default value also in the content_translation_source if we want the resulting entity translation to be correct. And we just default this to English because that was the default language the original migration imported in. This is the source language in the translation set.
  • Finally, because we need to lookup in the original migration, we also define a migration dependency on the original migration. So that the original gets run, followed by all the translation ones.

You’ll notice another important difference: the term name is missing from the mapping. That will be handled in the deriver based on the actual language of the derivative because this is not something we can determine statically at this stage. So let’s see that now.

In our main module namespace we can create this very simple deriver (which we referenced in the migration above):

namespace Drupal\advanced_migrations;

use Drupal\Component\Plugin\Derivative\DeriverBase;
use Drupal\Core\Language\LanguageInterface;
use Drupal\Core\Language\LanguageManagerInterface;
use Drupal\Core\Plugin\Discovery\ContainerDeriverInterface;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * Deriver for the category translations.
 */
class CategoriesLanguageDeriver extends DeriverBase implements ContainerDeriverInterface {

  /**
   * @var \Drupal\Core\Language\LanguageManagerInterface
   */
  protected $languageManager;

  /**
   * CategoriesLanguageDeriver constructor.
   *
   * @param \Drupal\Core\Language\LanguageManagerInterface $languageManager
   */
  public function __construct(LanguageManagerInterface $languageManager) {
    $this->languageManager = $languageManager;
  }

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container, $base_plugin_id) {
    return new static(
      $container->get('language_manager')
    );
  }

  /**
   * {@inheritdoc}
   */
  public function getDerivativeDefinitions($base_plugin_definition) {
    $languages = $this->languageManager->getLanguages();
    foreach ($languages as $language) {
      // We skip EN as that is the original language.
      if ($language->getId() === 'en') {
        continue;
      }

      $derivative = $this->getDerivativeValues($base_plugin_definition, $language);
      $this->derivatives[$language->getId()] = $derivative;
    }

    return $this->derivatives;
  }

  /**
   * Creates a derivative definition for each available language.
   *
   * @param array $base_plugin_definition
   * @param LanguageInterface $language
   *
   * @return array
   */
  protected function getDerivativeValues(array $base_plugin_definition, LanguageInterface $language) {
    $base_plugin_definition['process']['name'] = [
      'plugin' => 'skip_on_empty',
      'method' => 'row',
      'source' => 'label_' . $language->getId(),
    ];

    $base_plugin_definition['process']['langcode'] = [
      'plugin' => 'default_value',
      'default_value' => $language->getId(),
    ];

    return $base_plugin_definition;
  }

}

All plugin derivers extend the Drupal\Component\Plugin\Derivative\DeriverBase and have only one method to implement: getDerivativeDefinitions(). And to make our class container aware, we implement the deriver specific ContainerDeriverInterface that provides us with the create() method.

The getDerivativeDefinitions() receives an array which contains the base plugin definition. So essentially our entire YAML migration file turned into an array. And it needs to return an array of derivative definitions keyed by their derivative IDs. And it’s up to us to say what these are. In our case, we simply load all the available languages on the site and create a derivative for each. And the definition of each derivative needs to be a “version” of the base one. And we are free to do what we want with it as long as it still remains correct. So for our purposes, we add two process mappings (the ones we need to determine dynamically):

  • The taxonomy term name. But instead of the simple Get plugin, we use the Drupal\migrate\Plugin\migrate\process\SkipOnEmpty one because we don’t want to create a translation at all for this record if the source column label_[langcode] is missing. Makes sense right? Data is never perfect.
  • The translation langcode which defaults to the current derivative language.

And with this we should be ready. We can clear the cache and inspect our migrations again. We should see a new one with the ID category_translations:ro (the base plugin ID + the derivative ID). And we can now run this migration as well and we’ll have our term translations imported.

Other examples

I think dynamic migrations are extremely powerful in certain cases. Importing translations is an extremely common thing to do and this is a nice way of doing it. But there are other examples as well. For instance, importing Commerce products. You’ll create a migration for the products and one for the product variations. But a product can have multiple variations depending on the actual product specification. For example, the product can have 3 prices depending on 3 delivery options. So you can dynamically create the product variation migrations for each of the delivery option. Or whatever the use case may be.

Conclusion

As we saw, the Drupal 8 migration system is extremely powerful and flexible. It allows us to concoct all sorts of creative ways to read, clean and save our external data into Drupal. But the reason this system is so powerful is because it rests on the lower-level plugin API which is meant to be used for building such systems. So migrate is one of them. But there are others. And the good news is that you can build complex applications that leverage something like the plugin API for extremely creative solutions. But for now, you learned how to get your translations imported which is a big necessity.

Some related articles you might enjoy

Comments

Hi Danny!

First off: thank you very much for your articles, those are always clear and detailed and great entrypoints to advanced concepts of Drupal (this is not sucking up, it's sincere ^^).

Maybe you could help me decide if idea I had in mind is good or bad.
Me and team are working on a project being "moving from Java-based, 7-year old techno to Drupal 8-based website".
This is a content-heavy site, for which editors had very little barriers and safeguards in the old engine. So the data is kinda a mess in many ways.

By chance, we have a DBA that took responsibility for creating a custom routine to export db and adapt it into a d8-structured db. At least we have current data to work on, which is nice. But for obvious reasons it's not very satisfying as is. Notably because everytime we see an inconsistency on data, we have to bubble up the information -> Long time to correct something although usually its simple enough that we could take care of it if we were on "living" platforms (it's currently recreated every day).

In your opinion, would it be a bad idea to use Migrate as a base to create "correction" routines? Using "same db" as a source (if necessary with 2 identical Drupal at start), several process each dedicated to a specific "correction" (removing tag in body, checking string format in business data), and destination "same database"?

I had no chance yet to really experiment Migrate by myself, but from my understanding it would allow...
- First, to have a relatively easy introduction to Migrate (because unless I misunderstand you can also only "extract", process and "import" a chunk of data and not whole content, since Migrate can "recognize updates to existing nodes".
So then we could expand progressively until we feel confident enough to tackle the "get data from old website" giant.
- Second, to use directly every Validation plugin we are bound to write to securize the writing of content in new platform.
- Third, to have a way to process data that facilitates controls over what happens and revert if necessary, with Gitable supervision of what is kept in the end in the process since it's all configuration files, and an easy way to work step by step (adding new process plugins once the others have been validated).

I ask because the other way I see this being done would be "simple" Drush command associated with a Queue Worker, which are other notions we have yet to master ^^.

But my feeling is that learning Migrate as early as possible is the best course of action for every "mass-scale data transaction/transformation", yet I'm afraid I'm trying to use it in a way too far from its "natural use-case".

Thanks in advance for reading and possibly giving hints ;)

Best regards,

 Laurent

Add new comment

You must have Javascript enabled to use this form.