Profile picture for user admin
Daniel Sipos
10 May 2016

Have you ever needed to run a custom query that returns all the nodes which reference two or more taxonomy terms? I have, and most of the time it was very simple because the requirement was for nodes which reference either term, rather than ALL the terms.

When we are dealing with an OR type of select query, it couldn't be easier:

$query = db_select('node', 'n');
$query->join('taxonomy_index', 'ti', 'ti.nid = n.nid');
$query->condition('ti.tid', array(1, 2));
$query->fields('n', array('nid'));
$result = $query->execute();

The above example keeps things simple by assuming the term reference maintains an index in the taxonomy_index table and that we already have our term IDs.

Things get a bit more complicated when we are trying to query the nodes which reference both term 1 AND term 2. After banging my head against the wall for a bit, I came up with a solution inspired by how Views generates its query:

$tids = array(1, 2);
$query = db_select('node', 'n');
foreach ($tids as $key => $tid) {
  $query->innerJoin('taxonomy_index', 'ti_' . $key, 'ti_' . $key . '.nid = n.nid AND ti_' . $key . '.tid = ' . $tid);
}
foreach ($tids as $key => $tid) {
  $query->condition('ti_' . $key . '.tid', $tid);
}
$query->fields('n', array('nid'));
$result = $query->execute();

So basically the solution is to create a join for each term ID we want to filter by. A crucial part of this is that the join needs to happen on the node ID between the two tables (as expected) but also on the term ID we are using for the filter matching the record in the taxonomy_index table. This will ensure that the two joins are not the same but that they reflect the relation between the node record and each individual term ID. Are you still following? Then, we just add our conditions on the newly created joins.

Caveats

This should work, and depending on the size of your dataset, it should not have too much of a performance impact. However, as no joins should be done if not absolutely necessary, investigate the possibility of querying for the nodes that reference the first term and then filtering the rest out using PHP. This can be a viable option if you know that usual result sets are not too big so you are not running array_filter() on 2000 terms. And of course, when possible, cache the query appropriately.

Profile picture for user admin

Daniel Sipos

CEO @ Web Omelette

Danny founded WEBOMELETTE in 2012 as a passion project, mostly writing about Drupal problems he faced day to day, as well as about new technologies and things that he thought other developers would find useful. Now he now manages a team of developers and designers, delivering quality products that make businesses successful.

Contact us

Comments

ethan moore 26 Jul 2016 19:16

bulk mail servers

Thank you for sharing these great sources and the information.

najib 25 Jul 2017 16:53

thanks bro

this article is verry helpfull is helped me to solve my problem thank you here is my email if you wanna some help or we can discuss later

Add new comment