
Making custom content scripts more efficient in eZ Publish

By: Peter Keung | January 11, 2013 | eZ Publish development tips and Site performance

For those who write long-running scripts in eZ Publish to perform operations (move, rename, update, and so on) on many content objects, here are a couple of quick tips to speed up the scripts and make them more efficient.

A few examples of custom scripts you might write in eZ Publish:

  • Periodic data imports of new products
  • Recategorization of reviews into different categories
  • Notifying customers that their subscriptions are going to expire

In each case you are potentially working with a lot of data. If you rely on the default fetch functions and production settings alone, your scripts are likely to be unnecessarily slow and resource-intensive.

1. Looping through large data sets and memory management

By default, when you fetch content, eZ Publish will cache the object results in memory. If you are working with large amounts of content, you can quickly exhaust the available memory. To solve this problem, you can fetch your data in smaller chunks and clear the in-memory cache between loops. Here is a generic example to perform an operation on all user content objects:

$parentNodeID = 5;
$limit = 100;
$parameters = array( 'parent_node_id' => $parentNodeID
                    ,'class_filter_type' => 'include'
                    ,'class_filter_array' => array( 'user' ) );
// Get the total user count
$count  = eZFunctionHandler::execute( 'content', 'tree_count', $parameters );
$offset = 0;
$parameters['limit'] = $limit;

// Fetch and process the users one chunk at a time
while( $offset < $count )
{
    $parameters['offset'] = $offset;
    $users = eZFunctionHandler::execute( 'content', 'tree', $parameters );
    foreach( $users as $user )
    {
        // Perform the necessary operation on the users directly, or store information about what should be done outside the loop
    }

    // Increment the offset until we've gone through every user
    $offset += $limit;

    // Clear the eZ Publish in-memory object cache
    eZContentObject::clearCache();
}

Note that eZContentObject::clearCache(), run on every loop, essentially does the following:

unset( $GLOBALS['eZContentObjectContentObjectCache'] );
unset( $GLOBALS['eZContentObjectDataMapCache'] );
unset( $GLOBALS['eZContentObjectVersionCache'] );
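
If you write several scripts like this, the chunked loop above can be pulled into a small reusable helper. The following is only a minimal sketch of that pattern: processInChunks() and its three callbacks are hypothetical names, not part of eZ Publish, and a plain array of fake user IDs stands in for the real content fetch so that the example runs on its own (PHP 5.3 or later).

```php
<?php
/**
 * Walk a large result set in fixed-size chunks.
 *
 * $fetchChunk( $offset, $limit ) returns the items for one chunk,
 * $handleItem( $item ) does the per-item work, and
 * $afterChunk() runs once per chunk (for example, to clear caches).
 * All names here are hypothetical and only illustrate the pattern.
 */
function processInChunks( $total, $limit, $fetchChunk, $handleItem, $afterChunk )
{
    $processed = 0;
    for ( $offset = 0; $offset < $total; $offset += $limit )
    {
        foreach ( call_user_func( $fetchChunk, $offset, $limit ) as $item )
        {
            call_user_func( $handleItem, $item );
            $processed++;
        }
        // Release per-chunk state before fetching the next chunk
        call_user_func( $afterChunk );
    }
    return $processed;
}

// Stand-in data source: 250 fake user IDs instead of real content objects
$allUsers = range( 1, 250 );
$seen = array();
$chunksCleared = 0;

$processed = processInChunks(
    count( $allUsers ),
    100,
    function ( $offset, $limit ) use ( $allUsers ) {
        return array_slice( $allUsers, $offset, $limit );
    },
    function ( $userID ) use ( &$seen ) {
        $seen[] = $userID;
    },
    function () use ( &$chunksCleared ) {
        // In a real script this is where eZContentObject::clearCache() would go
        $chunksCleared++;
    }
);

echo "Processed $processed items in $chunksCleared chunks\n";
// prints "Processed 250 items in 3 chunks"
```

In an actual eZ Publish script, the fetch callback would wrap the eZFunctionHandler::execute() call shown earlier, and the per-chunk callback would call eZContentObject::clearCache().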

2. Disabling caching in the context of the script

In a production environment, you should have all caching turned on: perhaps most importantly within eZ Publish, the view cache. However, this means that every time you modify content (create, update, move, and so on) the system will execute all of the relevant view cache clearing rules (for example, it might clear the cache for an article and all of its related articles). This can slow down the script by a factor of 2 or more and can slow down the site in general. If you are operating on many content objects in the same script, it is often preferable to turn off caching in the context of the script, and then clear the necessary cache after the script is finished. To implement this, you can paste this code at the top of your script:

eZINI::instance()->setVariable( 'ContentSettings', 'ViewCaching', 'disabled' );
eZINI::instance()->setVariable( 'ContentSettings', 'StaticCache', 'disabled' );
eZINI::instance()->setVariable( 'ContentSettings', 'PreViewCache', 'disabled' );

If you do not use static cache or pre-view cache then only the first line is necessary.

When the script is done running, you can then clear the content cache for the site as a whole, or for specific subtrees, or for specific objects:

eZContentCacheManager::clearContentCacheIfNeeded( $objectID );
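
One way to clear the cache for specific objects is to record the IDs of the objects your script touches and then clear each one once at the end. The sketch below assumes that approach; ModifiedObjectTracker is a hypothetical helper class, not part of eZ Publish, and a stand-in callback replaces the real cache call so that the example runs on its own.

```php
<?php
/**
 * Collects the IDs of content objects touched by a script so that their
 * caches can be cleared once at the end.
 * Hypothetical helper, not part of eZ Publish.
 */
class ModifiedObjectTracker
{
    private $objectIDs = array();

    public function markModified( $objectID )
    {
        // Using the ID as the array key deduplicates repeated modifications
        $this->objectIDs[(int)$objectID] = true;
    }

    public function flush( $clearCallback )
    {
        $cleared = 0;
        foreach ( array_keys( $this->objectIDs ) as $objectID )
        {
            call_user_func( $clearCallback, $objectID );
            $cleared++;
        }
        $this->objectIDs = array();
        return $cleared;
    }
}

$tracker = new ModifiedObjectTracker();
foreach ( array( 101, 102, 101, 103, 102 ) as $objectID )
{
    // ... modify the object here, with view caching disabled ...
    $tracker->markModified( $objectID );
}

// Stand-in callback that only records which IDs would be cleared.
// In an eZ Publish script, the callback could instead be
// array( 'eZContentCacheManager', 'clearContentCacheIfNeeded' ).
$clearedIDs = array();
$count = $tracker->flush( function ( $objectID ) use ( &$clearedIDs ) {
    $clearedIDs[] = $objectID;
} );

echo "Cleared cache for $count objects\n";
// prints "Cleared cache for 3 objects"
```

Deduplicating the IDs means an object that is modified several times during the run only has its cache cleared once.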


On a related note, be sure to check out our eZ Publish command line tool "eep" to help you explore and manipulate eZ Publish content even more efficiently!