eZ Find: How to return specific fields of indexed data


When working with eZ Find fetches, you may want to return only a specific subset of data for each search result, rather than the whole content object.

You can do that by using the eZ Find 'search' fetch's 'fields_to_return' parameter.

Why use eZ Find 'search' fetch's 'fields_to_return'

Accessing index data not available on the content object by default

Each Solr result document holds information that is not part of the content object's attribute or meta information.
This includes some additional meta data, as well as custom data added during indexing e.g. via eZ Find's Index Time Plugin mechanism.

That information can be used to filter searches, but is not returned in the result object unless 'fields_to_return' is used.

Performance

It makes no sense to fetch whole content objects when you only need a few fields.

  • Do you want to show 100 or 200 or 300 event titles with event dates (both of these being data_map attributes in this case) per page? No problem [1].
  • Do you want to get a large number of pre-filtered content object IDs to use in another fetch? No problem.
  • Do you want to create a large CSV set or data table quickly? No problem. (There is a much better way to handle this last case; see "... but wait! There's more" below.)

[1] This still depends on available server memory and speed, of course.

Depending on which information you need, you may be able to simply set the 'as_objects' parameter to false() by itself. This will return a lighter version of the results object, which contains a lot of commonly used information, e.g. name, published, is_invisible, main_url_alias. Fetching these light results also allows you to return a much larger number of results at a time without having to worry too much about the memory cost.
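
As a minimal sketch of such a light fetch, the template below lists names and publish dates and collects the content object IDs for later use. The class identifier 'event' and the variable names are assumptions made for illustration; the keys name, published and id are the ones visible in the result dump further down.

```
{* 'event' is a hypothetical class identifier, used for illustration only *}
{def $light_search = fetch( 'ezfind', 'search', hash(
         'query', '*',
         'class_id', 'event',
         'as_objects', false(),
         'limit', 200
     ) )
     $object_ids = array()
}
{foreach $light_search.SearchResult as $result}
    {* Each $result is a plain array, not a content object *}
    {$result.name|wash} ({$result.published|wash})
    {set $object_ids = $object_ids|append( $result.id )}
{/foreach}
{* $object_ids now holds the content object IDs of all results *}
```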

In case you need attribute data or custom indexed data you will need to add 'fields_to_return' to the fetch, which will return a similarly light version of the results object, with the additional field information requested.

There is still some overhead here, as each result returned contains a number of meta fields and other information returned by default. Unless you really want to push it, using the standard fetch with 'fields_to_return' should be fine.

If you do want to push it, you can always go further, using the 'rawSolrRequest' fetch.

How to use eZ Find 'search' fetch's 'fields_to_return'

Please note: The examples below are based on eZ Find LS 5.3, which uses Solr 4.7.
Older versions of eZ Find using Solr 3.x feature an older version of the admin interface accessible via http://<hostname>:8983/solr/admin
The query tool is available in all versions of the admin interface and the query syntax remains the same as well.

Here is an example fetch in an eZ Publish template:

{def
     $search_hash = hash(
        'query', '*',
        'class_id', 'secondary_content',
        'fields_to_return', array( 'meta_owner_name_t', 'attr_main_title_t' ),
        'as_objects', false(),
        'limit', 10
    )
    $search = fetch( 'ezfind', 'search', $search_hash )
}
{$search.SearchResult.0|dump( show, 2 )}

The result dump shows:

Attribute Type Value
guid string 'abc123'
installation_id string 'xyz987'
installation_url string 'http://solr_dev/'
name string 'About Us and Generic Content Page'
language_code string 'eng-CA'
owner_name string 'Administrator User'
id integer 123
main_node_id integer 124
published string '2015-11-23T04:06:26Z'
path_string array Array(1)
>0 string '/1/2/124/'
is_invisible array Array(1)
>0 boolean false
main_url_alias string 'About-Us-and-Generic-Content-Page'
main_path_string string '/1/2/124/'
fields array Array(1)
>attr_main_title_t string 'About Us and Generic Content Page'
highlight string ''
elevated boolean false

There are a few things to note about the fetch code, as well as the data returned by it:

  1. To be able to use the 'fields_to_return' parameter, you also need to set 'as_objects' to false().
  2. Field names for 'fields_to_return' are the actual Solr field names.

    Note: If you are unsure what fields are available, refer to the Solr Admin interface's Query tool, which can be reached via http://<your_hostname>:8983/solr/#/ezp-default/query on a default installation.
    Use a wildcard (*) for the 'fl' field to see all available fields for a query. You can also use the wildcard in partial field names e.g. meta_*

  3. meta_* fields will be accessible via keys on the search result array, but without the meta_ prefix or field type suffix. (e.g. meta_owner_name_t becomes owner_name).
  4. as_* fields (binary data) will be accessible via keys of the data_map array on the search result array, but without the as_ prefix or field type suffix.
  5. All other fields end up in the fields array on the search result array. Unlike the meta_ and as_ fields, however, the key for these fields is the actual field name used in the 'fields_to_return' parameter.
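
To illustrate points 3 and 5, here is a short sketch that loops over the results of the example fetch above. It assumes $search is still defined and that both requested fields are present on every result.

```
{foreach $search.SearchResult as $result}
    {* meta_owner_name_t is exposed as the top-level key owner_name *}
    {* attr_main_title_t keeps its full Solr name inside the fields array *}
    {$result.owner_name|wash}: {$result.fields.attr_main_title_t|wash}
{/foreach}
```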

For further insight on how the eZ Find fetches work, you can refer to:

  • extension/ezfind/modules/ezfind/function_definition.php to see all available eZ Find fetch parameters
  • search() in extension/ezfind/classes/ezfmodulefunctioncollection.php to see how the fetch parameters are handled
  • buildResultObjects() in extension/ezfind/search/plugins/ezsolr/ezsolr.php to see how the search results object is created

Why isn't this documented?

I don't have an answer for that, but it brings up two important aspects of development in general:

  1. As a project maintainer: Keep your code documentation up to date, always. This is true for both inline and external docs.
     
  2. As a developer: Read the source code!

The first is arguably more important, but we all know good/complete documentation is, sadly, not common. The source code, however, gives you some insight into what's going on behind the scenes and, more importantly, will reveal to you functionality that you may not be aware of (yet).

Going further using rawSolrRequest

As the documentation states, the 'rawSolrRequest' fetch function: "Allows for “raw” Solr requests (not for normal use, but for example to search “foreign” Solr or Lucene indexes)."

You should keep that in mind, but don't let it stop you.

You will likely only need to use 'rawSolrRequest' as an exception or for debugging. It shouldn't be your 'go to' solution.

The standard 'search' fetch in combination with 'as_objects' and 'fields_to_return' will get you similar data, while making use of the standard attribute filter and sort syntax, as well as the CMS permissions and visibility layers, with only minor overhead. Using the standard fetches, you also won't have to deal with authentication and request parameter configuration, as that is handled for you as part of the fetches.

Here is an example 'rawSolrRequest' fetch in an eZ Publish template:

{def
    $use_auth = ezini( 'SolrBase', 'SearchServerAuthentication', 'solr.ini' )|eq( 'enabled' )
    $auth_prefix = cond( $use_auth, concat( ezini( 'SolrBase', 'SearchServerUserPass', 'solr.ini' ), '@' ), false(), '' )
    $raw_base_url = ezini( 'SolrBase', 'SearchServerURI', 'solr.ini' )|explode( '://' )|implode( concat( '://', $auth_prefix ) )
 
    $query = '*'
    $raw_hash = hash(
        'baseURL', $raw_base_url,
        'request', '/select',
        'parameters', hash(
            'q', $query,
            'rows', 10,
            'fq', 'meta_class_identifier_ms:secondary_content',
            'fl', 'meta_owner_name_t,attr_main_title_t'
        )
    )
    $raw = fetch( 'ezfind', 'rawSolrRequest', $raw_hash )
}
{$raw.response.docs|dump( show, 3 )}

The result doc dump shows:

Attribute Type Value
0 array Array(2)
>meta_owner_name_t string 'Administrator User'
>attr_main_title_t string 'About Us and Generic Content Page'

As you can see, all the default fields present in the standard 'search' fetch are gone now, giving you only the fields requested.

It's important to note that the 'rawSolrRequest' fetch returns a different result structure than the standard fetch. Dump $raw to get an overview of what's available.
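
A rough sketch of working with that structure follows. It assumes the fetch returns Solr's response unchanged, so that the match count sits under response.numFound as in a standard Solr select response; dump $raw first to confirm this on your installation.

```
{* Total number of matches as reported by Solr (assumed to be passed through as-is) *}
{$raw.response.numFound} results found.
{foreach $raw.response.docs as $doc}
    {* Keys are the raw Solr field names requested via 'fl' *}
    {$doc.meta_owner_name_t|wash}: {$doc.attr_main_title_t|wash}
{/foreach}
```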

The fetch example starts off by determining if Solr is using authentication. Then it creates the base URL used in the fetch by determining the protocol as well as the authentication credentials used, based on the settings in eZ Find's solr.ini:

$use_auth = ezini( 'SolrBase', 'SearchServerAuthentication', 'solr.ini' )|eq( 'enabled' )
$auth_prefix = cond( $use_auth, concat( ezini( 'SolrBase', 'SearchServerUserPass', 'solr.ini' ), '@' ), false(), '' )
$raw_base_url = ezini( 'SolrBase', 'SearchServerURI', 'solr.ini' )|explode( '://' )|implode( concat( '://', $auth_prefix ) )

The fetch itself takes three parameters - 'baseURL', 'request', and 'parameters':

  1. 'baseURL' is as described above: the URL eZ Find will use to make its request against Solr
  2. 'request' is the type of request made ('/select' in this example)
  3. 'parameters' is a hash of Solr request parameters [2]
    1. 'q' is the query string
    2. 'start' is 'offset' in the 'search' fetch
    3. 'rows' is 'limit' in the 'search' fetch
    4. 'fq' is 'filter' in the 'search' fetch
    5. 'fl' is the field list; equivalent to 'fields_to_return' in the 'search' fetch

[2] For a full list of parameters check the Solr Admin interface's Query tool.

$raw_hash = hash(
    'baseURL', $raw_base_url,
    'request', '/select',
    'parameters', hash(
        'q', $query,
        'start', 0,
        'rows', 10,
        'fq', 'meta_class_identifier_ms:secondary_content',
        'fl', 'meta_owner_name_t,attr_main_title_t'
    )
)
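
Since 'start' and 'rows' correspond to 'offset' and 'limit', paging through raw results comes down to computing 'start' from the page size. The sketch below assumes a hypothetical 1-based page number in $page and reuses $raw_base_url and $query from the example above; the variable names are illustrative only.

```
{* $page is a hypothetical 1-based page number; $rows is the page size *}
{def $page = 3
     $rows = 10
     $start = mul( sub( $page, 1 ), $rows )
     $paged_hash = hash(
         'baseURL', $raw_base_url,
         'request', '/select',
         'parameters', hash(
             'q', $query,
             'start', $start,
             'rows', $rows,
             'fq', 'meta_class_identifier_ms:secondary_content',
             'fl', 'meta_owner_name_t,attr_main_title_t'
         )
     )
     $paged = fetch( 'ezfind', 'rawSolrRequest', $paged_hash )
}
{$paged.response.docs|dump( show, 3 )}
```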

Note: 'rawSolrRequest' does not support the use of the 'wt' parameter, which is used to change the return data type.

rawSolrRequest for debugging

The Solr admin interface is the best tool to debug your data and query problems, but you may not always have access to it for security reasons.

In such cases the 'rawSolrRequest' fetch lets you quickly and effectively see what data is available and spot any problems with the query used to retrieve it. As long as you have access to a template in which to put your debug query, you're set.
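
A debug query of that kind might look like the sketch below. It assumes $raw_base_url has been built as in the earlier example, and it uses the 'fl' wildcard to return every stored field of a single matching document.

```
{def $debug = fetch( 'ezfind', 'rawSolrRequest', hash(
         'baseURL', $raw_base_url,
         'request', '/select',
         'parameters', hash(
             'q', '*',
             'rows', 1,
             'fq', 'meta_class_identifier_ms:secondary_content',
             'fl', '*'
         )
     ) )
}
{* Dump the whole response, not just the docs, to see everything Solr returned *}
{$debug|dump( show, 4 )}
```

Remember to remove such a template once you are done, as it exposes raw index data to anyone who can view it.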

... but wait! There's more

Results of eZ Find fetches are returned as PHP arrays. Solr, however, is able to return result data in a number of different formats: JSON, XML, Python, Ruby, PHP, and CSV.

The result data type is controlled via the 'wt' parameter on a Solr select request.

Our example fetch URL run via the Solr Admin interface would look like this:

http://<your_hostname>:8983/solr/ezp-default/select?q=*&wt=csv&fq=meta_class_identifier_ms%3Asecondary_content&start=0&rows=10&fl=meta_owner_name_t%2Cattr_main_title_t

And would return:

meta_owner_name_t,attr_main_title_t
Administrator User,About Us and Generic Content Page

Not too surprising or exciting, until you urgently need to create large reports on your data structure as CSVs, which this type of request handles with ease. A few thousand rows at a time!


© 2008 - 2017 Mugo Web. All rights reserved.