November 05, 2012

New in EPiServer Find – Unified Search

“Unified search is a new concept in Find’s .NET API that aims to provide the benefits of indexing objects using the “dumbed down” least common denominator approach while still maintaining Find’s original power of indexing more or less full .NET objects. … It allows us to query objects as if they were implementing a common interface without actually having to modify the indexed objects.”

Last week EPiServer released the new version of their CMS, EPiServer CMS 7. While this new version contains a few nice but minor tweaks like a whole new editor UI, revamped API and multi channel mojo there was actually another, much more significant, release last week. The new version of the EPiServer Find .NET API and CMS integration!

Kidding aside, the new version, which is available for download from EPiServer World as well as from EPiServer’s NuGet feed, primarily features tweaks and improvements to the CMS integration, making Find even more seamlessly integrated with the CMS. It does have one rather interesting brand new feature though, a concept called Unified Search.

Executive summary

This is a long post in which I’ll discuss the Unified Search concept in quite some detail. If you don’t care or don’t have the time for that, here’s what you need to know.

In the new version of Find’s .NET API and CMS integration you can do this:

var result = SearchClient.Instance
    .UnifiedSearchFor("some search term")
    .GetResult();

The above code will search for both PageData objects and files in VPPs (UnifiedFile). Using a couple of lines of code you can also add other types that will be included in the search. The result object will contain hit objects with a title and an excerpt, both of which can be highlighted, as well as some other properties that are commonly used in search result listings.

What it is

EPiServer Find takes a new approach to search with the aim of harnessing the power of search engines for more than just search pages. It does this by allowing developers to index and query objects of existing classes, such as PageData objects, without having to map them to some intermediate type that is then put into the index. It also indexes the objects in such a way that developers can later query them in a number of ways using a fluent API that doesn’t require much in terms of special search engine skills. This is of course very powerful as it allows us to use Find both for free text search and for deterministic querying such as navigation and listings.

Most other search products don’t have this functionality but instead index some sort of least common denominator. That is, while the actual data being indexed may have widely different characteristics, all content is indexed the same way out of the box. For example, an article page, a recipe, a product and a user comment may all have a title and a content field in the index. If we later want to distinguish between articles and recipes we’ll have to find a way to do so by filtering on parts of the URL or by outputting some meta data that contains type information.

As you might have guessed I think Find’s approach is better as it doesn’t “dumb down” objects/content in order to index them, allowing us as developers to query them almost as if they were in memory. The approach of using a least common denominator does have a couple of benefits though – it makes it easier to search over different, unrelated, types and it makes it easier to build generic functionality for querying and displaying search results (i.e. helper methods for displaying stuff).

Of course these benefits typically come at a steep price – the unique characteristics of the indexed objects are lost and we have to spend time customizing how they are indexed, often by outputting a bunch of meta data in the markup, to get at least some of them into the index. Unified search is a new concept in Find’s .NET API that aims to provide the benefits of indexing objects using the “dumbed down” least common denominator approach while still maintaining Find’s original power of indexing more or less full .NET objects.

The problem

In order to build generic search functionality using search engines we typically need to index all content or objects with a base set of fields that is the same regardless of their original type. This typically forces us to map objects that should be indexed to an intermediate object that matches that base set of fields/properties.

Imagine we have two classes, A and B, where A has a Title property and B has a Headline property, and we want to search for instances of both types in both of those fields. We then have to make the search engine understand that Title and Headline are essentially the same. Using a crawler based search engine we’d do that by outputting both properties in an H1 tag. With other search solutions we’d have a third class, let’s call it IndexDocument. When instances of A and B should be indexed we’d provide code that would map instances of A to IndexDocument as well as code that would map instances of B to IndexDocument.

There must be a better way!

The drawback of the conventional way of indexing objects using search engines is that we have to map objects to something that they are not. This can be tedious work and, worse, we lose the ability to query them by type and by that type’s unique characteristics. Clearly imposing such limitations on Find’s rich query API would be a step back. So, how can we eat the cake and have it too? Well, taking a step back from search engines and looking at object oriented programming, how would we do it there? We’d use interfaces!

Using the example of classes A and B we wanted to map both types to the third class, IndexDocument, in order to easily search over both types. But at the same time we wanted to index the objects as their original types (A and B). With object oriented programming we could turn the IndexDocument class into an interface, IIndexDocument, and have both A and B implement that.
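As a minimal sketch of that idea in plain C#, with no search engine involved: A, B and IIndexDocument are the hypothetical types from the example, while the SearchTitle member and the FindByTitlePrefix helper are names I've made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical common interface; SearchTitle is an assumed member name.
public interface IIndexDocument
{
    string SearchTitle { get; }
}

public class A : IIndexDocument
{
    public string Title { get; set; }

    // A keeps its own Title property and exposes it through the interface.
    public string SearchTitle { get { return Title; } }
}

public class B : IIndexDocument
{
    public string Headline { get; set; }

    // B does the same with Headline.
    public string SearchTitle { get { return Headline; } }
}

public static class InterfaceExample
{
    // Searches across both types through the interface.
    // The returned objects are still instances of A and B.
    public static List<IIndexDocument> FindByTitlePrefix(
        IEnumerable<IIndexDocument> documents, string prefix)
    {
        return documents
            .Where(d => d.SearchTitle != null
                && d.SearchTitle.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();
    }
}
```

Since the hits keep their original types we can still narrow them down afterwards, for instance with OfType<B>(), which is exactly what the mapping approach takes away from us.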

Of course Find’s .NET API already supports this as it indexes the full inheritance hierarchy of objects including interfaces. There’s just one problem – what if we aren’t able to modify the classes, or for some other reason don’t want them to implement a common interface? This is where Unified Search comes in. It allows us to query objects as if they were implementing a common interface without actually having to modify the indexed objects.


The concept of Unified Search consists of four main parts:

  • A common interface declaring properties that we may want to use when building search (not querying) functionality – ISearchContent.
  • An object that maintains a configurable list of types that should be searched when searching for ISearchContent as well as optional, specific rules for filtering and projecting those types when searching – IUnifiedSearchRegistry, which is exposed by the IClient.Conventions.UnifiedSearchRegistry property.
  • Classes for search results that are returned when searching for ISearchContent – UnifiedSearchResults, which contains a number of UnifiedSearchHit objects.
  • Special methods for building and executing search queries – UnifiedSearch() and UnifiedSearchFor() and an overloaded GetResult() method.


ISearchContent

The ISearchContent interface resides in the EPiServer.Find.UnifiedSearch namespace and is the least common denominator. It declares a large number of properties, all with names prefixed with “Search”, that we may want to search for. The properties range from common ones such as SearchTitle and SearchText to more specialized ones such as SearchGeoLocation and SearchAttachment (for files such as Word documents).


IUnifiedSearchRegistry

IUnifiedSearchRegistry, also residing in EPiServer.Find.UnifiedSearch but typically accessed by fetching it from the client’s conventions, exposes methods for building and configuring a list of types that should be included when searching for ISearchContent. Apart from methods for adding (the Add method) and listing types (the List method) it also declares methods that allow us to add rules for how specific types should be filtered when searching as well as how to project found documents to the common hit type (UnifiedSearchHit).

Unless we want to include some additional type or modify the rules for an already added type we typically don’t have to care about the registry as the CMS integration will automatically add PageData and UnifiedFile to it.

UnifiedSearchResults and UnifiedSearchHit

While ISearchContent provides a decent common denominator for fields to search in it wouldn’t be useful to get back instances of it as the result of a search query. For instance, while we may want to search in the full text in an indexed object we typically only want a small snippet of the text back which we’ll show in the search results listing. Also, it would be technically problematic to get instances of ISearchContent back from the search engine as the matched objects don’t actually implement that interface, or at least they don’t have to.

Therefore, when we search for ISearchContent and invoke the GetResult method we won’t get back instances of ISearchContent. Instead we’ll get back an instance of the UnifiedSearchResults class which contains a number of UnifiedSearchHit objects. A UnifiedSearchHit object contains a number of properties that we typically would want to show for each search result, such as Title, Url and Excerpt (a snippet of text). It also has a number of other properties such as PublishDate, ImageUri, Section, FileExtension and TypeName.

Methods for building and executing queries

In order to search for ISearchContent we can use the regular Search method, i.e. client.Search<ISearchContent>(). In that case we’ll be in charge of what fields to search in when building free text search. However, since ISearchContent is a special type that the .NET API knows about, there are a couple of methods that take care of adding some sensible defaults for us – UnifiedSearch() and UnifiedSearchFor().

More importantly the Unified Search concept also adds a new GetResult method. As this has the same name as the regular method for executing search queries we don’t really have to do anything special to use it. The compiler will choose to use it for us as it has a more specific generic type constraint than the other GetResult methods. But, we should be aware of what it does.

The GetResult method will modify the search query that we have built up so that it won’t just search for objects that implement ISearchContent but also for all types that have been added to the UnifiedSearchRegistry. It will also proceed to add a projection from ISearchContent to UnifiedSearchHit with some nice sensible defaults, along with any type specific projections that have been added to the UnifiedSearchRegistry. Finally, before executing the search query like the regular GetResult method, it will also add any type specific filters that have been added to the UnifiedSearchRegistry.

Once we invoke GetResult the search query will search over types that may not implement ISearchContent, but as we (hopefully) have specified that we should only search in, or filter on, a number of fields that are declared by ISearchContent we’ll only search in fields with those names, even if the objects don’t implement ISearchContent.

The GetResult method has an overload that requires an argument of type HitSpecification. Using this we can control the length of the excerpt, whether titles and excerpts should be highlighted, as well as a number of other things.
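A rough sketch of what using that overload could look like follows. HitSpecification, UnifiedSearchFor and GetResult are the names discussed in this post, but the property names ExcerptLength, HighlightTitle and HighlightExcerpt, as well as the SearchExample wrapper class, are assumptions on my part, so check them against the version of the API you're running.

```csharp
using EPiServer.Find;
using EPiServer.Find.Framework;
using EPiServer.Find.UnifiedSearch;

// Hypothetical helper class, only here to give the snippet a home.
public static class SearchExample
{
    public static UnifiedSearchResults Search(string term)
    {
        // Assumed property names: controls excerpt length and highlighting.
        var hitSpecification = new HitSpecification
        {
            ExcerptLength = 200,
            HighlightTitle = true,
            HighlightExcerpt = true
        };

        return SearchClient.Instance
            .UnifiedSearchFor(term)
            .GetResult(hitSpecification);
    }
}
```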

How it works

The unified search concept utilizes two key concepts: the fact that the type hierarchy is indexed for objects and the fact that the search engine doesn’t care what type declares a given field (property) as long as it has the expected name and type. Let’s take an example. Imagine we have added two classes A and B to the registry and have the below code.

var result = SearchClient.Instance.UnifiedSearch()
    .Filter(x => x.SearchTitle.Prefix("A"))
    .GetResult();

This code will search the index for objects that either implement ISearchContent or which are of types A or B. It will then filter those, requiring that they have a field named SearchTitle with a value that starts with “A”. Objects that implement ISearchContent will have such a field but A and B may not. If they don’t, the filter won’t match and instances of A and B won’t be returned. However, if they do have such a field, the search engine doesn’t care about why they have it. That is, it won’t care about the fact that they don’t have ISearchContent.SearchTitle. The type filtering is something separate and as long as they have a string field named SearchTitle it can filter on it.

Of course, in order for A or B objects to be included in the result of the above query we must add a SearchTitle field to them, but we don’t have to make them implement the full ISearchContent interface. Also, since the .NET API and the search engine don’t distinguish between properties and methods we can create an extension method for A or B and configure the client’s conventions to include it when indexing instances of those types, thereby adding the SearchTitle field without modifying the classes.
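Put together, the set-up for a class we can't modify might look something like the sketch below. BlogComment, its extension method and the UnifiedSearchSetup class are hypothetical, and the exact shape of the calls (the generic Add<T>() on the registry and ForInstancesOf<T>().IncludeField() on the conventions) is an assumption, so treat it as an outline and not as the definitive API.

```csharp
using EPiServer.Find;
using EPiServer.Find.ClientConventions;
using EPiServer.Find.Framework;

// Hypothetical class that we can't, or don't want to, modify.
public class BlogComment
{
    public string Subject { get; set; }
}

public static class BlogCommentExtensions
{
    // Supplies a field with the same name and type as ISearchContent.SearchTitle.
    public static string SearchTitle(this BlogComment comment)
    {
        return comment.Subject;
    }
}

public static class UnifiedSearchSetup
{
    // Assumed to run once at start-up, before anything is indexed or searched.
    public static void Configure()
    {
        var conventions = SearchClient.Instance.Conventions;

        // Include BlogComment when searching for ISearchContent.
        conventions.UnifiedSearchRegistry.Add<BlogComment>();

        // Index the extension method's return value as a SearchTitle field.
        conventions.ForInstancesOf<BlogComment>()
            .IncludeField(x => x.SearchTitle());
    }
}
```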

In other words, we can think of unified search as something similar to mixins in object oriented programming. By adding types to the UnifiedSearchRegistry we can “mix in” that they should be included when searching for ISearchContent and by adding properties to the types and/or extension methods to them we can “mix in” some, or all, of the members of ISearchContent.
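To make the two key concepts a bit more concrete, here's a toy in-memory model of them. To be clear, this is not how Find or the search engine is implemented. ToyIndex, IndexedDocument and every member on them are invented for the illustration; the only things borrowed from the post are the classes A and B and the SearchTitle field name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for an indexed document: the type names it was indexed
// under plus its fields, keyed by field name only.
public class IndexedDocument
{
    public HashSet<string> TypeHierarchy = new HashSet<string>();
    public Dictionary<string, string> Fields = new Dictionary<string, string>();
}

public class ToyIndex
{
    private readonly List<IndexedDocument> documents = new List<IndexedDocument>();

    // Plays the role of the registry: type names included in a unified search.
    private readonly HashSet<string> registry =
        new HashSet<string> { "ISearchContent" };

    public void Register(Type type) { registry.Add(type.Name); }

    // Indexes an object under its class, base classes and interfaces.
    // Extra fields, such as values computed by extension methods, are
    // passed in by the caller.
    public void Index(object item, IDictionary<string, string> extraFields = null)
    {
        var doc = new IndexedDocument();
        for (var t = item.GetType(); t != null; t = t.BaseType)
            doc.TypeHierarchy.Add(t.Name);
        foreach (var i in item.GetType().GetInterfaces())
            doc.TypeHierarchy.Add(i.Name);

        var properties = item.GetType().GetProperties()
            .Where(p => p.GetIndexParameters().Length == 0);
        foreach (var p in properties)
        {
            var value = p.GetValue(item, null);
            if (value != null) doc.Fields[p.Name] = value.ToString();
        }

        if (extraFields != null)
            foreach (var f in extraFields) doc.Fields[f.Key] = f.Value;

        documents.Add(doc);
    }

    // The type filter and the field filter are independent of each other:
    // any registered type with a matching field is a hit, no matter which
    // class or method the field came from.
    public List<IndexedDocument> UnifiedPrefixSearch(string field, string prefix)
    {
        return documents
            .Where(d => d.TypeHierarchy.Overlaps(registry))
            .Where(d => d.Fields.ContainsKey(field)
                && d.Fields[field].StartsWith(prefix, StringComparison.Ordinal))
            .ToList();
    }
}

public class A { public string Title { get; set; } }
public class B { public string SearchTitle { get; set; } }

public static class AExtensions
{
    // "Mixes in" a SearchTitle for A without modifying the class.
    public static string SearchTitle(this A a) { return a.Title; }
}
```

If we register A and B, index an A along with the value of its SearchTitle() extension method and a B with its own SearchTitle property, both will match a prefix search on SearchTitle. An A indexed without the extra field won't, even though its type is registered.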

How to use it

Using unified search to search for CMS content, both pages and uploaded files, is easy. Simply create a search query using either the UnifiedSearch or UnifiedSearchFor methods invoked on the SearchClient. Execute the query using GetResult and do what you want with the result, typically iterate over each hit in a view.

using EPiServer.Find;
using EPiServer.Find.Framework;
using EPiServer.Find.Cms;
using EPiServer.Find.UnifiedSearch;

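var result = SearchClient.Instance
    .UnifiedSearchFor("some search term")
    .GetResult();

foreach (UnifiedSearchHit hit in result)
{
    // Render the hit, e.g. hit.Title, hit.Url and hit.Excerpt
}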

If you want to customize what is indexed and/or returned for your CMS content you can either add a property with the same type and name as one of the properties in ISearchContent or create and include an extension method matching such a property. For instance, the CMS integration includes a default SearchSection method for PageData objects. If we don’t like that one, or want to modify it in some cases, we can add a property named SearchSection to our page type class.

public abstract class SitePageData : PageData
{
  public virtual string SearchSection
  {
    get
    {
      //Fall back to the default method from the CMS integration
      var section = this.SearchSection();
      if (!string.IsNullOrWhiteSpace(section))
        return section;

      if (ParentLink.CompareToIgnoreWorkID(...))
        return PageName;

      return null;
    }
  }

  //Other properties
}

The same goes for SearchText: the CMS integration includes a default method which we can override by adding our own property:

public class NewsPage : StandardPage
{
  public string SearchText
  {
    get { return MainBody.ToHtmlString(...); }
  }
  //Other properties
}

Type conditional filtering

So we’re able to search for objects that don’t implement ISearchContent as if they did. If they happen to have properties or included methods whose names match properties declared in ISearchContent they’ll also be returned if we filter by them. But keep in mind that what’s stored in the index is still the full objects. This, and the fact that we can filter by type using Find, means that we can apply additional criteria to objects of some types using type conditional filtering.

For instance, let’s say we want to search for everything that is included as ISearchContent (typically PageData and UnifiedFile) and apply a filter. For PageData objects we also want to add an additional criterion. We can then do things like this:

var result = SearchClient.Instance.UnifiedSearch()
  .Filter(x =>
    x.SearchTitle.Prefix("A")
    & (!x.MatchTypeHierarchy(typeof(PageData))
       | ((PageData)x).CreatedBy.Match("Joel")))
  .GetResult();

In the above code we first add a filter that requires objects to have a SearchTitle field with a value starting with “A”. We then also require the objects to either NOT be of type PageData OR, if they are, be created by someone named Joel.

Looks a bit complex? Yes. Something we’d do every day? Probably not. Powerful? I think so.

When to use it

The Unified Search concept is generally useful when you:

  • Build standard search pages that don’t require you to filter on type specific properties.
  • Build generic functionality and don’t know what types it will be used for.

The regular Find query API is better for:

  • Most querying scenarios. That is, content retrieval/navigation/listings that don’t involve free text search.
  • When you want to do fine grained and type specific filtering.

Do I have to use it?

No. The regular, type specific, fluent, strongly typed querying API is still there and better than ever. Unified Search is just sugar on top.

WTF are you trying to say, I don’t get this mumbo jumbo!

That’s cool. There will be more hands on posts in the future about how to actually use Unified Search. In practice, Unified Search makes it even easier to use Find in some scenarios while this post looks under the covers and explains how it’s implemented.

PS. For updates about new posts, sites I find useful and the occasional rant you can follow me on Twitter. You are also most welcome to subscribe to the RSS-feed.

Joel Abrahamsson
