Oracle Endeca Commerce Tutorial
This tutorial gives an overview of Oracle Endeca and covers its fundamentals.
Endeca, the company, was founded in 1999 and was based in Cambridge, MA. For most of its history, Endeca focused on the e-commerce market, providing the technology behind e-commerce websites. What these sites have in common is a feature called “faceted search”, where users can query and drill into the retailer’s dataset using any combination of attributes.
Endeca focused on this e-commerce market first and developed the MDEX Engine to support it, marketing it as a column-store, rapid-development query engine that allows “faceted searches” across many different “jagged” data sets (i.e. data sets that don’t share the same data model but have some commonality between them). A couple of years ago, Endeca took the core technology from this product and created a standalone BI tool called Endeca Latitude, complete with dashboard components, an ETL tool, dashboard and report designers, and a story that revolved around “agile BI”, based on the fact that the MDEX Engine doesn’t require a strictly defined data model. By the time of the Oracle acquisition, Endeca Infront, the product behind these websites, accounted for the majority of revenue.
Endeca technology primarily targets the following objectives:
- Deliver targeted, user-centric experiences in a scalable way
- Ingest any type of data from any source to power smarter, richer experiences
- Guide and influence customers at each step of their experience to encourage exploration and discovery
- Allow business users to change the customer experience and dynamically push updates—without engaging IT
- Extend and manage the full breadth of Oracle Endeca’s Web commerce capabilities into multiple channels.
Endeca Commerce Components
Oracle Endeca Commerce comprises three major components:
- Endeca Information Transformation Layer (ITL)
- Endeca MDEX Engine
- Endeca Application Tier
Endeca Information Transformation Layer (ITL)
- The ITL reads your raw source data and manipulates it into a set of Oracle Endeca MDEX Engine indices.
- The ITL consists of the Content Acquisition System (which includes the Endeca CAS Server and Console, the CAS API and the Endeca Web Crawler), and the Data Foundry (which includes data-manipulation programs such as Forge).
Endeca MDEX Engine
- The MDEX Engine is the query engine at the core of Oracle Endeca Commerce. It consists of the Indexer (Dgidx), the Dgraph, and the Agraph.
- The MDEX Engine loads the indices generated by the indexing component of the Endeca Information Transformation Layer.
- Although the Indexer (also known as Dgidx) is installed as part of the MDEX Engine package, in effect it is part of the ITL process.
Endeca Application Tier
- After the indices are loaded, the MDEX Engine receives queries from the Endeca Application Tier, executes them against the loaded indices, and returns the results to the client application.
- The Application Tier provides an interface to the MDEX Engine via the Endeca Assembler. The Assembler acts as a language-agnostic interface for aggregating and sending queries to the MDEX Engine, and executing any necessary post-processing on the results.
- Oracle Endeca Commerce uses two types of queries: navigation queries and keyword search queries.
- Navigation queries return a set of records based on application-defined record characteristics (such as wine type or region in an online wine store), plus any follow-on query information.
- Keyword search queries return a set of records or dimensions based on a user-defined keyword, plus any follow-on query information. For more information, see “Using Keyword Search.”
- Endeca records are the entities in your data set that you are navigating to or searching for.
- Records are the fundamental units of data.
- Attributes are the fundamental units of a record schema, which describes the data model of records.
Dimensions and dimension values
- Dimensions provide the logical structure for organizing the records in your data set.
- A dimension is a collection of related dimension values, organized into a tree. The top-most dimension value in a dimension tree is known as the dimension root. A dimension root always has the same name as its dimension.
- Endeca properties are the basic attributes of an Endeca record.
- Properties are usually generated from a record’s source properties, using source property mapping.
- Properties consist of key/value pairs (property name/property value).
- Properties can be searched and displayed.
The primary difference between Endeca properties and dimensions is that the MDEX Engine indices for dimensions support navigation, while those for Endeca properties do not.
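The distinction between properties and dimensions can be sketched in a few lines of Python. This is a toy model for illustration only, not the real Endeca data model or API: the `Record` class, the `navigable_by` helper, the dimension names, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Toy stand-in for an Endeca record; not the real Endeca data model."""
    # Properties: key/value pairs that can be searched and displayed.
    properties: dict = field(default_factory=dict)
    # Dimension values keyed by dimension name: these support navigation.
    dimension_values: dict = field(default_factory=dict)


def navigable_by(record):
    """Names a navigation query could filter this record on (dimensions only)."""
    return sorted(record.dimension_values)


bottle_a = Record(
    properties={"name": "Bottle A", "price": "12.99"},        # hypothetical values
    dimension_values={"Wine Type": "Red", "Country": "USA"},  # hypothetical dimensions
)

print(navigable_by(bottle_a))  # ['Country', 'Wine Type']
```

In this sketch the record can be reached by navigating on Wine Type or Country, while name and price are only available for search and display.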
- Dimension hierarchy gives you additional control over the logical structure used to organize your Endeca records.
- As the term “dimension tree” implies, dimension values can have parent and child dimension values.
- A dimension value that has sub-dimension values is the parent of those sub-dimension values. The sub-dimension values themselves are children or child dimension values.
- Child dimension values that share the same parent dimension value, and so sit at the same level of the hierarchy, are dimension value siblings.
The one parent rule
In addition to parents, dimension values can have ancestors.
- Ancestors represent the dimension values between the dimension root and your current location in the dimension tree (technically, a parent is an ancestor).
- In the example below, Other and Fortified represent the ancestors for the Sherry dimension value.
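The parent, child, sibling, and ancestor relationships described above can be modelled as a simple tree. The following is a minimal sketch, not Endeca code: the `DimensionValue` class is hypothetical, the dimension name "Wine Type" and the Port value are made up for illustration, and the nesting order (Other above Fortified) is an assumption.

```python
class DimensionValue:
    """Toy dimension value node; each value has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def ancestors(self):
        """Values between the dimension root and this value, listed top-down.

        The root itself is excluded; the direct parent is included.
        """
        chain = []
        node = self.parent
        while node is not None and node.parent is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def siblings(self):
        """Other children of the same parent."""
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]


# The dimension root has the same name as its dimension.
root = DimensionValue("Wine Type")
other = DimensionValue("Other", parent=root)
fortified = DimensionValue("Fortified", parent=other)
sherry = DimensionValue("Sherry", parent=fortified)
port = DimensionValue("Port", parent=fortified)  # hypothetical sibling

print([a.name for a in sherry.ancestors()])  # ['Other', 'Fortified']
print([s.name for s in sherry.siblings()])   # ['Port']
```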
Guided Navigation Basics
In its most basic form, a navigation query is a combination of one or more dimension values. These dimension values are referred to as the navigation descriptors.
A navigation query instructs the Endeca MDEX Engine to return the set of records that represents the intersection of all the dimension values that it contains.
Guided Navigation is the presentation of valid follow-on refinement queries to the user.
For example, in the illustration below, Bottle A represents the intersection between the Red and USA dimension values. Bottle C represents the intersection between the White and France dimension values.
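To make the intersection idea concrete, here is a small Python sketch of a navigation query and the follow-on refinements it leaves available. It only illustrates the concept and is not how the MDEX Engine is implemented: the `navigate` and `refinements` functions, the dimension names, and Bottle B are assumptions added for the example.

```python
# Each record is tagged with one dimension value per dimension.
records = {
    "Bottle A": {"Wine Type": "Red", "Country": "USA"},
    "Bottle B": {"Wine Type": "Red", "Country": "France"},  # hypothetical record
    "Bottle C": {"Wine Type": "White", "Country": "France"},
}


def navigate(descriptors):
    """Return the records tagged with every navigation descriptor (intersection)."""
    return sorted(
        name
        for name, dims in records.items()
        if all(dims.get(dim) == value for dim, value in descriptors.items())
    )


def refinements(descriptors):
    """Dimension values that still occur in the result set, for unselected dimensions."""
    available = {}
    for name in navigate(descriptors):
        for dim, value in records[name].items():
            if dim not in descriptors:
                available.setdefault(dim, set()).add(value)
    return available


print(navigate({"Wine Type": "Red", "Country": "USA"}))  # ['Bottle A']
print(refinements({"Wine Type": "White"}))               # {'Country': {'France'}}
```

Because refinements are computed from the current result set, the user is only ever offered choices that lead to at least one record, which is the point of Guided Navigation.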
- A record search query is Endeca’s equivalent to full-text search.
- Record searches return the following:
- A set of records based on one or more user-defined keywords.
- Follow-on query information, based on the returned record set.
- Record search queries are performed against a particular property or dimension, also known as the search key. To perform a record search, the application’s user:
- Chooses an Endeca property or dimension to act as the search key.
- Specifies a term, or terms, to search for within the key.
- In addition to record search, the Endeca MDEX Engine supports a second type of keyword search called dimension search.
- Dimension search queries return dimension values whose names contain the search terms the end user has specified.
- Unlike record search, dimension search does not require a search key.
- Dimension search always searches any dimension values that have been identified as searchable for the terms provided.
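The difference between the two kinds of keyword search can be sketched as follows. This is a simplified illustration rather than Endeca's search implementation: the function names, the sample data, the case-insensitive substring matching, and the choice to require every term to match are all assumptions.

```python
records = [
    {"name": "Bottle A", "description": "A bold red from California"},
    {"name": "Bottle C", "description": "A crisp white from Bordeaux"},
]

# Dimension values that have been identified as searchable (hypothetical data).
searchable_dimension_values = {
    "Wine Type": ["Red", "White"],
    "Country": ["USA", "France"],
}


def record_search(search_key, terms):
    """Return names of records whose search key contains every term."""
    terms = [t.lower() for t in terms]
    return [
        r["name"]
        for r in records
        if all(t in r.get(search_key, "").lower() for t in terms)
    ]


def dimension_search(terms):
    """Return (dimension, value) pairs whose value name contains every term.

    No search key is needed: all searchable dimension values are checked.
    """
    terms = [t.lower() for t in terms]
    return [
        (dim, value)
        for dim, values in searchable_dimension_values.items()
        for value in values
        if all(t in value.lower() for t in terms)
    ]


print(record_search("description", ["red"]))  # ['Bottle A']
print(dimension_search(["fr"]))               # [('Country', 'France')]
```

Note that record search returns records, while dimension search returns dimension values that the user can then navigate to.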
Comparing dimension search and record search
- Dimension search and record search each have their own strengths.
- In general, you should:
- Use dimension search when the search terms are included in your dimension hierarchy.
- Use record search when you want to search unstructured data that is not part of the dimension hierarchy.
Additional search features
There are other search features that you can incorporate into your application.
- Spelling functionality enables search queries to return expected results even though the user has misspelled the search term.
- Did You Mean functionality allows you to provide suggestions for further record searches to your users. Did You Mean is very useful when dealing with misspelled words or words that have exact matches that may not be as appropriate as other more popular alternatives.
- Stemming and Thesaurus allow your Endeca application to consider alternate forms of individual words as equivalent for the purpose of search querying. For example, in many applications it is desirable for singular nouns to match their plural equivalents in the searchable text, and vice versa. This is an example of stemming.
- The Thesaurus feature allows the system to return matches for concepts related to the words or phrases contained in user queries. For example, one might configure a thesaurus entry to allow searches for “cell phone” to match text containing the phrase “mobile phones”.
- Relevance Ranking allows you to control the order in which results are returned. In particular, it is typically desirable to return results for the actual user query ahead of results for stemmed or thesaurus-transformed versions of the query.
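How stemming, thesaurus expansion, and relevance ranking interact can be shown with a toy ranking function. This sketch is an assumption-laden illustration, not Endeca's algorithm or configuration: the naive plural-stripping `stem`, the `THESAURUS` dictionary, the three rank tiers, and the sample documents are invented for the example, and punctuation is not handled.

```python
# Hypothetical thesaurus entry: searches for "cell phone" also match "mobile phone".
THESAURUS = {"cell phone": ["mobile phone"]}


def stem(word):
    """Very naive stemmer: strip a trailing 's' from longer words."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def normalize(text):
    """Lower-case the text and stem every whitespace-separated word."""
    return " ".join(stem(w) for w in text.lower().split())


def contains_phrase(phrase, text):
    """Whole-word phrase match on space-separated text."""
    return f" {phrase} " in f" {text} "


def search(query, documents):
    """Rank exact matches first, then stemmed matches, then thesaurus matches."""
    exact = query.lower()
    stemmed = normalize(query)
    expansions = [normalize(p) for p in THESAURUS.get(stemmed, [])]
    ranked = []
    for doc in documents:
        doc_stemmed = normalize(doc)
        if contains_phrase(exact, doc.lower()):
            rank = 0  # the actual user query
        elif contains_phrase(stemmed, doc_stemmed):
            rank = 1  # match found only after stemming
        elif any(contains_phrase(e, doc_stemmed) for e in expansions):
            rank = 2  # match found only through the thesaurus
        else:
            continue
        ranked.append((rank, doc))
    ranked.sort(key=lambda pair: pair[0])  # stable sort keeps input order within a tier
    return [doc for _, doc in ranked]


documents = [
    "Mobile phones on sale",
    "Cell phones and accessories",
    "Cell phone case",
    "Wine glasses",
]

print(search("cell phone", documents))
# ['Cell phone case', 'Cell phones and accessories', 'Mobile phones on sale']
```

The exact match is returned ahead of the stemmed match, which in turn is returned ahead of the thesaurus match, mirroring the ordering goal described above.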