Overview of SAS OLAP Cube Studio
What Is OLAP ?
OLAP (Online Analytical Processing) is a software technology that enables users to dynamically analyze data that is stored in multidimensional database tables.
Because of the way the data and its relationships are stored, they are readily available for detailed queries and analysis.
Central to the OLAP storage process are cubes. A cube is a set of data that is organized and structured in a hierarchical, multidimensional arrangement, often with numerous dimensions and levels of data.
The classification information in the data is organized into dimensions.
For example, a TIME dimension might have all input columns related to time (date, month, year) grouped as levels.
Dimension levels can then be organized into a hierarchy. The hierarchy determines the logical order of the levels in a dimension. The logical order makes it possible for users to drill down into the cube data.
Cubes also contain measures, which are based on numeric analysis columns in the input tables.
These numeric values are summarized and stored in the cube for quick access and analysis during a query.
Examples of measures include sales figures and operational costs.
What Is SAS OLAP Cube Studio?
The SAS OLAP Cube Studio is a Java interface for defining and building OLAP cubes in SAS®9.
The Cube Designer wizard, which guides the user through the process of creating and registering a cube, is the main feature of SAS OLAP Cube Studio and is also available in SAS ETL Studio.
Other tasks that can be performed using the SAS OLAP Cube Studio include
- defining OLAP schemas in a SAS Metadata Repository
- registering library definitions for the tables used to build cubes
- registering the tables used to build cubes
- editing cube structure
- deleting cubes
- adding and deleting aggregations
- viewing cube properties.
Queries in SAS OLAP Cube Studio are performed using the industry standard MDX language (multidimensional expression language), which is similar to SQL but can use cubes as input. This provides improved performance over previous releases of SAS.
Interaction with SAS Application Servers
SAS OLAP Cube Studio uses different types of application servers:
When the OLAP user requests a piece of metadata, the SAS Metadata Server handles
- accessing metadata from the repository
- providing the metadata to the SAS OLAP Cube Studio client
- writing the metadata back out to the repository.
The SAS OLAP Server is only used for viewing cube data, not for building cubes. Neither SAS OLAP Cube Studio nor SAS ETL Studio have an OLAP viewer for viewing cubes; therefore, the SAS OLAP Server does not have to be running when using either of these client tools.
NWAY Aggregation Concept
OLAP cubes usually contain an NWAY aggregation. The NWAY aggregation can be thought of as a table that contains the crossing of all dimension levels of the OLAP cube.
Example: Examine the concept of the NWAY aggregation by discussing a summary table created from the SUMMARY procedure.
Sales Data Table:
The values shown above for Sales Person are shortened on future slides for space consideration reasons.
The Sales Data table can be summarized using the SUMMARY procedure, and the summary information output to a new table called Sales Summary.
The SUMMARY procedure provides data summarization tools that compute descriptive statistics for variables across all observations or within groups of observations. By default, PROC SUMMARY does not display output. PROC SUMMARY is similar to the MEANS procedure.
A simple listing of the new summary table, Sales Summary, can be created with a PROC PRINT step.
Sales Summary Table
The output data set created by PROC SUMMARY contains two special variables:
_TYPE_ identifies the combination of class variables for which summarization is done.
_FREQ_ indicates the number of observations included in that level of summary.
The _TYPE_ variable values display different types of summarizations.
Review _TYPE_=0 (observation 1) in the Sales Summary table.
This observation displays summarizations for no particular SalesPerson OR Month; therefore, this summarization is overall summarizations for the entire data set.
Review _TYPE_=1 (observations 2 – 4).
These observations display summarizations for each distinct Month across all SalesPerson values.
Review _TYPE_=2 (observations 5 – 6).
These observations display summarizations for each distinct SalesPerson across all Month values.
Review _TYPE_=3 (observations 7 – 12).
These observations display summarizations for each distinct SalesPerson broken down by Month values.
For the _TYPE_=3 observations, notice that observations 7 through 9 can be summarized to obtain the values for observation 5.
Similarly, observations 10 through 12 can be summarized to obtain the values for observation 6.
In addition to the previous example, notice that the _TYPE_=3 observations 7 and 10 can be summarized to obtain the values for observation 2.
Similarly, observations 8 and 11 can be summarized to obtain the values for observation 3, and observations 9 and 12 can be summarized to obtain the values for observation 4.
If the NWAY option is used in the PROC SUMMARY statement …
… the generated output table contains only the values where _TYPE_=3.
The NWAY aggregation in an OLAP cube is similar to the output table that is generated from a PROC SUMMARY step using the NWAY option.
All other aggregations in an OLAP cube can be generated from the NWAY aggregation.