In QlikView, the relational database schemas are of two types. One is the Star schema and the other is Snowflake schema. But, before going to learn the two different schemas, first, we should know what kinds of tables were used to store the data?
So, there are two types of tables:
- Fact table
- Dimension table
The Fact table is a table that contains numeric data or measurable attributes of data such as ID, keys, etc, that are connected with the dimension table across the data model. The fact table is placed at the center of Star schema or the Snowflake schema bounded by dimension tables in a data model.
The Dimension tables are those tables which consist of descriptive or the textual attributes of data such as Product ID, Product name, Manager ID, Manager name, etc., Each dimension table describes the data of that particular table. For e.g., If a dimension table gives information about a product, it provides the data particularly on that product itself.
In the figure, the highlighted ones are the Dimension tables and the one which is connected with all the dimension tables is the Fact table. As a title of this article, we will learn What is a Star schema? What is a Snowflake schema? And What are the difference between them? Let’s get started.
What is Star Schema?
A star schema has a single fact table that connects with all the dimension tables based on their links. By its name, it resembles a star having a fact table at its center and surrounded by all dimension tables. It is known as a star schema because it represents the entity-relationship diagram between a Fact table and dimension tables that is where one fact table is connected with multiple dimension tables. In this schema, every dimension table contains a primary key but they will not have any parent table.
A star schema data model consists of one main Fact table that is connected with multiple dimension tables through the primary keys. This type of schema is commonly used for Online Analytical Processing (OLAP) that provides high speed. The resulting star schema has a spoke, or hub, or a Star like representation.
Simply, a star schema is a structure where through dimension tables are connected with the center situated Fact table. The fact table consists of foreign keys or the primary keys of all the dimension tables.
Subscribe to our youtube channel to get new updates..!
As shown in the figure, the center table (Sales details) represents the Fact tables and tables connected across the fact table are the dimension tables (Product details, Place details, customer details, order details).
E.g. Star schema with details of sales as Fact table and other details regarding sales like Product details, place details, Customer details, and Order details as Dimension tables.
Related Article: QlikView Interview Questions
Advantages of Star schema
The advantages of star schema are as follows:
A star schema reduces the time required to load a large amount of data into the database
It provides efficient navigation through data yet the dimension tables are joined to the fact table
Queries run faster in a star schema
It is designed to enforce the relation between accuracy and consistency of loaded data
Disadvantages of star schema
The disadvantages of star schema are as follows:
It is not flexible in terms of analytical requirement as a normalized data model
Data integrity is not executed as well as it is a highly normalized database
What is a Snowflake Schema?
A Snowflake schema is an enhancement of a star schema where every point of a star multiplies into several points. As we know, in star schema each dimension is represented by a single dimension table, But in Snowflake schema, that dimension table is standardized into numerous lookup tables. That means a dimension table is further linked to the Sub dimension table through multiple links. It is used when the dimension table becomes very big.
In this schema, a dimension table will have one or more parent tables. The hierarchies are divided into distinct tables and these hierarchies assist to drive down the data from top-most hierarchy to bottom-most hierarchy.
The Snowflake schema data model consists of one or more fact tables that are connected with multiple dimension tables as a star schema, and these dimension tables are further connected with the Sub dimension tables depending on the data scaling. As shown in the figure below, the fact table is connected with all the dimension tables based on their primary keys and some of the dimension tables are further linked to the Sub dimension table.
Unlike star schema, the Snowflake schema organizes the data inside the database in order to eliminate the redundancy and thus helps to reduce the amount of data. This kind of schema is commonly used for multiple fact tables that were a more complex structure and multiple underlying data sources.
E.g. the fact table is connected with the Location table (Dimension table) and that the Location table is further connected to the Location details table (Sub dimension table).
[Related Article: Top Data Warehousing Tools]
Advantages of the Snowflake schema
The advantages of snowflake schema are as follows:
It helps to increase flexibility
It improves query performance due to minimized disk storage requirement and connecting small lookup tables
It is easy to maintain
Disadvantages of the Snowflake schema
The disadvantages of Snowflake schema are as follows:
Due to multiple tables query performance might be reduced
It is required to perform more maintenance efforts because of more lookup tables
Difference between Star schema and Snowflake schema
|Description||Star schema||Snowflake schema|
|Data model||Top-down approach||Bottom-up approach|
|Normalization/ Denormalization||The fact table and Dimension tables are in the Denormalized form.||The fact tables are in Denormalized form, but Dimension tables are in normalized form.|
|Ease of use||Easy to understand and low query rate.||It is a complex structure and not as easy to understand.|
|Ease of maintenance||It has redundant data and less easy to maintain and change.||No redundancy. Hence, Snowflake is easier to maintain and change.|
|Dimension table||It contains only a single dimension table for each dimension.||It contains more than one dimension table for each dimension depending on the data.|
|Query performance||Less number of foreign keys and takes less time for execution.||More number of foreign keys and takes a long time for execution.|
|Joins||Less number of joins||More number of joins|
|Application||We can prefer the star schema when the dimension table has less number of rows.||We can prefer the snowflake schema when the dimension table is relatively big. So, this schema helps to reduce the size of the data.|
In QlikView, the preferred schema is the star schema as it provides queries that run faster. The structure allows us to load large batches of data quickly. It is a flexible model and deals with complex scripts easily. It also provides good RAM consumption. But selecting a suitable schema depends on the quantity of data. In this article, we have learned the core details of Star schema and Snowflake schema. Hope you find relevant information.