Snowflake and star schema in qlikview
Snowflake and Star Schema in Qlikview
Star and Snowflake Schema are the two data models in qlikview.
In STAR schema all the facts are stored in one central table and using Primary key and foreign key relationship other Dimension tables are connected with the fact table.
SNOWFLAKE schema is a structure where we can have more than one fact table or Dimension tables aggregated in higher level. Generally used to avoid complexity (easy to understand) and Create more normalize structure.
Star Schema has a single fact table connected to dimension tables and it visualize as a star. In a star schema only one link establishes the relationship between the fact table and any of the dimension tables. It is a relational database schema for representing multidimensional data. It is the simplest form of the data warehouse schema that contains one or more dimensions and fact tables. It is called a star schema because the entity-relationship diagram between dimensions and fact tables resembles a star where one fact table is connected to multiple dimensions. The center of the star schema consists of a large fact table and it points towards the dimension tables. The advantage of star schema is slicing down, performance increase and easy understanding of data.
Snowflake Schema is an extension of the star schema. In this model, dimension tables are not necessarily fully flattened. Here, very large dimension tables are normalized into multiple sub dimensional tables. It is used when a dimensional table becomes very big. Also, every dimension table is associated with sub adimension table and has multiple links. A snowflake schema is a term that describes a star schema structure normalized through the use of outrigger tables. i.e dimension table hierarchies are broken into simpler tables.
- In a star schema every dimension will have a primary key.In a star schema, a dimension table will not have any parent table.
- Whereas in a snowflake schema, a dimension table will have one or more parent tables.
- Hierarchies for the dimensions are stored in the dimensional table itself in a star schema.
- Whereas hierarchies are broken into separate tables in snow flake schema. These hierarchies helps to drill down the data from topmost hierarchies to the lowermost hierarchies.
A star schema data model typically contains main fact tables with dimensional tables joined to it via primary keys. The resulting data model has a hub and spoke, or star-like appearance when visualized. Star schemas are commonly used for Online Analytical Processing (OLAP) purposes, mostly for the high speed that they offer. QlikView best practices dictate that developers should model data into a star schema, for the greatest amount of efficiency and speed.
Star schema gives you better performance. It includes better response time, a script run time and data model is also very flexible.
Star schemas are highly denormalized data models, free of the limitations of traditional, normalized Online Transactional Processing (OLTP) source systems used for creating, inserting, deleting and modifying records (these are transactions). Typical normalized databases used for OLTP are more strict in terms of data quality (such as no repeating data being stored), but are slower to respond to queries.
The following diagram is an example of a data model in a star schema format:
A snowflake schema is an extension of the star schema. In the snowflake schema, the data model may have one or more fact tables, with connected dimension tables, but will also have secondary dimension tables radiating from one or more primary dimension tables. Pure star schemas in large systems or companies are somewhat rare; snowflake schemas are, the more commonly encountered scenarios due to multiple fact tables and more complex and multiple underlying data sources.
More about the pros and cons of manipulating your data model, the star and snowflake schema is discussed during our Mindmajix Qlikview training.
The following diagram is an example of a data model in a snowflake schema format:
In cases where there is a snowflake schema, it may be useful to combine the tables via the CONCATENATE statement. This will add rows from one table to another. The CONCATENATE statement can be used to combine either fact tables or dimension tables. You can also JOIN the tables based on the key field.
The main difference between star schema and snowflake schema is that
- The star schema is highly denormalized whereas snowflake schema is normalized. .
- Performance wise, star schema is good, but if we consider memory, then snow flake schema is better than star schema.
- A dimension table will not have a parent table in star schema, whereas snowflake schemas have one or more parent tables.
- Snow Flake Schema has a bottom-up appraoch where as Star has Top-down approach towards it.
- Star Schema has fewer joins and Snow flake has more joins.