Mindmajix

Slowly Changing Dimensions SSIS Step by Step

Slowly Changing Dimensions

To process the data from granularity tables to main tables, we follow a mechanism called slowly changing dimensions type.

Eg:  there is a customer table where it holds customer details. If there is any change, in SCD’s there should be a manipulation in the process.

Slowly changing dimensions [SCD’s]

Screenshot_34

Type -1 →

  1. New record inserts
  2. Old record update/modifies

Screenshot_35Screenshot_36

Here in emp-history after type-1 performance is done the location of vinay is changed from the location HYD to USA (i.e., replaced).

Note: one customer only one location.

Type-2 (version mechanism)

  1. New records inserted with version ‘0’ (ZERO)
  2. Old records are inserted with increment version (i.e., either 1 or any)

 Screenshot_37Screenshot_38

Emp-History

Note: highest version indicates the current location of the customer.

Type-2 (status mechanism)

  1.  New records inserted with status = “current”
  2. Old record a) inserted with status = “current”

b) modifies earlier records status to “expired”

Screenshot_39Screenshot_40

Emp-History

Note: status=’current’ means the customer is in that location.

Type-2(data mechanism)

  1. New record inserted with start –date and end-date as ‘9999-12-31’
  2. Old records a)inserted with start–date and end-date as ‘9999-12-31’

b)modifies old record end-dt to new record start-date

Note: current location of customer identified with end-date=’9999-12-31’

Screenshot_41Screenshot_42

Type-3

For every customer previous/current locations maintained.

 Screenshot_43Screenshot_44

Note: if you want to increase the number of history rows those many columns should be added to the history tables, which is burden to the system.

In real time we use frequently type-2 (date mechanism) as history maintenance mechanism [see 66 page no for practical’s]

Maintenance cleanup task: It removes files left over from a maintenance plan.

Notify operator task: It sends an e-mail msg to any SQL server agent operator.

Rebuilt index task: It arranges or re-organizes data or the data index pages by rebuilding the indexes. This improves the performance of index scans and seeks.

Reorganize index task: It defragments and compacts clustered and non-clustered indexes on table and views.

Shrink database task: It reduces the disk space consumed by database and log files by removing empty data and log prices.

Update statics task: It updates statistics of the object if there are already collected.

Execute T-SQL statement task: 

It execute T-SQL server database, command, queries…etc.

Go to execute T-SQL taskàrt clickàeditàexecute

T-SQL statement taskàconnectionàSRC (take any connection)

T-SQL statement: Use DB-MSBI;

Delete from EMP_NEW1;

Implementation of slowly changing dimensions (SCD) (through wizards): 

In SSIS along with Type-1 and Type-2 we implement fixed attribute.

Fixed attribute

                In this situation the important business information field taken as “fixed”. If there is any change for values in the column we can take appropriate actions like failing or ignoring the operation.

 Screenshot_45

Navigation:

  1. Take OLEDB source.
  2. SCDàrt clickàeditànextà

Connection manager: DB_MSBI

Table or view: EMP_HIST

Specify the column EID as business key.

Set change type: fixed attribute

Click next.

Select fail the transformation if changes are detected in a fixed attributeànextànextàfinish.

SCD Type-1: (changing attribute)

Like above with only below two changes

  1. Set dimensions column: party loc.
  2. Set change type: changing attribute.

SCD Type-2: (status mechanism)

                Like above with only below two changes

Set change type: historical attribute

Select use a single column to show current and expired records.

Column to indicate current record: status

Value when current: current

Expired value: expired.

SCD Type-2: (date mechanism)

Select use start and end dates to identify current and expired records.

Start date column: start-date

End date column: end-date

Variable to set date values: system::container

Start time.

SCD Type-2: (date mechanism manually):

In control flow task →

 Screenshot_46

Screenshot_47

OLEDB source:

EMP_Daily

Lookup:

                Rt click → edit

General:

Cache mode: Full cache specify how to handle rows with no matching.

Entries: Redirect rows to no match output.

Connection:

  OLEDB connection manager: local host: DB-MSBI

Use table or view: EMP-HIST

Columns:

                                Connect EID, ENAME, ELOC to HIST_table for validation click okàok.

OLEDB command:

                                rt  click → Edit

Connection manager tab:

                                Connection manager

Component properties tab:

SQL command: update EMP-HIST set END-DT=? Where EID=? And END-DT is null.

Column names tab:

                                Connect EID to parameter-1, ST-DT to parameter-0 click ok → ok

DF-insert:

                     OLEDB source: EMP-Daily

                     Lookup: rt clickàedit

                     General:

                     Cache mode:

                     Screenshot_48

Take OLEDB Destination as EMP-HIST and connect the fields except END-DT.

Data profiling task:

  1. Newly added in 2008.
  2. This task is helpful to profile the data before proceeding further.
  3. Generally it profiles the below information.

Go to data profiling taskàrt click → edit →

Profile request →

Profile type: candidate key profile request

Column length distribution profile request

Column null ratio profile request

Column pattern profile request

Column statistics profile request

Column value distribution profile request

Functional dependency profile request

Value inclusion profile request

Create ADO.Net connection for the source table

1.  SSIS menu → New connection → ADO.Net → Add → Data connection: MSBI-DB

2.  Data profile task on control flow

Rt click → edit

General:                                                                                                                   

Destination type: file connection.

Destination: New connection → create file → c:\party-profile-request.

Click Quick profile →

ADO.Net connection: MSBI-DB

Table or view: party

Compute:

Select the required options

Click ok → ok.

  1. Execute data profile task and observe the file.
  2. Start menu→program→SS 2008→Integration service →data profile viewers→open→specify file path (c:\party-profile-result) and monitor the analysis by selecting each attribute.

Note: In other ETL tools for profiling we need to go for 3rd party tools.

Enroll for Instructor Led Live Online MSBI Training


0 Responses on Slowly Changing Dimensions SSIS Step by Step"

Leave a Message

Your email address will not be published. Required fields are marked *

Copy Rights Reserved © Mindmajix.com All rights reserved. Disclaimer.
Course Adviser

Fill your details, course adviser will reach you.